This repository has been archived by the owner on Feb 15, 2022. It is now read-only.

Added trace-analytics schema documentation. #612

Merged 2 commits on May 26, 2021

Contributor

@wrijeff wrijeff commented May 19, 2021

Signed-off-by: Jeff Wright [email protected]

Issue #, if available: #586

Description of changes:

  • Added schema docs
  • Added versioning guidelines

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

## Fields
Many fields are either copied or derived from the [trace specification protobuff](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/trace/v1/trace.proto) format.
Contributor

typo: protobuf


The following compatibility promises are made *only for schemas of the same major version*.

* ***Backwards compatibility*** - features built on version 1.x of the schema **will not break, but may degrade** if data from a **prior** 1.x schema version is used.
Contributor

Suggest specifying "features" as Trace Analytics UI features to avoid confusion, or simply as reader features, depending on the context.

Contributor Author

Counter question: which part is ambiguous, or how did you interpret "features"? I had assumed that the document context was enough, but maybe it wasn't.

Contributor

@chenqi0805 chenqi0805 May 20, 2021

On my first read, I had to infer what "features" implies: Data Prepper features, UI features, or both.
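
The compatibility promise quoted in this thread can be sketched roughly as below. This is only an illustration under stated assumptions: schema versions are taken to be "major.minor" strings, and `parse_version` and `classify_read` are hypothetical helpers that exist in neither the PR nor Data Prepper. Only the rules quoted above are encoded; anything else is reported as not covered.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Split a 'major.minor' schema version string into two integers."""
    major, minor = version.split(".")
    return int(major), int(minor)


def classify_read(feature_version: str, data_version: str) -> str:
    """Classify a feature built on `feature_version` reading `data_version` data.

    Hypothetical helper: it encodes only the promise quoted in this thread.
    """
    feature_major, feature_minor = parse_version(feature_version)
    data_major, data_minor = parse_version(data_version)

    if feature_major != data_major:
        # Promises are made only for schemas of the same major version.
        return "no promise"
    if data_minor < feature_minor:
        # Backwards compatibility: will not break, but may degrade.
        return "works, may degrade"
    if data_minor == feature_minor:
        return "works"
    # Data newer than the feature is not covered by the text quoted here.
    return "not covered here"


if __name__ == "__main__":
    for feature, data in [("1.3", "1.1"), ("1.3", "1.3"), ("2.0", "1.3")]:
        print(feature, data, "->", classify_read(feature, data))
```

Whether "features" means plugin features or Data Prepper features does not change the check itself; it only changes which component would call it.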


Due to the potentially disjointed release schedules of both OpenSearch and the managed offering, we need to ensure that rolling out a major version change is carefully planned.

A typical migration plan will first make Data Prepper artifacts available so that users can start ingesting their data to the new index. To prevent data loss during the migration period, users can be encouraged to simultaneously write to both the old and new indexes. This can be done by either running both old and new versions of Data Prepper side-by-side, or perhaps Data Prepper itself can be updated to write to dual indexes (TODO). Once users have the ability to write to the new index, Trace Analytics plugin updates will be made available which make use of the new major version index.
Contributor

or perhaps Data Prepper itself can be updated to write to dual indexes (TODO)

Suggest removing this part until there is clarity on what Data Prepper can do.

Contributor

To prevent data loss during the migration period, users can be encouraged to simultaneously write to both the old and new indexes.

Question: I don't fully understand this part. Why would there be data loss, and how does writing to both indexes simultaneously resolve it?

Contributor Author

~~To prevent data loss~~ To minimize analytics downtime. If a user doesn't care to reindex old data, then they'll simply upgrade from Dashboard v1 to Dashboard v2. Dual writing to both v1 and v2 indexes allows for new data to be viewed both before and after the upgrade.

Will clean this section up a bit.
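
The dual-write argument above can be illustrated with a toy in-memory model. This is a sketch, not Data Prepper behavior: the index names, the `write` and `visible` helpers, and the span labels are all hypothetical, and each "index" is just a Python list.

```python
from typing import Dict, List

# Hypothetical index names for the old and new major schema versions.
OLD_INDEX = "traces-v1"
NEW_INDEX = "traces-v2"


def write(indexes: Dict[str, List[str]], targets: List[str], doc: str) -> None:
    """Append a document to every target index."""
    for name in targets:
        indexes.setdefault(name, []).append(doc)


def visible(indexes: Dict[str, List[str]], plugin_major: int) -> List[str]:
    """Return the documents a plugin of the given major version can read."""
    name = OLD_INDEX if plugin_major == 1 else NEW_INDEX
    return indexes.get(name, [])


indexes: Dict[str, List[str]] = {}
write(indexes, [OLD_INDEX], "span-a")             # before migration: old index only
write(indexes, [OLD_INDEX, NEW_INDEX], "span-b")  # migration period: dual write
write(indexes, [NEW_INDEX], "span-c")             # old writers shut down

print(visible(indexes, 1))  # what the v1 plugin sees
print(visible(indexes, 2))  # what the v2 plugin sees
```

In this model `span-b` is readable by both plugin versions, which is the "no analytics downtime" case; `span-a` stays visible to v1 only unless it is reindexed.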


The steps to handle a schema major version update are to:
Contributor

The manual migration from the old index to the new index is not mentioned in the steps. I'd expect it to happen before updating the TA plugin (step 2); otherwise users will not be able to visualize the old data.

* This must *always be done first*. Plugin changes cannot go out before Data Prepper artifacts are made available.
* Encourage users to start using the new Data Prepper version ahead of the plugin release. These will allow the user to upgrade their Trace Analytics plugin and immediately have new major version data to work with.
2. Update the Trace Analytics plugin to read from the new indexes. Increment the plugin version and release to OpenSearch and/or the managed offering.
* Communicate to users the need to use the new version of Data Prepper after upgrading their TA plugin
Contributor

Do you mean users should never return to the old version of Data Prepper after upgrading TA? I would make that more explicit in the description.

Contributor Author

TBD - I'd expect such instructions or requirements to be in the release notes or announcement blog posts, not here. I might even cut the migration strategies from this doc as they're not strictly related to schema versioning 🤔

Contributor

Sure. Makes sense

Signed-off-by: Jeff Wright <[email protected]>
@chenqi0805 chenqi0805 self-requested a review May 24, 2021 15:18
Contributor

@chenqi0805 chenqi0805 left a comment

LGTM!

Upgrading the Trace Analytics plugin from v1 to v2 will result in no usability downtime, as both plugin versions will have populated indexes to read from. This approach is more complex in that it requires additional Data Prepper instances and results in duplicate data being written (until the old DP instances are shut down).

#### Mitigating data loss
Upgrading Data Prepper and the Trace Analytics plugin to a new major version schema will result in data written to old indexes being unusable. If users wish to avoid this data loss, the [reindexing APIs](https://opendistro.github.io/for-elasticsearch-docs/docs/elasticsearch/reindex-data/) must be used. Additionally, transforms might required depending on the differences between the two major schema versions.
Contributor

Nit: might required -> are required.
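
For context on the reindexing step mentioned in the diff, a request body for `POST _reindex` could be assembled along these lines. This is a minimal sketch: the index names, the field names in the sample script, and the `build_reindex_body` helper are hypothetical; only the `source`, `dest`, and `script` keys are part of the reindex API itself. The optional script stands in for whatever transform the schema differences would actually require.

```python
import json
from typing import Any, Dict, Optional


def build_reindex_body(
    source_index: str,
    dest_index: str,
    transform_script: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a JSON-serializable body for a `POST _reindex` request."""
    body: Dict[str, Any] = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }
    if transform_script is not None:
        # Optional Painless script applied to each document while it is copied.
        body["script"] = {"lang": "painless", "source": transform_script}
    return body


if __name__ == "__main__":
    # Hypothetical example: copy v1 data into a v2 index, renaming one field.
    body = build_reindex_body(
        "traces-v1",
        "traces-v2",
        "ctx._source.newField = ctx._source.remove('oldField')",
    )
    print(json.dumps(body, indent=2))
```

The helper only builds the payload; sending it to the cluster is left to whichever HTTP client is already in use.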


The following procedures may be considered while performing a major upgrade across both Data Prepper and the Trace Analytics plugin.
Contributor

👍

@wrijeff wrijeff merged commit 634a426 into opendistro-for-elasticsearch:main May 26, 2021