Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[storage] Remove distinction between primary and archive storage interfaces #6567

Merged
merged 82 commits into from
Jan 25, 2025

Conversation

mahadzaryab1
Copy link
Collaborator

@mahadzaryab1 mahadzaryab1 commented Jan 18, 2025

Which problem is this PR solving?

Description of the changes

  • This PR completely removes the interface storage.ArchiveFactory by refactoring all the storage implementations to remove the distinction between a primary and archive interface. Note that the concept of archive storage remains the same within Jaeger, we just now use the same interface to handle both primary and archive storages.
  • 🛑 Breaking change for users of GRPC storage 🛑
    • The GRPC storage was changed to only dispatch to the primary storage instead of being able to dispatch to a primary and archive storage.
    • Mitigation for jaeger-v1 users: In order to restore your archive storage, configure a new GRPC server on a different port and specify it via --grpc-storage-archive.server. Archive storage will also need to be enabled via --grpc-storage-archive.enabled=true
    • Mitigation for jaeger-v2 users: In order to restore your archive storage, configured a new GRPC server on a different port and specify it via a new storage backend in jaeger_storage.backends (an example of this can be viewed at https://github.com/jaegertracing/jaeger/blob/main/cmd/jaeger/config-remote-storage.yaml)

How was this change tested?

gRPC Storage

On main (establish ground truth)

Start the remote storage binary (uses memory storage by default which implements the ArchiveFactory interface)

go run ./cmd/remote-storage

Start the all in one binary configured with grpc storage (jaeger-v1)

SPAN_STORAGE_TYPE=grpc go run ./cmd/all-in-one --grpc-storage.server=http://localhost:17271

Traces can be archived from the UI
Screenshot 2025-01-23 at 10 17 32 PM

For jaeger-v2, change the extension section of cmd/jaeger/internal/all-in-one.yaml to be the following

  jaeger_query:
    storage:
      traces: some_storage
      traces_archive: some_storage

  jaeger_storage:
    backends:
      some_storage:
        grpc:
          endpoint: localhost:17271
          tls:
            insecure: true

and then start the binary as follows:

go run ./cmd/jaeger/

For current PR

Stop both binaries and checkout this PR

gh pr checkout 6567

Start two remote storage binaries (in two separate terminals)

go run ./cmd/remote-storage --admin.http.host-port=:17270 --grpc.host-port=:17271 
go run ./cmd/remote-storage --admin.http.host-port=:17272 --grpc.host-port=:17273

Start the all-in-one binary with explicit archive configurations

SPAN_STORAGE_TYPE=grpc go run ./cmd/all-in-one --grpc-storage.server=http://localhost:17271 --grpc-storage-archive.enabled=true --grpc-storage-archive.server=http://localhost:17273

Traces can be once again archived from the UI
Screenshot 2025-01-23 at 10 25 29 PM

For jaeger-v2, the configuration was changed to the following:

extensions:
  jaeger_query:
    storage:
      traces: some_storage
      traces_archive: another_storage

  jaeger_storage:
    backends:
      some_storage:
        grpc:
          endpoint: localhost:17271
          tls:
            insecure: true
      another_storage:
        grpc:
          endpoint: localhost:17273
          tls:
            insecure: true

Try running all-in-one without archive-storage enabled

SPAN_STORAGE_TYPE=grpc go run ./cmd/all-in-one --grpc-storage.server=http://localhost:17271

We cannot archive traces
Screenshot 2025-01-23 at 10 26 53 PM

CLI Flags

ElasticSearch CLI Flags

git checkout main
SPAN_STORAGE_TYPE=elasticsearch go run ./cmd/collector help > es_main
git checkout v1-es-archive
SPAN_STORAGE_TYPE=elasticsearch go run ./cmd/collector help > es_current

The diff can be viewed here. There is no difference.

Cassandra CLI Flags

git checkout main
SPAN_STORAGE_TYPE=cassandra go run ./cmd/collector help > cassandra_main
git checkout v1-es-archive
SPAN_STORAGE_TYPE=cassandra go run ./cmd/collector help > cassandra_current

The diff can be viewed here. There are a few here differences here in which cassandra-archive.* gains some new configuration options that were previously only existed for the primary storage. cassandra-archive.* also gains some defaults that will be the same as the the primary storage.

Checklist

Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Copy link

codecov bot commented Jan 18, 2025

Codecov Report

Attention: Patch coverage is 98.79032% with 3 lines in your changes missing coverage. Please review.

Project coverage is 96.05%. Comparing base (aece007) to head (ba5fdd4).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
plugin/storage/factory.go 97.33% 2 Missing ⚠️
plugin/storage/es/factory.go 98.14% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6567      +/-   ##
==========================================
- Coverage   96.10%   96.05%   -0.05%     
==========================================
  Files         366      364       -2     
  Lines       20949    20747     -202     
==========================================
- Hits        20133    19929     -204     
- Misses        620      622       +2     
  Partials      196      196              
Flag Coverage Δ
badger_v1 9.85% <0.00%> (+0.27%) ⬆️
badger_v2 1.80% <0.00%> (+0.05%) ⬆️
cassandra-4.x-v1-manual 15.00% <16.87%> (-0.29%) ⬇️
cassandra-4.x-v2-auto 1.79% <0.00%> (+0.05%) ⬆️
cassandra-4.x-v2-manual 1.79% <0.00%> (+0.05%) ⬆️
cassandra-5.x-v1-manual 15.00% <16.87%> (-0.29%) ⬇️
cassandra-5.x-v2-auto 1.79% <0.00%> (+0.05%) ⬆️
cassandra-5.x-v2-manual 1.79% <0.00%> (+0.05%) ⬆️
elasticsearch-6.x-v1 19.21% <16.03%> (-0.22%) ⬇️
elasticsearch-7.x-v1 19.29% <16.03%> (-0.22%) ⬇️
elasticsearch-8.x-v1 19.46% <16.03%> (-0.22%) ⬇️
elasticsearch-8.x-v2 1.80% <0.00%> (+0.05%) ⬆️
grpc_v1 11.15% <34.59%> (+0.13%) ⬆️
grpc_v2 8.01% <16.80%> (+0.23%) ⬆️
kafka-3.x-v1 10.14% <0.00%> (+0.28%) ⬆️
kafka-3.x-v2 1.80% <0.00%> (+0.05%) ⬆️
memory_v2 1.80% <0.00%> (+0.05%) ⬆️
opensearch-1.x-v1 19.34% <16.03%> (-0.22%) ⬇️
opensearch-2.x-v1 19.34% <16.03%> (-0.22%) ⬇️
opensearch-2.x-v2 1.80% <0.00%> (+0.05%) ⬆️
tailsampling-processor 0.48% <0.00%> (+0.01%) ⬆️
unittests 94.84% <93.14%> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mahadzaryab1 mahadzaryab1 added changelog:breaking-change Change that is breaking public APIs or established behavior and removed changelog:bugfix-or-minor-feature labels Jan 23, 2025
Signed-off-by: Mahad Zaryab <[email protected]>
for t := range uniqueTypes {
ff, err := f.getFactoryOfType(t)
if err != nil {
return nil, err
}
f.factories[t] = ff

af, err := f.getArchiveFactoryOfType(t)
if err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should error be logged if not nil?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or if it's not an error then I would return bool


// Inheritable is an interface that can be implement by some storage implementations
// to provide a way to inherit configuration settings from another factory.
type Inheritable interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would move these to plugin/storage, they are not directly related to "configurable", which is factory/storage-agnostic

Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
@mahadzaryab1 mahadzaryab1 enabled auto-merge (squash) January 25, 2025 19:00
@mahadzaryab1 mahadzaryab1 merged commit d7ab0f8 into jaegertracing:main Jan 25, 2025
55 checks passed
@mahadzaryab1 mahadzaryab1 deleted the v1-es-archive branch January 25, 2025 19:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/storage changelog:breaking-change Change that is breaking public APIs or established behavior enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants