Check cached postings TTL before returning from cache and expose some metrics #10500

Merged: 6 commits, Jan 23, 2025
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -26,6 +26,15 @@
* [ENHANCEMENT] Ingester: Hide tokens in ingester ring status page when ingest storage is enabled #10399
* [ENHANCEMENT] Ingester: add `active_series_additional_custom_trackers` configuration, in addition to the already existing `active_series_custom_trackers`. The `active_series_additional_custom_trackers` configuration allows you to configure additional custom trackers that get merged with `active_series_custom_trackers` at runtime. #10428
* [ENHANCEMENT] Query-frontend: Allow blocking raw http requests with the `blocked_requests` configuration. Requests can be blocked based on their path, method or query parameters #10484
* [ENHANCEMENT] Ingester: Added the following metrics exported by `PostingsForMatchers` cache: #10500
* `cortex_ingester_tsdb_head_postings_for_matchers_cache_hits_total`
* `cortex_ingester_tsdb_head_postings_for_matchers_cache_misses_total`
* `cortex_ingester_tsdb_head_postings_for_matchers_cache_requests_total`
* `cortex_ingester_tsdb_head_postings_for_matchers_cache_skips_total`
* `cortex_ingester_tsdb_block_postings_for_matchers_cache_hits_total`
* `cortex_ingester_tsdb_block_postings_for_matchers_cache_misses_total`
* `cortex_ingester_tsdb_block_postings_for_matchers_cache_requests_total`
* `cortex_ingester_tsdb_block_postings_for_matchers_cache_skips_total`
* [BUGFIX] Distributor: Use a boolean to track changes while merging the ReplicaDesc components, rather than comparing the objects directly. #10185
* [BUGFIX] Querier: fix timeout responding to query-frontend when response size is very close to `-querier.frontend-client.grpc-max-send-msg-size`. #10154
* [BUGFIX] Query-frontend and querier: show warning/info annotations in some cases where they were missing (if a lazy querier was used). #10277
@@ -40,6 +49,7 @@
* [BUGFIX] Distributor: return HTTP status 415 Unsupported Media Type instead of 200 Success for Remote Write 2.0 until we support it. #10423
* [BUGFIX] Query-frontend: Add flag `-query-frontend.prom2-range-compat` and corresponding YAML to rewrite queries with ranges that worked in Prometheus 2 but are invalid in Prometheus 3. #10445 #10461 #10502
* [BUGFIX] Distributor: Fix edge case at the HA-tracker with memberlist as KVStore, where when a replica in the KVStore is marked as deleted but not yet removed, it fails to update the KVStore. #10443
* [BUGFIX] Ingester: Fixed a race condition in the `PostingsForMatchers` cache that may have infrequently returned expired cached postings. #10500

### Mixin

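The bugfix entry above concerns expired postings being returned from the cache. As a rough illustration only, the sketch below checks an entry's age at read time and counts the outcome. It is not the vendored `mimir-prometheus` code: `ttlCache`, `entry`, `stats`, `get`, and `demo` are hypothetical names, and treating an expired-but-present entry as a skip is an assumption based on the `stale-cached-entry` reason label in the test expectations.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry is a cached postings list plus the time it was stored.
type entry struct {
	value    []uint64
	storedAt time.Time
}

// stats loosely mirrors the new counters; the field names are illustrative.
type stats struct {
	requests, hits, misses, staleSkips int
}

// ttlCache is a hypothetical, simplified cache. It never evicts and holds its
// lock while computing, which a production cache would avoid.
type ttlCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // injectable clock, so expiry can be tested
	entries map[string]entry
	stats   stats
}

func newTTLCache(ttl time.Duration, now func() time.Time) *ttlCache {
	return &ttlCache{ttl: ttl, now: now, entries: map[string]entry{}}
}

// get returns the cached value only if it is still within its TTL at read time.
func (c *ttlCache) get(key string, compute func() []uint64) []uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.stats.requests++

	e, ok := c.entries[key]
	switch {
	case ok && c.now().Sub(e.storedAt) < c.ttl:
		c.stats.hits++
		return e.value
	case ok:
		// Present but expired: bypass the cache and do not store the result.
		c.stats.staleSkips++
		return compute()
	default:
		// Nothing cached: compute and store.
		c.stats.misses++
		v := compute()
		c.entries[key] = entry{value: v, storedAt: c.now()}
		return v
	}
}

// demo drives one miss, one hit, and one stale skip with a fake clock.
func demo() stats {
	clock := time.Unix(0, 0)
	c := newTTLCache(10*time.Second, func() time.Time { return clock })
	compute := func() []uint64 { return []uint64{1, 2, 3} }

	c.get("k", compute) // miss: nothing cached yet, result is stored
	c.get("k", compute) // hit: entry is younger than the TTL
	clock = clock.Add(11 * time.Second)
	c.get("k", compute) // skip: entry is present but older than the TTL
	return c.stats
}

func main() {
	fmt.Printf("%+v\n", demo())
}
```

The point of the sketch is only that the expiry check happens on the read path, rather than relying on eviction having already removed the entry.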
2 changes: 2 additions & 0 deletions development/mimir-ingest-storage/config/mimir.yaml
@@ -38,6 +38,8 @@ blocks_storage:
bucket_name: mimir-blocks
tsdb:
dir: /data/ingester
head_postings_for_matchers_cache_force: true
block_postings_for_matchers_cache_force: true

bucket_store:
index_cache:
2 changes: 1 addition & 1 deletion go.mod
@@ -288,7 +288,7 @@ require (
sigs.k8s.io/yaml v1.4.0 // indirect
)

-replace github.com/prometheus/prometheus => github.com/grafana/mimir-prometheus v0.0.0-20250116135451-914982745659
+replace github.com/prometheus/prometheus => github.com/grafana/mimir-prometheus v0.0.0-20250123075837-0cc2978b5013

// Replace memberlist with our fork which includes some fixes that haven't been
// merged upstream yet:
4 changes: 2 additions & 2 deletions go.sum
@@ -1283,8 +1283,8 @@ github.com/grafana/gomemcache v0.0.0-20241016125027-0a5bcc5aef40 h1:1TeKhyS+pvzO
github.com/grafana/gomemcache v0.0.0-20241016125027-0a5bcc5aef40/go.mod h1:IGRj8oOoxwJbHBYl1+OhS9UjQR0dv6SQOep7HqmtyFU=
github.com/grafana/memberlist v0.3.1-0.20220714140823-09ffed8adbbe h1:yIXAAbLswn7VNWBIvM71O2QsgfgW9fRXZNR0DXe6pDU=
github.com/grafana/memberlist v0.3.1-0.20220714140823-09ffed8adbbe/go.mod h1:MS2lj3INKhZjWNqd3N0m3J+Jxf3DAOnAH9VT3Sh9MUE=
-github.com/grafana/mimir-prometheus v0.0.0-20250116135451-914982745659 h1:OfkJoA8D1dg3zMW3kDMkDdbcMBlNqDfCFSZgPcMToOQ=
-github.com/grafana/mimir-prometheus v0.0.0-20250116135451-914982745659/go.mod h1:KfyZCeyGxf5gvl6VZbrQsd400nJjGw+ygMEtDVZKIT4=
+github.com/grafana/mimir-prometheus v0.0.0-20250123075837-0cc2978b5013 h1:70NFJ8OVRMCPc89vN520cTJd0vo/elnaXoF7q0I6c2M=
+github.com/grafana/mimir-prometheus v0.0.0-20250123075837-0cc2978b5013/go.mod h1:KfyZCeyGxf5gvl6VZbrQsd400nJjGw+ygMEtDVZKIT4=
github.com/grafana/opentracing-contrib-go-stdlib v0.0.0-20230509071955-f410e79da956 h1:em1oddjXL8c1tL0iFdtVtPloq2hRPen2MJQKoAWpxu0=
github.com/grafana/opentracing-contrib-go-stdlib v0.0.0-20230509071955-f410e79da956/go.mod h1:qtI1ogk+2JhVPIXVc6q+NHziSmy2W5GbdQZFUHADCBU=
github.com/grafana/prometheus-alertmanager v0.25.1-0.20240930132144-b5e64e81e8d3 h1:6D2gGAwyQBElSrp3E+9lSr7k8gLuP3Aiy20rweLWeBw=
34 changes: 18 additions & 16 deletions pkg/blockbuilder/tsdb.go
@@ -280,22 +280,24 @@ func (b *TSDBBuilder) newTSDB(tenant tsdbTenant) (*userTSDB, error) {
}

db, err := tsdb.Open(udir, util_log.SlogFromGoKit(userLogger), nil, &tsdb.Options{
-RetentionDuration: 0,
-MinBlockDuration: 2 * time.Hour.Milliseconds(),
-MaxBlockDuration: 2 * time.Hour.Milliseconds(),
-NoLockfile: true,
-StripeSize: b.blocksStorageCfg.TSDB.StripeSize,
-HeadChunksWriteBufferSize: b.blocksStorageCfg.TSDB.HeadChunksWriteBufferSize,
-HeadChunksWriteQueueSize: b.blocksStorageCfg.TSDB.HeadChunksWriteQueueSize,
-WALSegmentSize: -1, // No WAL
-BlocksToDelete: func([]*tsdb.Block) map[ulid.ULID]struct{} { return map[ulid.ULID]struct{}{} }, // Always noop
-IsolationDisabled: true,
-EnableOverlappingCompaction: false, // Always false since Mimir only uploads lvl 1 compacted blocks
-OutOfOrderTimeWindow: b.limits.OutOfOrderTimeWindow(userID).Milliseconds(), // The unit must be same as our timestamps.
-OutOfOrderCapMax: int64(b.blocksStorageCfg.TSDB.OutOfOrderCapacityMax),
-EnableNativeHistograms: b.limits.NativeHistogramsIngestionEnabled(userID),
-SecondaryHashFunction: nil, // TODO(codesome): May needed when applying limits. Used to determine the owned series by an ingesters
-SeriesLifecycleCallback: udb,
+RetentionDuration: 0,
+MinBlockDuration: 2 * time.Hour.Milliseconds(),
+MaxBlockDuration: 2 * time.Hour.Milliseconds(),
+NoLockfile: true,
+StripeSize: b.blocksStorageCfg.TSDB.StripeSize,
+HeadChunksWriteBufferSize: b.blocksStorageCfg.TSDB.HeadChunksWriteBufferSize,
+HeadChunksWriteQueueSize: b.blocksStorageCfg.TSDB.HeadChunksWriteQueueSize,
+WALSegmentSize: -1, // No WAL
+BlocksToDelete: func([]*tsdb.Block) map[ulid.ULID]struct{} { return map[ulid.ULID]struct{}{} }, // Always noop
+IsolationDisabled: true,
+EnableOverlappingCompaction: false, // Always false since Mimir only uploads lvl 1 compacted blocks
+OutOfOrderTimeWindow: b.limits.OutOfOrderTimeWindow(userID).Milliseconds(), // The unit must be same as our timestamps.
+OutOfOrderCapMax: int64(b.blocksStorageCfg.TSDB.OutOfOrderCapacityMax),
+EnableNativeHistograms: b.limits.NativeHistogramsIngestionEnabled(userID),
+SecondaryHashFunction: nil, // TODO(codesome): May needed when applying limits. Used to determine the owned series by an ingesters
+SeriesLifecycleCallback: udb,
+HeadPostingsForMatchersCacheMetrics: tsdb.NewPostingsForMatchersCacheMetrics(nil),
+BlockPostingsForMatchersCacheMetrics: tsdb.NewPostingsForMatchersCacheMetrics(nil),
}, nil)
if err != nil {
return nil, err
2 changes: 2 additions & 0 deletions pkg/ingester/ingester.go
@@ -2728,10 +2728,12 @@ func (i *Ingester) createTSDB(userID string, walReplayConcurrency int) (*userTSD
HeadPostingsForMatchersCacheMaxItems: i.cfg.BlocksStorageConfig.TSDB.HeadPostingsForMatchersCacheMaxItems,
HeadPostingsForMatchersCacheMaxBytes: i.cfg.BlocksStorageConfig.TSDB.HeadPostingsForMatchersCacheMaxBytes,
HeadPostingsForMatchersCacheForce: i.cfg.BlocksStorageConfig.TSDB.HeadPostingsForMatchersCacheForce,
HeadPostingsForMatchersCacheMetrics: i.tsdbMetrics.headPostingsForMatchersCacheMetrics,
BlockPostingsForMatchersCacheTTL: i.cfg.BlocksStorageConfig.TSDB.BlockPostingsForMatchersCacheTTL,
BlockPostingsForMatchersCacheMaxItems: i.cfg.BlocksStorageConfig.TSDB.BlockPostingsForMatchersCacheMaxItems,
BlockPostingsForMatchersCacheMaxBytes: i.cfg.BlocksStorageConfig.TSDB.BlockPostingsForMatchersCacheMaxBytes,
BlockPostingsForMatchersCacheForce: i.cfg.BlocksStorageConfig.TSDB.BlockPostingsForMatchersCacheForce,
BlockPostingsForMatchersCacheMetrics: i.tsdbMetrics.blockPostingsForMatchersCacheMetrics,
EnableNativeHistograms: i.limits.NativeHistogramsIngestionEnabled(userID),
EnableOOONativeHistograms: i.limits.OOONativeHistogramsIngestionEnabled(userID),
SecondaryHashFunction: secondaryTSDBHashFunctionForUser(userID),
7 changes: 7 additions & 0 deletions pkg/ingester/metrics.go
@@ -12,6 +12,7 @@ import (
dskit_metrics "github.com/grafana/dskit/metrics"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promauto"
"github.com/prometheus/prometheus/tsdb"
"go.uber.org/atomic"

util_math "github.com/grafana/mimir/pkg/util/math"
@@ -522,6 +523,9 @@ type tsdbMetrics struct {
memSeriesCreatedTotal *prometheus.Desc
memSeriesRemovedTotal *prometheus.Desc

headPostingsForMatchersCacheMetrics *tsdb.PostingsForMatchersCacheMetrics
blockPostingsForMatchersCacheMetrics *tsdb.PostingsForMatchersCacheMetrics

regs *dskit_metrics.TenantRegistries
}

@@ -701,6 +705,9 @@ func newTSDBMetrics(r prometheus.Registerer, logger log.Logger) *tsdbMetrics {
"cortex_ingester_memory_series_removed_total",
"The total number of series that were removed per user.",
[]string{"user"}, nil),

headPostingsForMatchersCacheMetrics: tsdb.NewPostingsForMatchersCacheMetrics(prometheus.WrapRegistererWithPrefix("cortex_ingester_tsdb_head_", r)),
blockPostingsForMatchersCacheMetrics: tsdb.NewPostingsForMatchersCacheMetrics(prometheus.WrapRegistererWithPrefix("cortex_ingester_tsdb_block_", r)),
}

if r != nil {
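`prometheus.WrapRegistererWithPrefix` prepends the given prefix to the name of every metric registered through the wrapped registerer, which is how one constructor can yield both the `head` and `block` metric families. The sketch below models only that name composition using plain strings. `prefixedNames` and `baseNames` are hypothetical, and the unprefixed names are inferred from the CHANGELOG entry rather than read from the vendored code.

```go
package main

import "fmt"

// baseNames are the assumed unprefixed metric names, inferred by stripping the
// registerer prefix from the names listed in the CHANGELOG entry.
var baseNames = []string{
	"postings_for_matchers_cache_hits_total",
	"postings_for_matchers_cache_misses_total",
	"postings_for_matchers_cache_requests_total",
	"postings_for_matchers_cache_skips_total",
}

// prefixedNames models what a prefix-wrapping registerer does to metric names:
// each registered name is exposed as prefix + name.
func prefixedNames(prefix string, base []string) []string {
	out := make([]string, 0, len(base))
	for _, name := range base {
		out = append(out, prefix+name)
	}
	return out
}

func main() {
	// The two prefixes are the ones passed in newTSDBMetrics.
	for _, prefix := range []string{"cortex_ingester_tsdb_head_", "cortex_ingester_tsdb_block_"} {
		for _, name := range prefixedNames(prefix, baseNames) {
			fmt.Println(name)
		}
	}
}
```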
72 changes: 72 additions & 0 deletions pkg/ingester/metrics_test.go
@@ -239,6 +239,42 @@ func TestTSDBMetrics(t *testing.T) {
# HELP cortex_ingester_tsdb_exemplar_exemplars_in_storage Number of TSDB exemplars currently in storage.
# TYPE cortex_ingester_tsdb_exemplar_exemplars_in_storage gauge
cortex_ingester_tsdb_exemplar_exemplars_in_storage 30

# HELP cortex_ingester_tsdb_head_postings_for_matchers_cache_hits_total Total number of postings lists returned from the PostingsForMatchers cache.
# TYPE cortex_ingester_tsdb_head_postings_for_matchers_cache_hits_total counter
cortex_ingester_tsdb_head_postings_for_matchers_cache_hits_total 0

# HELP cortex_ingester_tsdb_head_postings_for_matchers_cache_misses_total Total number of requests to the PostingsForMatchers cache for which there is no valid cached entry. The subsequent result is cached.
# TYPE cortex_ingester_tsdb_head_postings_for_matchers_cache_misses_total counter
cortex_ingester_tsdb_head_postings_for_matchers_cache_misses_total 0

# HELP cortex_ingester_tsdb_head_postings_for_matchers_cache_requests_total Total number of requests to the PostingsForMatchers cache.
# TYPE cortex_ingester_tsdb_head_postings_for_matchers_cache_requests_total counter
cortex_ingester_tsdb_head_postings_for_matchers_cache_requests_total 0

# HELP cortex_ingester_tsdb_head_postings_for_matchers_cache_skips_total Total number of requests to the PostingsForMatchers cache that have been skipped the cache. The subsequent result is not cached.
# TYPE cortex_ingester_tsdb_head_postings_for_matchers_cache_skips_total counter
cortex_ingester_tsdb_head_postings_for_matchers_cache_skips_total{reason="canceled-cached-entry"} 0
cortex_ingester_tsdb_head_postings_for_matchers_cache_skips_total{reason="ineligible"} 0
cortex_ingester_tsdb_head_postings_for_matchers_cache_skips_total{reason="stale-cached-entry"} 0

# HELP cortex_ingester_tsdb_block_postings_for_matchers_cache_hits_total Total number of postings lists returned from the PostingsForMatchers cache.
# TYPE cortex_ingester_tsdb_block_postings_for_matchers_cache_hits_total counter
cortex_ingester_tsdb_block_postings_for_matchers_cache_hits_total 0

# HELP cortex_ingester_tsdb_block_postings_for_matchers_cache_misses_total Total number of requests to the PostingsForMatchers cache for which there is no valid cached entry. The subsequent result is cached.
# TYPE cortex_ingester_tsdb_block_postings_for_matchers_cache_misses_total counter
cortex_ingester_tsdb_block_postings_for_matchers_cache_misses_total 0

# HELP cortex_ingester_tsdb_block_postings_for_matchers_cache_requests_total Total number of requests to the PostingsForMatchers cache.
# TYPE cortex_ingester_tsdb_block_postings_for_matchers_cache_requests_total counter
cortex_ingester_tsdb_block_postings_for_matchers_cache_requests_total 0

# HELP cortex_ingester_tsdb_block_postings_for_matchers_cache_skips_total Total number of requests to the PostingsForMatchers cache that have been skipped the cache. The subsequent result is not cached.
# TYPE cortex_ingester_tsdb_block_postings_for_matchers_cache_skips_total counter
cortex_ingester_tsdb_block_postings_for_matchers_cache_skips_total{reason="canceled-cached-entry"} 0
cortex_ingester_tsdb_block_postings_for_matchers_cache_skips_total{reason="ineligible"} 0
cortex_ingester_tsdb_block_postings_for_matchers_cache_skips_total{reason="stale-cached-entry"} 0
`))
require.NoError(t, err)
}
@@ -457,6 +493,42 @@ func TestTSDBMetricsWithRemoval(t *testing.T) {
# TYPE cortex_ingester_tsdb_out_of_order_samples_appended_total counter
cortex_ingester_tsdb_out_of_order_samples_appended_total{user="user1"} 3
cortex_ingester_tsdb_out_of_order_samples_appended_total{user="user2"} 3

# HELP cortex_ingester_tsdb_head_postings_for_matchers_cache_hits_total Total number of postings lists returned from the PostingsForMatchers cache.
# TYPE cortex_ingester_tsdb_head_postings_for_matchers_cache_hits_total counter
cortex_ingester_tsdb_head_postings_for_matchers_cache_hits_total 0

# HELP cortex_ingester_tsdb_head_postings_for_matchers_cache_misses_total Total number of requests to the PostingsForMatchers cache for which there is no valid cached entry. The subsequent result is cached.
# TYPE cortex_ingester_tsdb_head_postings_for_matchers_cache_misses_total counter
cortex_ingester_tsdb_head_postings_for_matchers_cache_misses_total 0

# HELP cortex_ingester_tsdb_head_postings_for_matchers_cache_requests_total Total number of requests to the PostingsForMatchers cache.
# TYPE cortex_ingester_tsdb_head_postings_for_matchers_cache_requests_total counter
cortex_ingester_tsdb_head_postings_for_matchers_cache_requests_total 0

# HELP cortex_ingester_tsdb_head_postings_for_matchers_cache_skips_total Total number of requests to the PostingsForMatchers cache that have been skipped the cache. The subsequent result is not cached.
# TYPE cortex_ingester_tsdb_head_postings_for_matchers_cache_skips_total counter
cortex_ingester_tsdb_head_postings_for_matchers_cache_skips_total{reason="canceled-cached-entry"} 0
cortex_ingester_tsdb_head_postings_for_matchers_cache_skips_total{reason="ineligible"} 0
cortex_ingester_tsdb_head_postings_for_matchers_cache_skips_total{reason="stale-cached-entry"} 0

# HELP cortex_ingester_tsdb_block_postings_for_matchers_cache_hits_total Total number of postings lists returned from the PostingsForMatchers cache.
# TYPE cortex_ingester_tsdb_block_postings_for_matchers_cache_hits_total counter
cortex_ingester_tsdb_block_postings_for_matchers_cache_hits_total 0

# HELP cortex_ingester_tsdb_block_postings_for_matchers_cache_misses_total Total number of requests to the PostingsForMatchers cache for which there is no valid cached entry. The subsequent result is cached.
# TYPE cortex_ingester_tsdb_block_postings_for_matchers_cache_misses_total counter
cortex_ingester_tsdb_block_postings_for_matchers_cache_misses_total 0

# HELP cortex_ingester_tsdb_block_postings_for_matchers_cache_requests_total Total number of requests to the PostingsForMatchers cache.
# TYPE cortex_ingester_tsdb_block_postings_for_matchers_cache_requests_total counter
cortex_ingester_tsdb_block_postings_for_matchers_cache_requests_total 0

# HELP cortex_ingester_tsdb_block_postings_for_matchers_cache_skips_total Total number of requests to the PostingsForMatchers cache that have been skipped the cache. The subsequent result is not cached.
# TYPE cortex_ingester_tsdb_block_postings_for_matchers_cache_skips_total counter
cortex_ingester_tsdb_block_postings_for_matchers_cache_skips_total{reason="canceled-cached-entry"} 0
cortex_ingester_tsdb_block_postings_for_matchers_cache_skips_total{reason="ineligible"} 0
cortex_ingester_tsdb_block_postings_for_matchers_cache_skips_total{reason="stale-cached-entry"} 0
`))
require.NoError(t, err)
}
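The expected output above pins all eight counters at zero. Once they carry traffic, one plausible use is a hit ratio per cache. A minimal sketch, assuming the counter values have already been read from a scrape; `cacheCounters` and `hitRatio` are hypothetical names and not part of Mimir.

```go
package main

import "fmt"

// cacheCounters holds point-in-time values of the counters for one cache
// (head or block). The struct is illustrative only.
type cacheCounters struct {
	requests float64 // ..._postings_for_matchers_cache_requests_total
	hits     float64 // ..._postings_for_matchers_cache_hits_total
}

// hitRatio returns hits divided by requests, or 0 when there were no requests
// so that an idle cache does not produce NaN.
func hitRatio(c cacheCounters) float64 {
	if c.requests == 0 {
		return 0
	}
	return c.hits / c.requests
}

func main() {
	head := cacheCounters{requests: 4, hits: 1}
	fmt.Printf("head cache hit ratio: %.2f\n", hitRatio(head))
}
```

In practice the ratio would more likely be computed over a rate window in a query than from raw totals; the function only shows the arithmetic.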
6 changes: 3 additions & 3 deletions vendor/github.com/prometheus/prometheus/tsdb/block.go
