Emit disk spill and merge buffer utilisation metrics for GroupBy queries #17360
Conversation
temporaryStorages.add(temporaryStorage);
}

public long getSpilledBytes()
What if a registered "limited temporary storage" is not closed between two invocations of getSpilledBytes()? Would the code double-count the stored bytes? Perhaps that is the intended behavior. If that's the case, we should figure out the intent behind the metric and take a call on whether we want to do that or not.
Regardless of this, I think there should be a metric (either this one after repurposing, or a new one) that indicates the total size of the spilled data per query. This will allow admins to estimate whether queries need a larger merge buffer and by how much. WDYT?
Would the code double-count the stored bytes? Perhaps that is the intended behavior.

Yes, that is the intention. It basically reports the number of bytes spilled at the time the metric is emitted.

Regardless of this, I think there should be a metric (either this one after repurposing, or a new one) that indicates the total size of the spilled data per query. This will allow admins to estimate whether queries need a larger merge buffer and by how much.

Makes sense, but would this be reported at the end of every query?
Yes, the query can update the monitor at the end
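For illustration, a minimal sketch of the accumulation being discussed, assuming LimitedTemporaryStorage exposes a currentSize()-style accessor (an assumption for this sketch, not necessarily the PR's API). Summing over every registered storage means a storage that stays open across readings contributes to each of them, which is the double-counting behavior described above.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative only: LimitedTemporaryStorage and currentSize() are stand-ins here.
class SpilledBytesSketch
{
  interface LimitedTemporaryStorage
  {
    long currentSize(); // assumed accessor for bytes spilled so far by this storage
  }

  private final Queue<LimitedTemporaryStorage> temporaryStorages = new ConcurrentLinkedQueue<>();

  void registerTemporaryStorage(LimitedTemporaryStorage temporaryStorage)
  {
    temporaryStorages.add(temporaryStorage);
  }

  // Each call sums the current size of every registered storage, so a storage that is
  // still open at the next reading contributes to that reading as well.
  long getSpilledBytes()
  {
    long spilledBytes = 0;
    for (LimitedTemporaryStorage storage : temporaryStorages) {
      spilledBytes += storage.currentSize();
    }
    return spilledBytes;
  }
}
```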
private final ConcurrentLinkedQueue<LimitedTemporaryStorage> temporaryStorages;

@Inject
nit: We shouldn't need this
private final AtomicLong resourceAcquisitionTimeNs = new AtomicLong(0);
private final AtomicLong resourceAcquisitionCount = new AtomicLong(0);
Suggested change:
private final AtomicLong mergeBufferAcquisitionTimeNs = new AtomicLong(0);
private final AtomicLong mergeBufferAcquisitionCount = new AtomicLong(0);
Making them atomic seems redundant given that both need to be updated together, and guaranteeing atomicity over a single variable is not useful. The current code does not rely on the AtomicLong behavior anyway, since the variables are already guarded by the provider's lock.
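A minimal sketch of this suggestion, assuming every update and read happens under the provider's lock; the field names mirror the diff, but the class itself is hypothetical.

```java
// Plain longs are sufficient when every access is guarded by the same lock; the AtomicLong
// wrapper adds nothing because the two fields must be updated together anyway.
class AcquisitionStatsSketch
{
  private long mergeBufferAcquisitionTimeNs = 0;
  private long mergeBufferAcquisitionCount = 0;

  synchronized void recordAcquisition(long acquisitionTimeNs)
  {
    mergeBufferAcquisitionTimeNs += acquisitionTimeNs;
    mergeBufferAcquisitionCount++;
  }

  // Returns {timeNs, count} and resets both, e.g. once per monitor cycle.
  synchronized long[] getAndReset()
  {
    final long[] snapshot = {mergeBufferAcquisitionTimeNs, mergeBufferAcquisitionCount};
    mergeBufferAcquisitionTimeNs = 0;
    mergeBufferAcquisitionCount = 0;
    return snapshot;
  }
}
```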
resourceAcquisitionCount.incrementAndGet();
}

public synchronized long getAndResetGroupByResourceAcquisitionStats()
This doesn't seem correct. We are reporting the average as a metric. A better way would be to report the sum as well as the count; that allows a proper weighted average to be computed downstream rather than reporting the average directly from the monitor. While we are amortizing a lot of metrics by emitting them from the monitor, I think it's better to report the count and the sum separately.
I couldn't find a way to report the sum and count together (so that a metric aggregation platform could automatically compute the average). Are you suggesting reporting two different metrics, groupBy/acquisitionTime and groupBy/acquisitionCount?
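For reference, a sketch of what emitting the sum and the count as two separate metrics could look like, mirroring the emission pattern used elsewhere in this PR. The metric names follow the docs rows later in this thread (mergeBuffer/acquisitionTimeNs, mergeBuffer/acquisitionCount), and the wrapper class is hypothetical.

```java
import org.apache.druid.java.util.emitter.service.ServiceEmitter;
import org.apache.druid.java.util.emitter.service.ServiceMetricEvent;

// Emitting both values lets a downstream metrics system compute a weighted average
// (total time / total count) across emission periods.
class AcquisitionEmitSketch
{
  void emit(ServiceEmitter emitter, long totalAcquisitionTimeNs, long acquisitionCount)
  {
    final ServiceMetricEvent.Builder builder = ServiceMetricEvent.builder();
    emitter.emit(builder.setMetric("mergeBuffer/acquisitionTimeNs", totalAcquisitionTimeNs));
    emitter.emit(builder.setMetric("mergeBuffer/acquisitionCount", acquisitionCount));
  }
}
```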
docs/configuration/index.md
Outdated
@@ -401,7 +401,7 @@ Metric monitoring is an essential part of Druid operations. The following monito
|`org.apache.druid.java.util.metrics.CgroupV2MemoryMonitor`| **EXPERIMENTAL** Reports memory usage from `memory.current` and `memory.max` files. Only applicable to `cgroupv2`.|
|`org.apache.druid.server.metrics.HistoricalMetricsMonitor`|Reports statistics on Historical services. Available only on Historical services.|
|`org.apache.druid.server.metrics.SegmentStatsMonitor` | **EXPERIMENTAL** Reports statistics about segments on Historical services. Available only on Historical services. Not to be used when lazy loading is configured.|
|`org.apache.druid.server.metrics.QueryCountStatsMonitor`|Reports how many queries have been successful/failed/interrupted.|
|`org.apache.druid.server.metrics.QueryCountStatsMonitor`|Reports how many queries have been successful/failed/interrupted. It also reports stats for group by queries.|
Suggested change:
|`org.apache.druid.server.metrics.QueryCountStatsMonitor`|Reports how many queries have been successful/failed/interrupted. It also reports statistics for the group by queries.|
Should this be made a separate monitor? I don't see any downside to merging the group-by statistics into QueryCountStatsMonitor, but at the same time, I don't think there's any benefit to it. However, merging group-by statistics with "query count stats" feels incongruent to me.
I don't have strong opinions either; the only reason I merged them was the existing mergeBuffer metric in this monitor. I think a separate monitor would be better due to the added flexibility.
/**
 * @return number of used buffers from the pool
 */
long getUsedBufferCount();
BlockingPool is generic. It shouldn't reference buffers.
Suggested change:
long getUsedResourcesCount();
@@ -119,7 +119,7 @@ public List<ReferenceCountingResourceHolder<T>> takeBatch(final int elementNum)
    throw new RuntimeException(e);
  }
  finally {
    pendingRequests.incrementAndGet();
    pendingRequests.decrementAndGet();
nice catch 💯
  return average;
}

public void registerTemporaryStorage(LimitedTemporaryStorage temporaryStorage)
How often does this method need to be called per groupBy query, and what does it rely upon? I feel that there's an inbuilt expense with tracking the spilled bytes this way. I think it is still fine, but I wonder if a more efficient way would be as follows (a rough sketch appears after this list):
- Keep a running counter of the size occupied.
- LimitedTemporaryStorage implements Closeable. Modify wherever the LimitedTemporaryStorage is closed to subtract that storage's size from the counter. A set-inclusion check can be added: keep a hashset of the open temporary storages tracked by the running counter, and confirm that the removed temporary storage was indeed being tracked.
Note: My approach seems more convoluted, so if there isn't much performance downside to the current version, I think it's fine as is.
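A rough sketch of that alternative, under the assumption that the storage's size is available at close time and that spill writes can report their byte counts to the tracker; all names here are illustrative, not the PR's implementation.

```java
import java.io.Closeable;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Running-counter variant: reads are O(1), at the cost of hooking the write and close paths.
class SpillCounterSketch
{
  interface SizedStorage extends Closeable
  {
    long currentSize(); // assumed accessor for bytes spilled so far by this storage
  }

  private final AtomicLong runningSpilledBytes = new AtomicLong(0);
  private final Set<SizedStorage> tracked = ConcurrentHashMap.newKeySet();

  void onBytesSpilled(SizedStorage storage, long bytes)
  {
    tracked.add(storage);
    runningSpilledBytes.addAndGet(bytes); // incremented wherever bytes hit the disk
  }

  void onStorageClosed(SizedStorage storage)
  {
    // Set-inclusion check: only subtract storages that were actually being tracked.
    if (tracked.remove(storage)) {
      runningSpilledBytes.addAndGet(-storage.currentSize());
    }
  }

  long getSpilledBytes()
  {
    return runningSpilledBytes.get(); // no iteration over the registered storages
  }
}
```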
there's an inbuilt expense with tracking the spilled bytes this way
What is the expense you are referring to? The temporary storage object is added to a queue, and the cost of adding an element should in most cases be O(1).
LimitedTemporaryStorage implements closeable. Modify wherever the LimitedTemporaryStorage is getting closed to subtract the value of the storage from the counter. There can be a set inclusion check where a hashset of the opened temporary storages being tracked by the running counter are stored, and the code confirms that the removed temporary storage was indeed being tracked by the counter.
This approach also requires keeping track of the temporary storages, thus incurring similar overhead to the existing implementation. Additionally, it requires incrementing the "running counter" whenever bytes are written out to disk.
What is the expense you are referring to
Iterating while calculating the stored bytes.
keeping track of the temporary storages
That is true, but calculating the stored bytes won't require iterating over the registered storages.
Iterating while calculating the stored bytes.
I understand. Given that the monitor will run at a fixed interval (every minute or so), is the overhead significant?
I am weighing the implementation complexity against the performance impact.
docs/configuration/index.md
Outdated
@@ -408,6 +408,7 @@ Metric monitoring is an essential part of Druid operations. The following monito
|`org.apache.druid.server.metrics.TaskSlotCountStatsMonitor`|Reports metrics about task slot usage per emission period.|
|`org.apache.druid.server.metrics.WorkerTaskCountStatsMonitor`|Reports how many ingestion tasks are currently running/pending/waiting, the number of successful/failed tasks, and metrics about task slot usage for the reporting worker, per emission period. Only supported by Middle Manager node types.|
|`org.apache.druid.server.metrics.ServiceStatusMonitor`|Reports a heartbeat for the service.|
|`org.apache.druid.server.metrics.GroupByStatsMonitor`|Report statistics for groupBy queries like disk spill, merge buffer usage. |
nit: "disk spill" can be worded better.
Suggested change:
|`org.apache.druid.server.metrics.GroupByStatsMonitor`|Reports metrics for groupBy queries like disk and merge buffer utilized by them. |
docs/operations/metrics.md
Outdated
@@ -86,6 +85,13 @@ Most metric values reset each emission period, as specified in `druid.monitoring
|`subquery/fallback/unknownReason/count`|Number of subqueries which cannot be materialized as frames due other reasons.|This metric is only available if the `SubqueryCountStatsMonitor` module is included.| |
|`query/rowLimit/exceeded/count`|Number of queries whose inlined subquery results exceeded the given row limit|This metric is only available if the `SubqueryCountStatsMonitor` module is included.| |
|`query/byteLimit/exceeded/count`|Number of queries whose inlined subquery results exceeded the given byte limit|This metric is only available if the `SubqueryCountStatsMonitor` module is included.| |
|`mergeBuffer/pendingRequests`|Number of requests waiting to acquire a batch of buffers from the merge buffer pool.|This metric is only available if the `GroupByStatsMonitor` module is included.|Should be 0.|
The recommended value doesn't seem correct. It is fine if queries are waiting on the merge buffer even during normal operation. We can mention "ideally 0, but it is fine if not" (after rewording).
docs/operations/metrics.md
Outdated
@@ -86,6 +85,13 @@ Most metric values reset each emission period, as specified in `druid.monitoring
|`subquery/fallback/unknownReason/count`|Number of subqueries which cannot be materialized as frames due other reasons.|This metric is only available if the `SubqueryCountStatsMonitor` module is included.| |
|`query/rowLimit/exceeded/count`|Number of queries whose inlined subquery results exceeded the given row limit|This metric is only available if the `SubqueryCountStatsMonitor` module is included.| |
|`query/byteLimit/exceeded/count`|Number of queries whose inlined subquery results exceeded the given byte limit|This metric is only available if the `SubqueryCountStatsMonitor` module is included.| |
|`mergeBuffer/pendingRequests`|Number of requests waiting to acquire a batch of buffers from the merge buffer pool.|This metric is only available if the `GroupByStatsMonitor` module is included.|Should be 0.|
|`mergeBuffer/usedCount`|Number of merge buffers used from the merge buffer pool.|This metric is only available if the `GroupByStatsMonitor` module is included.|Depends on the number of groupBy queries needing merge buffers.|
|`mergeBuffer/acquisitionCount`|Number of times groupBy queries acquired merge buffers.|This metric is only available if the `GroupByStatsMonitor` module is included.|Depends on the number of groupBy queries needing merge buffers.|
Number of times groupBy queries acquired merge buffers.|
Is it cumulative or per query? A query should acquire a merge buffer only once. I think if we reword the metric name or the documentation, it will be clear to users.
It is cumulative; I have renamed it to mergeBuffer/queries.
docs/operations/metrics.md
Outdated
|`mergeBuffer/pendingRequests`|Number of requests waiting to acquire a batch of buffers from the merge buffer pool.|This metric is only available if the `GroupByStatsMonitor` module is included.|Should be 0.|
|`mergeBuffer/usedCount`|Number of merge buffers used from the merge buffer pool.|This metric is only available if the `GroupByStatsMonitor` module is included.|Depends on the number of groupBy queries needing merge buffers.|
|`mergeBuffer/acquisitionCount`|Number of times groupBy queries acquired merge buffers.|This metric is only available if the `GroupByStatsMonitor` module is included.|Depends on the number of groupBy queries needing merge buffers.|
|`mergeBuffer/acquisitionTimeNs`|Total time in nanoseconds to acquire merge buffer for groupBy queries.|This metric is only available if the `GroupByStatsMonitor` module is included.|Should be as low as possible.|
Suggested change:
|`mergeBuffer/acquisitionTimeNs`|Total time in nanoseconds to acquire merge buffer for groupBy queries.|This metric is only available if the `GroupByStatsMonitor` module is included.|Varies.|
docs/operations/metrics.md
Outdated
|`mergeBuffer/pendingRequests`|Number of requests waiting to acquire a batch of buffers from the merge buffer pool.|This metric is only available if the `GroupByStatsMonitor` module is included.|Should be 0.|
|`mergeBuffer/usedCount`|Number of merge buffers used from the merge buffer pool.|This metric is only available if the `GroupByStatsMonitor` module is included.|Depends on the number of groupBy queries needing merge buffers.|
|`mergeBuffer/acquisitionTimeNs`|Total time in nanoseconds to acquire merge buffer for groupBy queries.|This metric is only available if the `GroupByStatsMonitor` module is included.|Should be as low as possible.|
|`mergeBuffer/acquisitionCount`|Number of times groupBy queries acquired merge buffers.|This metric is only available if the `GroupByStatsMonitor` module is included.|Depends on the number of groupBy queries needing merge buffers.|
|`groupBy/spilledQueries`|Number of groupBy queries that have spilled onto the disk.|This metric is only available if the `GroupByStatsMonitor` module is included.|Varies|
|`groupBy/spilledBytes`|Number of bytes spilled on the disk by the groupBy queries.|This metric is only available if the `GroupByStatsMonitor` module is included.|Varies|
|`groupBy/mergeDictionarySize`|Size of on-heap merge dictionary in bytes.|This metric is only available if the `GroupByStatsMonitor` module is included.|Varies|
I wonder if there's a way to reference all of these in a single place instead; it seems wasteful to maintain two copies of the same documentation. LMK if there are any prior metrics that do that.
Moved them to a new section.
@@ -13682,4 +13755,16 @@ private void cannotVectorize()
    expectedException.expectMessage("Cannot vectorize!");
  }
}

private void verifyMetrics(long queries, boolean skipMergeDictionary)
What is skipMergeDictionary and when is it required?
Renamed. This is basically to skip verifying the merge dictionary size metric, since some of the queries aggregate on non-string dimensions.
if (config.equals(V2_SMALL_BUFFER_CONFIG)) {
  verifyMetrics(1, true);
}
Why are the assertions being done on a single config?
This is the config best suited to verify all the metrics. For example, with a larger buffer size it is not possible to verify the disk spill metrics. I have moved this check into the method and renamed the method.
  return aggregateStatsContainer.reset();
}

public static class AggregateStats
Why are some stats AggregateStats, and some PerQuery? Wouldn't the monitor aggregate them all?
PerQueryStats is created per query and closed at the end of the query. AggregateStats is "reset" after each monitor cycle, and each PerQueryStats is accumulated into AggregateStats when its query is closed. The monitor simply reports the aggregate metrics.
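A simplified sketch of that lifecycle, showing only a spilled-bytes field; the key type and the method names here are illustrative rather than the PR's exact signatures.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// One PerQueryStats object per query, folded into AggregateStats when the query closes;
// the monitor reads and resets the aggregate once per cycle.
class StatsProviderSketch
{
  static class PerQueryStats
  {
    long spilledBytes;
  }

  static class AggregateStats
  {
    long spilledBytes;

    synchronized void addQueryStats(PerQueryStats stats)
    {
      spilledBytes += stats.spilledBytes;
    }

    // Returns a snapshot of the accumulated values and zeroes the running totals.
    synchronized AggregateStats reset()
    {
      AggregateStats snapshot = new AggregateStats();
      snapshot.spilledBytes = spilledBytes;
      spilledBytes = 0;
      return snapshot;
    }
  }

  private final Map<String, PerQueryStats> perQueryStats = new ConcurrentHashMap<>();
  private final AggregateStats aggregateStatsContainer = new AggregateStats();

  PerQueryStats perQueryStatsFor(String queryResourceId)
  {
    return perQueryStats.computeIfAbsent(queryResourceId, id -> new PerQueryStats());
  }

  void closeQuery(String queryResourceId)
  {
    PerQueryStats stats = perQueryStats.remove(queryResourceId);
    if (stats != null) {
      aggregateStatsContainer.addQueryStats(stats);
    }
  }

  AggregateStats getStatsSince()
  {
    return aggregateStatsContainer.reset();
  }
}
```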
/**
 * Metrics collector for groupBy queries like spilled bytes, merge buffer acquisition time, dictionary size.
 */
@LazySingleton
Should it be registered in a Module if this is a LazySingleton? (I am not very familiar, just curious if this is working as expected)
Yes, it works as expected. Guice should automatically detect classes marked with this annotation and bind them in singleton scope.
Thanks for confirming!
public synchronized void closeQuery(QueryResourceId resourceId)
{
  if (resourceId == null || !perQueryStats.containsKey(resourceId)) {
    return;
  }
  PerQueryStats container = perQueryStats.remove(resourceId);
  aggregateStatsContainer.addQueryStats(container);
}
This is not intuitive; callers won't remember to call closeQuery(). Is it possible to mold it as a Closer/Closeable? PerQueryStats can implement Closeable, and close() can do that. It is still not great, since closing "Stats" doesn't seem right in the first place. Perhaps renaming PerQueryStats to PerQueryStatsContainer would also help.
This is not intuitive; callers won't remember to call closeQuery().

Even if it implements Closeable, it would still need to be explicitly registered with the query closer, right? I am doing the same thing in GroupByQueryQueryToolChest:
closer.register(() -> groupByStatsProvider.closeQuery(query.context().getQueryResourceId()));
LMK what you think.
it would require to be explicitly registered with the query closer
Yes. Implementing the interface makes it easier for the developer to call the closer. I have had experience in the past where an implicit contract was not apparent, which led to a resource leak. Not a blocker.
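A sketch of the Closeable shape being discussed, where the per-query container folds itself into the aggregate on close and the call site only needs closer.register(container). StatsCloser stands in for the provider and is an assumption of this sketch, not the PR's API.

```java
import java.io.Closeable;

// Hypothetical Closeable wrapper: close() routes back to the provider's closeQuery().
class PerQueryStatsContainer implements Closeable
{
  interface StatsCloser
  {
    void closeQuery(String queryResourceId); // assumed provider hook
  }

  private final StatsCloser provider;
  private final String queryResourceId;

  PerQueryStatsContainer(StatsCloser provider, String queryResourceId)
  {
    this.provider = provider;
    this.queryResourceId = queryResourceId;
  }

  @Override
  public void close()
  {
    // Folds this query's stats into the aggregate; e.g. closer.register(container) at the call site.
    provider.closeQuery(queryResourceId);
  }
}
```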
emitter.emit(builder.setMetric("mergeBuffer/usedCount", mergeBufferPool.getUsedResourcesCount()));

GroupByStatsProvider.AggregateStats statsContainer = groupByStatsProvider.getStatsSince();

if (statsContainer.getMergeBufferAcquisitionCount() > 0) {
  emitter.emit(builder.setMetric("mergeBuffer/acquisitionCount", statsContainer.getMergeBufferAcquisitionCount()));
usedCount and acquisitionCount are difficult to disambiguate. Can we rename them?
Renamed mergeBuffer/usedCount -> mergeBuffer/used and mergeBuffer/acquisitionCount -> mergeBuffer/queries. LMK if this is easier to understand.
Waiting on a minor change. Will merge this after that.
The minor change was to use the query id instead of the resource reservation id in the logic, but that is not a user-facing thing and can be picked up in subsequent patches. I have merged this.
This change emits the following metrics as part of the GroupByStatsMonitor monitor:
- mergeBuffer/used -> Number of merge buffers used.
- mergeBuffer/acquisitionTimeNs -> Total time required to acquire merge buffer.
- mergeBuffer/acquisition -> Number of queries that acquired a batch of merge buffers.
- groupBy/spilledQueries -> Number of queries that spilled onto the disk.
- groupBy/spilledBytes -> Spilled bytes on the disk.
- groupBy/mergeDictionarySize -> Size of the merging dictionary.

Testing
- GroupByStatsMonitor is reporting the new metrics.
- GroupByStatsProvider.

Release Notes
Going forward, the GroupByStatsMonitor monitor should be used for emitting the mergeBuffer/pendingRequests metric instead of QueryCountStatsMonitor. In a subsequent release this metric will no longer be emitted from QueryCountStatsMonitor.