docs: fix 404 links (#17617)
vtlim authored Jan 10, 2025
1 parent 12e88b7 commit c3e5977
Showing 2 changed files with 7 additions and 5 deletions.
10 changes: 6 additions & 4 deletions docs/development/extensions-core/test-stats.md
@@ -1,6 +1,6 @@
---
id: test-stats
-title: "Test Stats Aggregators"
+title: "Test stats aggregators"
---

<!--
@@ -23,13 +23,14 @@ title: "Test Stats Aggregators"
-->


-This Apache Druid extension incorporates test statistics related aggregators, including z-score and p-value. Please refer to [https://www.paypal-engineering.com/2017/06/29/democratizing-experimentation-data-for-product-innovations/](https://www.paypal-engineering.com/2017/06/29/democratizing-experimentation-data-for-product-innovations/) for math background and details.
+The `druid-stats` extension for Apache Druid incorporates aggregators to compute test statistics, including z-scores and p-values.
+Please refer to [Democratizing Experimentation Data for Product Innovations](https://medium.com/paypal-tech/democratizing-experimentation-data-for-product-innovations-8b6e1cf40c27) for math background and details.

Make sure to include `druid-stats` extension in order to use these aggregators.

## Z-Score for two sample ztests post aggregator

-Please refer to [https://www.isixsigma.com/tools-templates/hypothesis-testing/making-sense-two-proportions-test/](https://www.isixsigma.com/tools-templates/hypothesis-testing/making-sense-two-proportions-test/) and [http://www.ucs.louisiana.edu/~jcb0773/Berry_statbook/Berry_statbook_chpt6.pdf](http://www.ucs.louisiana.edu/~jcb0773/Berry_statbook/Berry_statbook_chpt6.pdf) for more details.
+Please refer to [Making Sense of the Two-Proportions Test](https://www.isixsigma.com/tools-templates/hypothesis-testing/making-sense-two-proportions-test/) and [An Introduction to Statistics: Comparing Two Means](https://userweb.ucs.louisiana.edu/~jcb0773/Berry_statbook/427bookall-August2024.pdf) for more details.

z = (p1 - p2) / S.E. (assuming null hypothesis is true)

@@ -41,6 +42,7 @@ S.E. = sqrt{ p1 * ( 1 - p1 )/n1 + p2 * (1 - p2)/n2) }
(p1 – p2) is the observed difference between two sample proportions.
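
As a rough illustration of the formulas above, the following Python sketch computes the two-sample z-score from raw counts. It is not the extension's implementation; the function and parameter names are hypothetical, and it assumes `p1` and `p2` are each a success count divided by the corresponding sample size.

```python
import math


def zscore_two_sample(success_count1, sample_size1, success_count2, sample_size2):
    """Illustrative two-sample z-score for proportions (hypothetical helper)."""
    # Observed proportions, assumed to be success count / sample size.
    p1 = success_count1 / sample_size1
    p2 = success_count2 / sample_size2
    # Standard error: sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2).
    se = math.sqrt(p1 * (1 - p1) / sample_size1 + p2 * (1 - p2) / sample_size2)
    if se == 0:
        # Both proportions are exactly 0 or 1, so the ratio is undefined.
        raise ValueError("standard error is zero; z-score is undefined")
    return (p1 - p2) / se


z = zscore_two_sample(300, 500, 450, 600)
```

With these made-up counts, `p1` is 0.6 and `p2` is 0.75, so the resulting z-score is negative.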

### zscore2sample post aggregator

* **`zscore2sample`**: calculate the z-score using two-sample z-test while converting binary variables (***e.g.*** success or not) to continuous variables (***e.g.*** conversion rate).

```json
@@ -74,7 +76,7 @@ p2 = (successCount2) / (sample size 2)
}
```

-## Example Usage
+## Example usage

In this example, we use zscore2sample post aggregator to calculate z-score, and then feed the z-score to pvalue2tailedZtest post aggregator to calculate p-value.
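
For intuition about the second step, the sketch below converts a z-score into a two-tailed p-value using the standard normal distribution. This is the textbook calculation and only an assumption about what the post aggregator computes; the function name is hypothetical.

```python
import math


def pvalue_two_tailed(z):
    """Two-tailed p-value for a z-score under the standard normal distribution."""
    # P(|Z| >= |z|) = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


p = pvalue_two_tailed(1.96)  # close to the conventional 0.05 threshold
```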

2 changes: 1 addition & 1 deletion docs/querying/sql-translation.md
@@ -803,7 +803,7 @@ the query hits `maxStreamLength`: the maximum number of items to store in each s
See [GitHub issue 11544](https://github.com/apache/druid/issues/11544) for more details.
To workaround the issue, increase value of the maximum string length with the `approxQuantileDsMaxStreamLength` parameter
in the query context. Since it is set to 1,000,000,000 by default, you don't need to override it in most cases.
-See [accuracy information](https://datasketches.apache.org/docs/Quantiles/OrigQuantilesSketch) in the DataSketches documentation for how many bytes are required per stream length.
+See [accuracy information](https://datasketches.apache.org/docs/Quantiles/ClassicQuantilesSketch.html) in the DataSketches documentation for how many bytes are required per stream length.
This query context parameter is a temporary solution to avoid the known issue. It may be removed in a future release after the bug is fixed.
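
As a minimal sketch of overriding the parameter, the Python snippet below builds a request body for the Druid SQL API with `approxQuantileDsMaxStreamLength` set in the query context. The table name, column name, and override value are hypothetical placeholders, and the snippet only constructs the JSON payload rather than sending it.

```python
import json


def build_sql_request(query, max_stream_length):
    """Build a Druid SQL API request body with a query context override."""
    return {
        "query": query,
        # Query context parameter named in the text above.
        "context": {"approxQuantileDsMaxStreamLength": max_stream_length},
    }


# Hypothetical datasource, column, and override value, for illustration only.
payload = build_sql_request(
    'SELECT APPROX_QUANTILE_DS("latency", 0.95) FROM "example_table"',
    2_000_000_000,
)
body = json.dumps(payload)
```

The resulting `body` string could then be sent as a POST request to the `/druid/v2/sql` endpoint of a Router or Broker.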

## Unsupported features