holtzy committed Jan 16, 2025
1 parent 2dbc59e commit cfdc1b3
Showing 6 changed files with 109 additions and 107 deletions.
113 changes: 65 additions & 48 deletions pages/example/t-test-playground.tsx
@@ -4,20 +4,20 @@ import TitleAndDescription from 'component/TitleAndDescription';
import ChartFamilySection from 'component/ChartFamilySection';
import { CodeBlock } from 'component/UI/CodeBlock';
import { ChartOrSandbox } from 'component/ChartOrSandbox';
import Link from 'next/link';
import { ArcDiagramVerticalDemo } from 'viz/ArcDiagramVertical/ArcDiagramVerticalDemo';
import { LinkAsButton } from 'component/LinkAsButton';
import { ScatterplotCanvasBasicDemo } from 'viz/ScatterplotCanvas/ScatterplotCanvasBasicDemo';
import { ScatterplotR2PlaygroundDemo } from '@/viz/ScatterplotR2Playground/ScatterplotR2PlaygroundDemo';

import { TTestPlaygroundDemo } from '@/viz/TTestPlayground/TTestPlayground';
import Link from 'next/link';

const graphDescription = (
<>
<p>A p-value is nothing without effect size.</p>
<p>
This post features an interactive sandbox that explores several edge
cases, demonstrating how relying on these summary statistics without
visualizing the data can be <b>dangerously misleading</b>.
You just spent hours analyzing data and got a <b>p-value of 0.051</b>.
Does that make your findings meaningless? Would 0.049 really change
everything?
</p>
<p>
This post explains why <b>chasing statistical significance</b> often
misses the bigger picture. <b>Practical significance</b> matters more.
</p>
</>
);
@@ -33,77 +33,94 @@ export default function Home() {

{/*
//
// What is R2
//
//
*/}
<h2 id="definition">🤔 What are p-value and effect size</h2>
<h3>&rarr; p-value</h3>
<h2 id="t-test">🤔 Comparing Two Groups</h2>
<h3>&rarr; Data and Question</h3>
<p>
R², or the{' '}
<a
href="https://en.wikipedia.org/wiki/Coefficient_of_determination"
target="_blank"
>
coefficient of determination
</a>
, measures the <b>proportion of variance</b> in the <u>dep</u>endent
variable that is explained by the <u>indep</u>endent variable in a
regression model.
You have values for <b>two groups</b>. Calculating the <b>mean</b> and{' '}
<b>variance</b> for each group is straightforward.
</p>
<p>
It ranges from <code>0</code> to <code>1</code>, with higher values
indicating a stronger linear relationship.
But are the differences statistically significant? Can we conclude that
the groups are <b>meaningfully different</b>, or could the observed
differences be due to random chance?
</p>

<h3>&rarr; effect size</h3>
<h3>&rarr; T-Test, p-Value, and Effect Size</h3>
<p>
The{' '}
<a
href="https://en.wikipedia.org/wiki/Correlation_coefficient"
href="https://en.wikipedia.org/wiki/Student%27s_t-test"
target="_blank"
>
correlation coefficient
t-test
</a>{' '}
(<code>r</code>) measures the <b>strength</b> and <b>direction</b> of a
linear relationship between two variables, ranging from <code>-1</code>{' '}
to <code>1</code>. R² is actually the square of the correlation
coefficient in a simple linear regression!
</p>
<p>
The correlation describes the <b>relationship</b> directly, R² focuses
on the <b>explanatory power </b>of a regression model.
is a statistical method designed precisely for this purpose. The result
is a <b>p-value</b>: the{' '}
<b>probability of observing a difference at least this large</b> if the
null hypothesis (no difference between groups) is true.
</p>

{/*
//
// Plot and code
//
*/}
<h2 id="sandbox">🎮 Scatterplot, R², and Draggable Circles</h2>
<h2 id="sandbox">🎮 Visualizing the p-value</h2>
<p>
Summary statistics are popular because they condense large datasets into
a few <b>easy-to-understand numbers</b>. However, relying solely on them
can lead to a <b>false sense of clarity</b>.
The sliders below let you experiment with <b>sample size</b>,{' '}
<b>effect size</b>
(average difference), and <b>standard deviation</b>. The values of both
groups are displayed on a <Link href="/boxplot">boxplot</Link> using
jittering to illustrate their distribution.
</p>
<p>
The graph below showcases datasets with high R² and correlation values,
even when there's clearly <b>no meaningful relationship</b> between x
and y.
Adjust the sliders to build an intuition about how the p-value changes!
</p>
<p>
Bonus: the circles are <b>draggable</b>! Experiment by moving them
around and watch how the R² and correlation change in real time. It’s a
great way to build intuition about these metrics.
For example, consider this scenario. With a standard deviation of{' '}
<code>4</code> and an effect size of <code>2</code>, the difference
won't be statistically significant with a sample size of <code>25</code>
. However, it will become significant if the sample size is increased to{' '}
<code>50</code>!
</p>

<ChartOrSandbox
vizName={'TTestPlayground'}
VizComponent={TTestPlaygroundDemo}
maxWidth={500}
height={580}
caption="Understand the pvalue."
maxWidth={700}
height={880}
caption="Explore how the p-value behaves."
/>

<h2 id="conclusion">Conclusion</h2>
<p>
In summary, increasing the sample size will eventually make any non-zero
difference statistically significant. But{' '}
<b>does that mean it truly matters</b>? Sometimes yes, sometimes no: it
depends on the context.
</p>

<p>
Statistical tools like p-values are valuable for analyzing data, but
they're just <b>one part of the bigger picture</b>.
</p>
<p>
Focusing solely on whether a result is "statistically significant" can
lead to <b>misleading interpretations</b> and overlook the practical
importance of findings.
</p>
<p>
By considering effect sizes, confidence intervals, and the context of
your data, you can draw conclusions that are not only statistically
sound but also <b>meaningful and impactful</b>.
</p>
<p>
Stop chasing the p-value. Start seeking the story behind the numbers!
</p>

<div className="full-bleed border-t h-0 bg-gray-100 mb-3 mt-24" />
<ChartFamilySection chartFamily="flow" />
<div className="mt-20" />
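
The scenario quoted in the page copy (standard deviation `4`, effect size `2`, sample size `25` versus `50`) can be checked by hand. The sketch below is not the commit's `calculatePValue` (its body is not shown in this diff, and it relies on jStat); it is a self-contained stand-in that assumes a pooled two-sample t-test, that "sample size" means the size of each group, and that the sample mean difference and standard deviation exactly equal the slider values. The helper names `logGamma`, `tDensity`, `twoSidedPValue`, and `scenarioPValue` are hypothetical.

```typescript
// Log-gamma via the Lanczos approximation (g = 7, 9 coefficients).
function logGamma(x: number): number {
  const c = [
    0.99999999999980993, 676.5203681218851, -1259.1392167224028,
    771.32342877765313, -176.61502916214059, 12.507343278686905,
    -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7,
  ];
  if (x < 0.5) {
    // Reflection formula for small arguments.
    return Math.log(Math.PI / Math.sin(Math.PI * x)) - logGamma(1 - x);
  }
  const z = x - 1;
  const t = z + 7.5;
  let sum = c[0];
  for (let i = 1; i < c.length; i++) sum += c[i] / (z + i);
  return 0.5 * Math.log(2 * Math.PI) + (z + 0.5) * Math.log(t) - t + Math.log(sum);
}

// Density of Student's t distribution with `df` degrees of freedom.
function tDensity(t: number, df: number): number {
  const logNorm =
    logGamma((df + 1) / 2) - logGamma(df / 2) - 0.5 * Math.log(df * Math.PI);
  return Math.exp(logNorm - ((df + 1) / 2) * Math.log(1 + (t * t) / df));
}

// Two-sided p-value: 1 - 2 * (area under the density between 0 and |t|),
// with the area computed by Simpson's rule (`steps` must be even).
function twoSidedPValue(t: number, df: number, steps = 2000): number {
  const upper = Math.abs(t);
  const h = upper / steps;
  let sum = tDensity(0, df) + tDensity(upper, df);
  for (let i = 1; i < steps; i++) {
    sum += (i % 2 === 1 ? 4 : 2) * tDensity(i * h, df);
  }
  return Math.max(0, 1 - 2 * (h / 3) * sum);
}

// Pooled t-test for two groups of equal size and equal standard deviation
// (an assumption of this sketch, not necessarily what the playground does).
function scenarioPValue(meanDiff: number, sd: number, nPerGroup: number): number {
  const standardError = sd * Math.sqrt(2 / nPerGroup);
  const t = meanDiff / standardError;
  return twoSidedPValue(t, 2 * nPerGroup - 2);
}

const p25 = scenarioPValue(2, 4, 25); // t is about 1.77 with 48 degrees of freedom
const p50 = scenarioPValue(2, 4, 50); // t is exactly 2.5 with 98 degrees of freedom
console.log({ p25, p50, significantAt25: p25 <= 0.05, significantAt50: p50 <= 0.05 });
```

Under these assumptions the t statistic is `(meanDiff / sd) * Math.sqrt(nPerGroup / 2)`, so it grows with the square root of the sample size. That is why a fixed non-zero difference eventually crosses the 0.05 threshold as the sample grows. The playground draws random samples, so the p-values it displays will fluctuate around these idealized figures.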
10 changes: 3 additions & 7 deletions viz/TTestPlayground/AxisLeft.tsx
@@ -31,18 +31,14 @@ export const AxisLeft = ({ yScale, pixelsPerTick, width }: AxisLeftProps) => {
transform={`translate(0, ${yOffset})`}
shapeRendering={'crispEdges'}
>
<line
x1={-TICK_LENGTH}
x2={width + TICK_LENGTH}
stroke="#D2D7D3"
strokeWidth={0.5}
/>
<line x1={-TICK_LENGTH} x2={800} stroke="#D2D7D3" strokeWidth={0.5} />
<text
key={value}
style={{
fontSize: '10px',
textAnchor: 'middle',
transform: 'translateX(-20px)',
alignmentBaseline: 'central',
transform: 'translate(-20px, -2px)',
fill: '#D2D7D3',
}}
>
17 changes: 7 additions & 10 deletions viz/TTestPlayground/Boxplot.tsx
@@ -4,6 +4,7 @@ import { getSummaryStats } from './summary-stats';
import { AxisLeft } from './AxisLeft';
import { AxisBottom } from './AxisBottomCategoric';
import { VerticalBox } from './VerticalBox';
import { Circle } from './Circle';

const MARGIN = { top: 30, right: 30, bottom: 30, left: 50 };
const JITTER_WIDTH = 40;
@@ -19,6 +20,8 @@ export const Boxplot = ({ width, height, data }: BoxplotProps) => {
const boundsWidth = width - MARGIN.right - MARGIN.left;
const boundsHeight = height - MARGIN.top - MARGIN.bottom;

const sizeScale = d3.scaleLinear().domain([5, 420]).range([15, 2]);

// Compute everything derived from the dataset:
const { chartMin, chartMax, groups } = useMemo(() => {
const [chartMin, chartMax] = d3.extent(data.map((d) => d.value)) as [
@@ -32,7 +35,7 @@ export const Boxplot = ({ width, height, data }: BoxplotProps) => {
}, [data]);

// Compute scales
const yScale = d3.scaleLinear().domain([-1, 20]).range([boundsHeight, 0]);
const yScale = d3.scaleLinear().domain([-10, 30]).range([boundsHeight, 0]);

const xScale = d3
.scaleBand()
@@ -58,17 +61,15 @@ export const Boxplot = ({ width, height, data }: BoxplotProps) => {
const { min, q1, median, q3, max } = sumStats;

const allCircles = groupData.map((value, i) => (
<circle
<Circle
key={i}
cx={
xScale.bandwidth() / 2 -
JITTER_WIDTH / 2 +
Math.random() * JITTER_WIDTH
}
cy={yScale(value)}
r={4}
fill="grey"
fillOpacity={0.3}
r={sizeScale(data.length)}
/>
));

@@ -97,12 +98,8 @@ export const Boxplot = ({ width, height, data }: BoxplotProps) => {
height={boundsHeight}
transform={`translate(${[MARGIN.left, MARGIN.top].join(',')})`}
>
{allShapes}
<AxisLeft yScale={yScale} pixelsPerTick={30} />
{/* X axis uses an additional translation to appear at the bottom */}
<g transform={`translate(0, ${boundsHeight})`}>
<AxisBottom xScale={xScale} />
</g>
{allShapes}
</g>
</svg>
</div>
28 changes: 4 additions & 24 deletions viz/TTestPlayground/Circle.tsx
@@ -4,41 +4,21 @@ type CircleVizProps = {
r: number;
cx: number;
cy: number;
fill: string;
stroke: string;
fillOpacity: number;
strokeWidth: number;
onMouseDown: () => void;
onMouseUp: () => void;
};

export const Circle = ({
r,
cx,
cy,
fill,
stroke,
fillOpacity,
strokeWidth,
onMouseDown,
onMouseUp,
}: CircleVizProps) => {
export const Circle = ({ r, cx, cy }: CircleVizProps) => {
const springProps = useSpring({
to: { r, cx, cy, fill, stroke, fillOpacity, strokeWidth },
to: { r, cx, cy },
});

return (
<animated.circle
cursor={'pointer'}
strokeWidth={springProps.strokeWidth}
fillOpacity={springProps.fillOpacity}
r={springProps.r}
cy={springProps.cy}
cx={springProps.cx}
stroke={springProps.stroke}
fill={springProps.fill}
onMouseDown={onMouseDown}
onMouseUp={onMouseUp}
fill="grey"
fillOpacity={0.3}
/>
);
};
23 changes: 14 additions & 9 deletions viz/TTestPlayground/TTestPlayground.tsx
@@ -2,10 +2,10 @@ import { useState } from 'react';
import { Boxplot } from './Boxplot';
import { jStat } from 'jstat';

const HEADER_HEIGHT = 280;
const HEADER_HEIGHT = 180;

export const TTestPlaygroundDemo = ({ width = 700, height = 400 }) => {
const [sampleSize, setSampleSize] = useState(1000);
const [sampleSize, setSampleSize] = useState(100);
const [effectSize, setEffectSize] = useState(0);
const [stDev, setStDev] = useState(1);

@@ -37,14 +37,13 @@ export const TTestPlaygroundDemo = ({ width = 700, height = 400 }) => {
const res = calculatePValue(vals1, vals2);

const parameterSliders = (
<div className="flex flex-col items-start gap-2">
<span className="mt-2 font-thin">&rarr; Parameters</span>
<div className="flex flex-col items-start gap-2 py-6">
<div style={{ display: 'flex', alignItems: 'center' }}>
<span className="text-sm w-32">Sample Size:</span>
<input
type="range"
min={5}
max={1000}
max={200}
value={sampleSize}
step={5}
onChange={(e) => setSampleSize(Number(e.target.value))}
@@ -84,16 +83,22 @@ export const TTestPlaygroundDemo = ({ width = 700, height = 400 }) => {
);

const results = (
<div className="flex flex-col items-start gap-2 mt-8">
<span className="mt-2 font-thin">&rarr; Results</span>
<span className="text-sm">{'p-value is ' + res.pValue?.toFixed(6)}</span>
<div className="pt-2">
<span className="text-sm border rounded-sm px-2 py-1 bg-slate-100">
{'p-value is ' + res.pValue?.toFixed(6)}
</span>
<span className="text-sm ml-6">
{res.pValue <= 0.05 && <span>✅ Significant</span>}
{res.pValue > 0.05 && <span>❌ Not Significant</span>}
</span>
</div>
);

return (
<div>
<div style={{ height: HEADER_HEIGHT }}>
<div style={{ height: HEADER_HEIGHT }} className="pl-12">
{parameterSliders}
<hr />
{results}
</div>

25 changes: 16 additions & 9 deletions viz/TTestPlayground/VerticalBox.tsx
@@ -1,3 +1,5 @@
import { animated, useSpring, config } from 'react-spring';

const STROKE_WIDTH = 40;

// A reusable component that builds a vertical box shape using svg
@@ -24,29 +26,34 @@ export const VerticalBox = ({
stroke,
fill,
}: VerticalBoxProps) => {
const springProps = useSpring({
to: { min, q1, median, q3, max, diff: q1 - q3 },
config: config.molasses,
});

return (
<>
<line
<animated.line
x1={width / 2}
x2={width / 2}
y1={min}
y2={max}
y1={springProps.min}
y2={springProps.max}
stroke={stroke}
width={STROKE_WIDTH}
/>
<rect
<animated.rect
x={0}
y={q3}
y={springProps.q3}
width={width}
height={q1 - q3}
height={springProps.diff}
stroke={stroke}
fill={fill}
/>
<line
<animated.line
x1={0}
x2={width}
y1={median}
y2={median}
y1={springProps.median}
y2={springProps.median}
stroke={stroke}
width={STROKE_WIDTH}
/>