Merge pull request #49 from PeerHerholz/updates_2024
fix n < p description
PeerHerholz authored Oct 25, 2024
2 parents 195bd71 + 801905b commit d4f6d47
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion content/haxby_data.ipynb
@@ -51,7 +51,7 @@
"```{admonition} Bonus question: ever heard of the \"small-n-high-p\" (p >> n) problem?\n",
":class: tip, dropdown\n",
"\n",
"\"Classical\" `machine learning`/`decoding` models and the underlying algorithms operate on the assumption that are more `predictors` or `features` than there are `sample`. In fact many more. Why is that?\n",
"\"Classical\" `machine learning`/`decoding` models and the underlying algorithms operate on the assumption that are more `samples` than there are `predictors` or `features` . In fact many more. Why is that?\n",
"Consider a high-dimensional `space` whose `dimensions` are defined by the number of `features` (e.g. `10 features` would result in a space with `10 dimensions`. The resulting `volume` of this `space` is the amount of `samples` that could be drawn from the `domain` and the number of `samples` entail the `samples` you need to address your `learning problem`, ie `decoding` outcome. That is why folks say: \"get more data\", `machine learning` is `data`-hungry: our `sample` needs to be as representative of the high-dimensional domain as possible. Thus, as the number of `features` increases, so should the number of `samples` so to capture enough of the `space` for the `decoding model` at hand.\n",
"\n",
"This referred to as the [curse of dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality) and poses as a major problem in many fields that aim to utilize `machine learning`/`decoding` on unsuitable data. Why is that?\n",
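The paragraph above argues that a fixed number of `samples` covers less and less of the `feature` `space` as the number of `features` grows. Below is a minimal sketch of that effect, assuming only NumPy; the sample sizes, feature counts, and seed are illustrative and not taken from the Haxby data.

```python
# Minimal sketch (assumption: NumPy only; illustrative sizes, not the Haxby data).
# Keep the number of samples fixed and grow the number of features:
# the mean nearest-neighbour distance increases, i.e. the samples become
# ever sparser in the feature space (curse of dimensionality).
import numpy as np

rng = np.random.default_rng(0)
n_samples = 100  # fixed, small "n"

for n_features in (2, 10, 100, 1000):  # growing "p"
    X = rng.uniform(size=(n_samples, n_features))
    # pairwise Euclidean distances between all samples
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # ignore self-distances
    print(f"p = {n_features:4d}: mean nearest-neighbour distance "
          f"= {dists.min(axis=1).mean():.2f}")
```

With `n_samples` held fixed, the printed nearest-neighbour distances grow with the number of `features`, which is why `machine learning` is said to be `data`-hungry: more `features` demand more `samples` to cover the `space`.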
