Commit

No content update -- just check all still works
mine-cetinkaya-rundel committed May 29, 2024
1 parent 3b02997 commit e6596a1
Showing 6 changed files with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion _freeze/html/factors/execute-results/html.json
@@ -1,7 +1,8 @@
{
"hash": "49601350c49c834abf551b0a9230b77d",
"result": {
"markdown": "---\ntitle: \"Factors with forcats :: Cheatsheet\"\ndescription: \" \"\nimage-alt: \"\"\nexecute:\n eval: true\n output: false\n warning: false\n---\n\n::: {.cell .column-margin}\n<img src=\"images/logo-forcats.png\" height=\"138\" alt=\"Hex logo for forcats - drawing of four black cats lounging in a cardboard box. On one side of the box it says 'for' and on the adjacent side is says 'cats'.\" />\n<br><br><a href=\"../factors.pdf\">\n<p><i class=\"bi bi-file-pdf\"></i> Download PDF</p>\n<img src=\"../pngs/factors.png\" width=\"200\" alt=\"\"/>\n</a>\n<br><br><p>Translations (PDF)</p>\n* <a href=\"../translations/japanese/factors_ja.pdf\"><i class=\"bi bi-file-pdf\"></i>Japanese</a>\n* <a href=\"../translations/spanish/factors_es.pdf\"><i class=\"bi bi-file-pdf\"></i>Spanish</a>\n:::\n\n\nThe **forcats** package provides tools for working with factors, which are R's data structure for categorical data.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(forcats)\n```\n:::\n\n\n\n\n## Factors\n\nR represents categorical data with factors.\nA **factor** is an integer vector with a **levels** attribute that stores a set of mappings between integers and categorical values.\nWhen you view a factor, R displays not the integers but the levels associated with them.\n\nFor example, R will display `c(\"a\", \"c\", \"b\", \"a\")` with levels `c(\"a\", \"b\", \"c\")` but will store `c(1, 3, 2, 1)` where 1 = a, 2 = b, and 3 = c.\n\nR will display:\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n```\n[1] a c b a\nLevels: a b c\n```\n:::\n:::\n\n\nR will store:\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n```\n[1] 1 3 2 1\nattr(,\"levels\")\n[1] \"a\" \"b\" \"c\"\n```\n:::\n:::\n\n\nCreate a factor with `factor()`:\n\n- `factor(x = character(), levels, labels = levels, exclude = NA, ordered = is.ordered(x), nmax = NA)`: Convert a vector to a factor.\n Also `as_factor()`.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f <- factor(c(\"a\", \"c\", \"b\", 
\"a\"), levels = c(\"a\", \"b\", \"c\"))\n ```\n :::\n\n\nReturn its levels with `levels()`:\n\n- `levels(x)`: Return/set the levels of a factor.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n levels(f)\n levels(f) <- c(\"x\", \"y\", \"z\")\n ```\n :::\n\n\nUse `unclass()` to see its structure.\n\n## Inspect Factors\n\n- `fct_count(f, sort = FALSE, prop = FALSE)`: Count the number of values with each level.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_count(f)\n ```\n :::\n\n\n- `fct_match(f, lvls)`: Check for `lvls` in `f`.\n\n\n \n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_match(f, \"a\")\n ```\n :::\n\n\n- `fct_unique(f)`: Return the unique values, removing duplicates.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_unique(f)\n ```\n :::\n\n\n## Combine Factors\n\n- `fct_c(...)`: Combine factors with different levels.\n Also `fct_cross()`.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f1 <- factor(c(\"a\", \"c\"))\n f2 <- factor(c(\"b\", \"a\"))\n fct_c(f1, f2)\n ```\n :::\n\n\n- `fct_unify(fs, levels = lvls_union(fs))`: Standardize levels across a list of factors.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_unify(list(f2, f1))\n ```\n :::\n\n\n## Change the order of levels\n\n- `fct_relevel(.f, ..., after = 0L)`: Manually reorder factor levels.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_relevel(f, c(\"b\", \"c\", \"a\"))\n ```\n :::\n\n\n- `fct_infreq(f, ordered = NA)`: Reorder levels by the frequency in which they appear in the data (highest frequency first).\n Also `fct_inseq()`.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f3 <- factor(c(\"c\", \"c\", \"a\"))\n fct_infreq(f3)\n ```\n :::\n\n\n- `fct_inorder(f, ordered = NA)`: Reorder levels by order in which they appear in the data.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_inorder(f2)\n ```\n :::\n\n\n- `fct_rev(f)`: Reverse level order.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f4 <- factor(c(\"a\",\"b\",\"c\"))\n fct_rev(f4)\n ```\n :::\n\n\n- `fct_shift(f)`: Shift levels to left or right, 
wrapping around end.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_shift(f4)\n ```\n :::\n\n\n- `fct_shuffle(f, n = 1L)`: Randomly permute order of factor levels.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_shuffle(f4)\n ```\n :::\n\n\n- `fct_reorder(.f, .x, .fun = median, ..., .desc = FALSE)`: Reorder levels by their relationship with another variable.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n boxplot(PlantGrowth, weight ~ fct_reorder(group, weight))\n ```\n :::\n\n\n- `fct_reorder2(.f, .x, .y, .fun = last2, ..., .desc = TRUE)`: Reorder levels by their final values when plotted with two other variables.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n ggplot(\n diamonds,\n aes(carat, price, color = fct_reorder2(color, carat, price))\n ) + \n geom_smooth()\n ```\n :::\n\n\n## Change the value of levels\n\n- `fct_recode(.f, ...)`: Manually change levels.\n Also `fct_relabel()` which obeys `purrr::map` syntax to apply a function or expression to each level.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_recode(f, v = \"a\", x = \"b\", z = \"c\")\n fct_relabel(f, ~ paste0(\"x\", .x))\n ```\n :::\n\n\n- `fct_anon(f, prefix = \"\")`: Anonymize levels with random integers.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_anon(f)\n ```\n :::\n\n\n- `fct_collapse(.f, …, other_level = NULL)`: Collapse levels into manually defined groups.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_collapse(f, x = c(\"a\", \"b\"))\n ```\n :::\n\n\n- `fct_lump_min(f, min, w = NULL, other_level = \"Other\")`: Lumps together factors that appear fewer than `min` times.\n Also `fct_lump_n()`, `fct_lump_prop()`, and `fct_lump_lowfreq()`.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_lump_min(f, min = 2)\n ```\n :::\n\n\n- `fct_other(f, keep, drop, other_level = \"Other\")`: Replace levels with \"other.\"\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_other(f, keep = c(\"a\", \"b\"))\n ```\n :::\n\n\n## Add or drop levels\n\n- `fct_drop(f, only)`: Drop unused levels.\n\n\n ::: {.cell}\n 
\n ```{.r .cell-code}\n f5 <- factor(c(\"a\",\"b\"),c(\"a\",\"b\",\"x\"))\n f6 <- fct_drop(f5)\n ```\n :::\n\n\n- `fct_expand(f, ...)`: Add levels to a factor.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_expand(f6, \"x\")\n ```\n :::\n\n\n- `fct_na_value_to_level(f, level = \"(Missing)\")`: Assigns a level to NAs to ensure they appear in plots, etc.\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f <- factor(c(\"a\", \"b\", NA))\n fct_na_value_to_level(f, level = \"(Missing)\")\n ```\n :::\n\n\n------------------------------------------------------------------------\n\nCC BY SA Posit Software, PBC • [info\\@posit.co](mailto:info@posit.co) • [posit.co](https://posit.co)\n\nLearn more at [forcats.tidyverse.org](https://forcats.tidyverse.org).\n\nUpdated: 2023-06.\n\n\n::: {.cell}\n\n```{.r .cell-code}\npackageVersion(\"forcats\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] '1.0.0'\n```\n:::\n:::\n\n\n------------------------------------------------------------------------\n",
"engine": "knitr",
"markdown": "---\ntitle: \"Factors with forcats :: Cheatsheet\"\ndescription: \" \"\nimage-alt: \"\"\nexecute:\n eval: true\n output: false\n warning: false\n---\n\n::: {.cell .column-margin}\n<img src=\"images/logo-forcats.png\" height=\"138\" alt=\"Hex logo for forcats - drawing of four black cats lounging in a cardboard box. On one side of the box it says 'for' and on the adjacent side is says 'cats'.\" />\n<br><br><a href=\"../factors.pdf\">\n<p><i class=\"bi bi-file-pdf\"></i> Download PDF</p>\n<img src=\"../pngs/factors.png\" width=\"200\" alt=\"\"/>\n</a>\n<br><br><p>Translations (PDF)</p>\n* <a href=\"../translations/japanese/factors_ja.pdf\"><i class=\"bi bi-file-pdf\"></i>Japanese</a>\n* <a href=\"../translations/portuguese/factors_pt_br.pdf\"><i class=\"bi bi-file-pdf\"></i>Portuguese</a>\n* <a href=\"../translations/spanish/factors_es.pdf\"><i class=\"bi bi-file-pdf\"></i>Spanish</a>\n:::\n\n\n\nThe **forcats** package provides tools for working with factors, which are R's data structure for categorical data.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(forcats)\n```\n:::\n\n\n\n\n\n## Factors\n\nR represents categorical data with factors.\nA **factor** is an integer vector with a **levels** attribute that stores a set of mappings between integers and categorical values.\nWhen you view a factor, R displays not the integers but the levels associated with them.\n\nFor example, R will display `c(\"a\", \"c\", \"b\", \"a\")` with levels `c(\"a\", \"b\", \"c\")` but will store `c(1, 3, 2, 1)` where 1 = a, 2 = b, and 3 = c.\n\nR will display:\n\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] a c b a\nLevels: a b c\n```\n\n\n:::\n:::\n\n\n\nR will store:\n\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 1 3 2 1\nattr(,\"levels\")\n[1] \"a\" \"b\" \"c\"\n```\n\n\n:::\n:::\n\n\n\nCreate a factor with `factor()`:\n\n- `factor(x = character(), levels, labels = levels, exclude = NA, ordered = is.ordered(x), nmax = 
NA)`: Convert a vector to a factor.\n Also `as_factor()`.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f <- factor(c(\"a\", \"c\", \"b\", \"a\"), levels = c(\"a\", \"b\", \"c\"))\n ```\n :::\n\n\n\nReturn its levels with `levels()`:\n\n- `levels(x)`: Return/set the levels of a factor.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n levels(f)\n levels(f) <- c(\"x\", \"y\", \"z\")\n ```\n :::\n\n\n\nUse `unclass()` to see its structure.\n\n## Inspect Factors\n\n- `fct_count(f, sort = FALSE, prop = FALSE)`: Count the number of values with each level.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_count(f)\n ```\n :::\n\n\n\n- `fct_match(f, lvls)`: Check for `lvls` in `f`.\n\n\n\n \n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_match(f, \"a\")\n ```\n :::\n\n\n\n- `fct_unique(f)`: Return the unique values, removing duplicates.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_unique(f)\n ```\n :::\n\n\n\n## Combine Factors\n\n- `fct_c(...)`: Combine factors with different levels.\n Also `fct_cross()`.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f1 <- factor(c(\"a\", \"c\"))\n f2 <- factor(c(\"b\", \"a\"))\n fct_c(f1, f2)\n ```\n :::\n\n\n\n- `fct_unify(fs, levels = lvls_union(fs))`: Standardize levels across a list of factors.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_unify(list(f2, f1))\n ```\n :::\n\n\n\n## Change the order of levels\n\n- `fct_relevel(.f, ..., after = 0L)`: Manually reorder factor levels.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_relevel(f, c(\"b\", \"c\", \"a\"))\n ```\n :::\n\n\n\n- `fct_infreq(f, ordered = NA)`: Reorder levels by the frequency in which they appear in the data (highest frequency first).\n Also `fct_inseq()`.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f3 <- factor(c(\"c\", \"c\", \"a\"))\n fct_infreq(f3)\n ```\n :::\n\n\n\n- `fct_inorder(f, ordered = NA)`: Reorder levels by order in which they appear in the data.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_inorder(f2)\n ```\n :::\n\n\n\n- `fct_rev(f)`: 
Reverse level order.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f4 <- factor(c(\"a\",\"b\",\"c\"))\n fct_rev(f4)\n ```\n :::\n\n\n\n- `fct_shift(f)`: Shift levels to left or right, wrapping around end.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_shift(f4)\n ```\n :::\n\n\n\n- `fct_shuffle(f, n = 1L)`: Randomly permute order of factor levels.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_shuffle(f4)\n ```\n :::\n\n\n\n- `fct_reorder(.f, .x, .fun = median, ..., .desc = FALSE)`: Reorder levels by their relationship with another variable.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n boxplot(PlantGrowth, weight ~ fct_reorder(group, weight))\n ```\n :::\n\n\n\n- `fct_reorder2(.f, .x, .y, .fun = last2, ..., .desc = TRUE)`: Reorder levels by their final values when plotted with two other variables.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n ggplot(\n diamonds,\n aes(carat, price, color = fct_reorder2(color, carat, price))\n ) + \n geom_smooth()\n ```\n :::\n\n\n\n## Change the value of levels\n\n- `fct_recode(.f, ...)`: Manually change levels.\n Also `fct_relabel()` which obeys `purrr::map` syntax to apply a function or expression to each level.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_recode(f, v = \"a\", x = \"b\", z = \"c\")\n fct_relabel(f, ~ paste0(\"x\", .x))\n ```\n :::\n\n\n\n- `fct_anon(f, prefix = \"\")`: Anonymize levels with random integers.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_anon(f)\n ```\n :::\n\n\n\n- `fct_collapse(.f, …, other_level = NULL)`: Collapse levels into manually defined groups.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_collapse(f, x = c(\"a\", \"b\"))\n ```\n :::\n\n\n\n- `fct_lump_min(f, min, w = NULL, other_level = \"Other\")`: Lumps together factors that appear fewer than `min` times.\n Also `fct_lump_n()`, `fct_lump_prop()`, and `fct_lump_lowfreq()`.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_lump_min(f, min = 2)\n ```\n :::\n\n\n\n- `fct_other(f, keep, drop, other_level = \"Other\")`: 
Replace levels with \"other.\"\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_other(f, keep = c(\"a\", \"b\"))\n ```\n :::\n\n\n\n## Add or drop levels\n\n- `fct_drop(f, only)`: Drop unused levels.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f5 <- factor(c(\"a\",\"b\"),c(\"a\",\"b\",\"x\"))\n f6 <- fct_drop(f5)\n ```\n :::\n\n\n\n- `fct_expand(f, ...)`: Add levels to a factor.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n fct_expand(f6, \"x\")\n ```\n :::\n\n\n\n- `fct_na_value_to_level(f, level = \"(Missing)\")`: Assigns a level to NAs to ensure they appear in plots, etc.\n\n\n\n ::: {.cell}\n \n ```{.r .cell-code}\n f <- factor(c(\"a\", \"b\", NA))\n fct_na_value_to_level(f, level = \"(Missing)\")\n ```\n :::\n\n\n\n------------------------------------------------------------------------\n\nCC BY SA Posit Software, PBC • [info\\@posit.co](mailto:info@posit.co) • [posit.co](https://posit.co)\n\nLearn more at [forcats.tidyverse.org](https://forcats.tidyverse.org).\n\nUpdated: 2024-05.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\npackageVersion(\"forcats\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] '1.0.0'\n```\n\n\n:::\n:::\n\n\n\n------------------------------------------------------------------------\n",
"supporting": [
"factors_files"
],
Binary file modified _freeze/html/factors/figure-html/unnamed-chunk-21-1.png
Binary file modified factors.pdf
Binary file not shown.
Binary file modified html/images/logo-forcats.png
Binary file modified keynotes/factors.key
Binary file not shown.
Binary file modified pngs/factors.png
