Data transformation sheet updates, addresses #302
mine-cetinkaya-rundel committed Jul 23, 2023
1 parent a05038f commit e7d7ed2
Showing 5 changed files with 19 additions and 11 deletions.
8 changes: 5 additions & 3 deletions _freeze/html/data-import/execute-results/html.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions _freeze/html/data-transformation/execute-results/html.json

Large diffs are not rendered by default.

Binary file modified data-transformation.pdf
Binary file not shown.
18 changes: 12 additions & 6 deletions html/data-transformation.qmd
@@ -6,6 +6,8 @@ execute:
  eval: true
  output: false
  warning: false
+editor_options:
+  chunk_output_type: console
---

```{r}
@@ -57,7 +59,7 @@ Summary functions take vectors as input and return one value back (see Summary F
Also `tally()`, `add_count()`, and `add_tally()`.

```{r}
-mtcars |> summarize(cyl)
+mtcars |> count(cyl)
```
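
For context on the two calls in the chunk above, here is a minimal sketch of what `count()` computes, written against the built-in `mtcars` data. It assumes dplyr is installed and R is at least 4.1 (for the native pipe); the object names `by_count` and `by_summary` are illustrative and do not appear in the cheatsheet.

```r
library(dplyr)

# count() tallies rows per unique value of cyl; the tally column is named n
by_count <- mtcars |> count(cyl)

# The same tally spelled out: group by cyl, then summarize each group with n()
by_summary <- mtcars |>
  group_by(cyl) |>
  summarize(n = n())

by_count$n
# 11 7 14 (for cyl = 4, 6, 8)
```

`summarize(cyl)` on its own does not reduce `cyl` to one value per group, so `count(cyl)` is the call that fits a section about counting.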

## Group Cases
@@ -420,22 +422,26 @@ Use a **"Nest Join"** to inner join one table to another into a nested data fram

### Column Matching for Joins

-- Use `by = c("col1", "col2", ...)` to specify one or more common columns to match on.
+- Use `by = join_by(col1, col2, …)` to specify one or more common columns to match on.

```{r}
-left_join(x, y, by = "A")
+left_join(x, y, by = join_by(A))
+left_join(x, y, by = join_by(A, B))
```

-- Use a named vector, `by = c("col1" = "col2")`, to match on columns that have different names in each table.
+```{=html}
+<!-- -->
+```
+- Use a logical statement, `by = join_by(col1 == col2)`, to match on columns that have different names in each table.

```{r}
-left_join(x, y, by = c("C" = "D"))
+left_join(x, y, by = join_by(C == D))
```

- Use `suffix` to specify the suffix to give to unmatched columns that have the same name in both tables.

```{r}
-left_join(x, y, by = c("C" = "D"), suffix = c("1", "2"))
+left_join(x, y, by = join_by(C == D), suffix = c("1", "2"))
```
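
The character-vector and `join_by()` spellings of `by` in this section can be compared side by side with a small sketch. The tables `x` and `y` below are hypothetical stand-ins (the cheatsheet's own `x` and `y` are defined elsewhere in the file and are not reproduced here), and `join_by()` is assumed to be available, which requires dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical tables for illustration only; not the cheatsheet's own data
x <- tibble(A = c("a", "b", "c"), B = c("t", "u", "v"), C = c(1, 2, 3))
y <- tibble(A = c("a", "b", "d"), B = c("t", "u", "w"), D = c(3, 2, 1))

# Same-named key: character vector form and join_by() form
old_same <- left_join(x, y, by = "A")
new_same <- left_join(x, y, by = join_by(A))

# Differently named keys: named vector form and join_by() with ==
old_diff <- left_join(x, y, by = c("C" = "D"))
new_diff <- left_join(x, y, by = join_by(C == D))

# suffix renames non-key columns that share a name in both tables
with_suffix <- left_join(x, y, by = join_by(C == D), suffix = c("1", "2"))

names(with_suffix)
# "A1" "B1" "C" "A2" "B2"
```

In this sketch each old and new pair returns the same table, so for equality joins the change is one of spelling rather than behavior.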

### Set Operations
Binary file modified html/images/logo-dplyr.png
