Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In-chunk options support #82

Open
wants to merge 23 commits into
base: main
Choose a base branch
from
Open

In-chunk options support #82

wants to merge 23 commits into from

Conversation

maelle
Copy link
Member

@maelle maelle commented Dec 6, 2022

Fix #55

For now this writes all in-chunk options back to YAML format so if we go with this the README will need a note about this.

@maelle maelle marked this pull request as ready for review December 8, 2022 14:44
@maelle maelle requested a review from zkamvar December 8, 2022 14:44
@maelle maelle changed the title DRAFT in-chunk options support In-chunk options support Dec 8, 2022
R/to_xml.R Outdated

code <- strsplit(xml2::xml_text(code_block), "\n")[[1]]
inchunk_info <- knitr::partition_chunk(info[["language"]], code)
xml2::xml_text(code_block) <- paste0(inchunk_info$code, "\n")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not forget this newline 🔥

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the things I'm wondering about 🤔 is it worth it for us to strip these comments from the source?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean? I think it makes sense because the user might change the options via the XML so we need to write the in-chunk options again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I was thinking a situation where the user may want to know how many lines of code they have including options. But you are right. It would be more complex to retain the chunk option formatting in the first place.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this go into a feature where we transform code chunks info into a tibble? I mean adding a column with number of code lines.

```{r, message=FALSE}
#| echo: FALSE
#| fig.width: 10
#| fig.cap: This is a long caption.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe that's somewhere where I need to add more YAML-ing as the absence of quote could be problematic?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use the yaml package to leverage the yamling because the yaml we will encounter can often be complex and nested:

see https://fosstodon.org/@zkamvar/109423877203504343:

#| fig.alt: 
#| - scatter plot of 100
#|   random points from 
#|   a normal distribution.
#| - approximately normal 
#|   histogram from the same
#|   100 points.
#| - Q-Q plot against the 
#|   normal distribution showing
#|   a near diagonal line.
x <- rnorm(100)
plot(1:100, x)
hist(x)
qqnorm(x)

and https://quarto.org/docs/computations/julia.html#multiple-outputs:

#| label: fig-plots
#| fig-cap: Multiple Plots
#| fig-subcap:
#|   - "Plot 1"
#|   - "Plot 2"
#| layout-ncol: 2

using Plots
display(plot(sin, x -> sin(2x), 0, 2))
display(plot(x -> sin(4x), y -> sin(5y), 0, 2))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is solved? see the julia chunk below

@@ -33,7 +33,7 @@ plot(pressure)

Non-RMarkdown blocks are also considered

```{julia, info=bash}
```{julia}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I am not sure why it used to be "info=bash"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just found why:

I was trying to test the integrity of the object (e.g. that it was copied before writing instead of transformed in place). I did not consider the fact that I was overwriting a markdown code block within the chunks:

# One block with info
info_attr <- xml2::xml_attr(blocks, "info")
expect_equal(sum(!is.na(info_attr)), 1)
xml2::xml_set_attr(blocks, "language", "julia")
# save back and have a look
to_md(yaml_xml_list, newmd)
expect_snapshot_file(newmd)
# Still one block with info after writing (the process has not clobbered things)
info_attr <- xml2::xml_attr(blocks, "info")
expect_equal(sum(!is.na(info_attr)), 1)

That being said, I think the behaviour of having the 'info' propagate does actually make sense because it's not unheard of to have custom chunk options.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why my code changed this, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zkamvar why was there originally the "info=bash"?

Copy link
Member

@zkamvar zkamvar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you thank you thank you for tackling this! You brought up a good point about what we want the output to be. This PR produces yaml, but we may be able to choose one method and then use knitr::convert_chunk_header() to mix and match after that.

What do you think?

R/to_xml.R Outdated

code <- strsplit(xml2::xml_text(code_block), "\n")[[1]]
inchunk_info <- knitr::partition_chunk(info[["language"]], code)
xml2::xml_text(code_block) <- paste0(inchunk_info$code, "\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the things I'm wondering about 🤔 is it worth it for us to strip these comments from the source?

inst/extdata/chunk-options.Rmd Show resolved Hide resolved
R/to_xml.R Outdated
Comment on lines 99 to 100
names(inchunk_options) <- paste0(names(inchunk_options), "-inchunk")
info <- c(info, inchunk_options)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this part, we need to include this process of quoting the characters so that we can properly parse them when converting back to YAML or chunk options:

tinkr/R/utils.R

Lines 73 to 84 in 60d3573

# Step 2: find the parameters that are characters because we need to add
# quotes around them (as all parameters are coerced as characters)
are_characters <- purrr::map_lgl(result, is.character)
# Step 3: flatten all params into a character vector
result <- unlist(result)
# Step 4: add quotes around the params that are characters
not_forbidden <- !names(result) %in% c("language", "name")
needs_quoting <- are_characters & not_forbidden
result[needs_quoting] <- shQuote(result[needs_quoting], type = "cmd")

It would be worth splitting this part off into a function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course this will not solve our problems yet because there is also the problem of nested options.

We deal with this in our default chunk options by taking advantage of R syntax. and evaluating the chunk option into an alist of atomic vectors (character, integer, logical, language) that are all happily length 1. From there, we can flatten the list into characters that XML options understand and then quote the characters to indicate that the were characters in their previous lives.

The difficulty is that YAML will evaluate the R expressions for us. Let's take this parsed YAML block from the julia example:

l <- list(
  `label-inchunk` = c(label = "fig-plots"), `fig.cap-inchunk` = "Multiple Plots",
  `fig.subcap-inchunk` = c("Plot 1", "Plot 2"), `fig.alt-inchunk` = c(
    "A blue arc: labelled as y1 that starts at 0,0 and ends at 0.9,-0.75",
    "A blue curve: labelled as y1 twisted like a pretzel"
  ), `layout-ncol-inchunk` = 2L
)
l
#> $`label-inchunk`
#>       label 
#> "fig-plots" 
#> 
#> $`fig.cap-inchunk`
#> [1] "Multiple Plots"
#> 
#> $`fig.subcap-inchunk`
#> [1] "Plot 1" "Plot 2"
#> 
#> $`fig.alt-inchunk`
#> [1] "A blue arc: labelled as y1 that starts at 0,0 and ends at 0.9,-0.75"
#> [2] "A blue curve: labelled as y1 twisted like a pretzel"                  
#> 
#> $`layout-ncol-inchunk`
#> [1] 2

To overcome this, we can use capture.output along with dput to construct an alist and then parse that into a list.

eval(parse(text = paste0("a", paste(capture.output(dput(l)), collapse = ""))))
#> $`label-inchunk`
#> c(label = "fig-plots")
#> 
#> $`fig.cap-inchunk`
#> [1] "Multiple Plots"
#> 
#> $`fig.subcap-inchunk`
#> c("Plot 1", "Plot 2")
#> 
#> $`fig.alt-inchunk`
#> c("A blue arc: labelled as y1 that starts at 0,0 and ends at 0.9,-0.75", 
#>     "A blue curve: labelled as y1 twisted like a pretzel")
#> 
#> $`layout-ncol-inchunk`
#> [1] 2

Created on 2022-12-09 with reprex v2.0.2

With these options, we can then use purrr::map(eval) and then yaml::as.yaml() to produce the yaml string for writing back into the chunk

R/to_xml.R Outdated Show resolved Hide resolved
@maelle
Copy link
Member Author

maelle commented Dec 13, 2022

or, in the XML, we store the inchunk options as... YAML string? 🤔

It might help us with the parsing.
Now it might make users' life a bit more difficult if we don't provide some functionality to go with this?

@maelle
Copy link
Member Author

maelle commented Dec 15, 2022

It'd be nice to have a function operating on code chunks that would present chunk options as a tibble, with for each option a row with

  • name
  • value (list)
  • whether it's defined inside or outside the chunk.

Then there'd be a function to carry edits to this tibble to the code chunk XML.

@zkamvar how would that sound? what would this feature be, code chunks as objects?

inchunk_options <- inchunk_info$options

inchunk_options <- if (!is.null(inchunk_options) > 0) {
yaml::as.yaml(inchunk_options)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now it's a bit annoying that there's such different treatment for in- and out- chunk options, it feels a bit wrong maybe.

Copy link
Member

@zkamvar zkamvar Dec 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will help parse expressions... maybe

Suggested change
yaml::as.yaml(inchunk_options)
yaml::as.yaml(inchunk_options,
handlers = list(expression = function(x) paste("!expr: ", paste(as.character(x), sep = "; "))))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zkamvar could you please explain this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to address expressions that originate from YAML, but I'm now realising that I'm actually having a hard time getting a good example, so I have to rethink this.

(x <- list(A = 1, B = "two", C = str2expression("C <- 1 + 2\nC")))
#> $A
#> [1] 1
#> 
#> $B
#> [1] "two"
#> 
#> $C
#> expression(C <- 1 + 2, C)
eval(x$C) # C evaluates to 3
#> [1] 3
try(yaml::as.yaml(x)) # conversion fails because C is an expression
#> Error in yaml::as.yaml(x) : Unknown emitter error
expr_handler <- function(x) paste("!expr:", paste(as.character(x), collapse = "; "))
yaml::as.yaml(x, handlers = list(expression = expr_handler))
#> [1] "A: 1.0\nB: two\nC: '!expr: C <- 1 + 2; C'\n"

Created on 2023-01-12 with reprex v2.0.2

@maelle
Copy link
Member Author

maelle commented Dec 15, 2022

one might first need a method to get all code chunks? 🤔

@maelle
Copy link
Member Author

maelle commented Dec 15, 2022

In #28 you had mentioned the extraction of features such as code chunks @zkamvar

R/to_xml.R Outdated Show resolved Hide resolved
@zkamvar
Copy link
Member

zkamvar commented Dec 17, 2022

one might first need a method to get all code chunks? 🤔

This is a good point. We've been able to get away with parsing the code chunks because they've always been generally linear and not very complex. With the new nested complexity that YAML introduces, It would definitely make sense for us to have options to store and access the chunk options consistently.

I'm wondering if we should just be opinionated on how code chunks are treated on output (e.g. users can choose to use curly, hashpipe-list, or hashpipe-yaml, but no mixing).

maelle and others added 2 commits January 12, 2023 12:31
@maelle
Copy link
Member Author

maelle commented Jan 13, 2023

@zkamvar what would be a good target for this PR? Happy to do more work on it / follow-up PRs 😁

@zkamvar
Copy link
Member

zkamvar commented Mar 17, 2023

@zkamvar what would be a good target for this PR? Happy to do more work on it / follow-up PRs grin

Ugh, I'm so sorry. We were doing this when things got really hectic for me and I had to drop it. I'm going to come back to this probably after May (that's when we transition all of our lessons)

@maelle
Copy link
Member Author

maelle commented Mar 21, 2023

No worry and good luck with the lessons transition! 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for chunk options via knitr 1.35
2 participants