Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add recursive function to find currency dollars #123

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
69 changes: 66 additions & 3 deletions R/asis-nodes.R
Original file line number Diff line number Diff line change
Expand Up @@ -129,7 +129,7 @@ protect_inline_math <- function(body, ns) {
}

# protect math that is broken across lines or markdown elements
if (length(bmath)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe use an explaining variable although that is already clear

if (length(bmath) > 0) {
if (any(broke$ambiguous)) {
# ambiguous math may be due to inline r code that produces an answer:
# $R^2 = `r runif(1)`$
Expand All @@ -140,15 +140,26 @@ protect_inline_math <- function(body, ns) {
)
headless <- headless | has_inline_code
}
# If the lengths of the beginning and ending tags don't match, we throw
# an error.
le <- length(bmath[endless])
lh <- length(bmath[headless])
# 2024-10-10: if the number of headless OR endless tags is zero, then we
# are dealing with currency. See issue #121 and #124
if (lh == 0 || le == 0) {
return(copy_xml(body))
}
# 2024-10-15: if the number of headless tags is _less_ than the number of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# 2024-10-15: if the number of headless tags is _less_ than the number of
# 2024-10-15: if the number of startless tags is _less_ than the number of

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the variable is actually headless, but I wonder if it might be confusing

# endless tags, then we _might_ be dealing with currency and should try to
# trim them out.
if (le > lh) {
trm <- remove_money(bmath, endless, headless)
bmath <- trm$bmath
endless <- trm$endless
headless <- trm$headless
lh <- trm$lh
le <- trm$le
}
# If the lengths of the beginning and ending tags don't match, we throw
# an error.
if (le != lh) {
unbalanced_math_error(bmath, endless, headless, le, lh)
}
Expand All @@ -163,6 +174,58 @@ protect_inline_math <- function(body, ns) {
copy_xml(body)
}

# remove the non-math from the math.
remove_money <- function(bmath, endless, headless) {
actual_math <- toss_broken_teeth(endless, headless)
bmath <- bmath[actual_math]
endless <- endless[actual_math]
headless <- headless[actual_math]
lh <- length(bmath[headless])
le <- length(bmath[endless])
return(list(
bmath = bmath,
endless = endless,
headless = headless,
lh = lh,
le = le
))
}

# Imagine a zipper. Every tooth fits in a groove directly opposite. If there
# is a mistake and there is an extra tooth on one side, the zipper gets stuck.
# the solution is to remove that tooth to realign the zipper.
#
# This function takes two logical vectors assuming the following:
#
# 1. endless starts with TRUE
# 2. headless ends with TRUE
# 3. endless and headless are the same length
# @param endless [logical] vector indicating broken math elements that
# have no ending pair
# @param headless [logical] vector of the same length as `endless` indicating
# broken math elements that have no opening pair.
toss_broken_teeth <- function(endless, headless) {
if (length(endless) < 2) {
# EXIT CASE -----------------------------------------------
# less than 2 either returns FALSE or logical(0)
return(logical(length(endless)) == 2)
} else if (endless[1] == headless[2]) {
# CASE 1: MATCHING PAIRS ----------------------------------
# When the pairs match, these are likely broken math
# and we increment by two to move to the next pair
result <- c(TRUE, TRUE)
idx <- -(1:2)
} else {
# CASE 2: MISMATCHED PAIRS --------------------------------
# When the pairs do not match, it's not likely broken math,
# so we toss it and move to the next element.
result <- FALSE
idx <- -1
}
# return the result and iterate over the rest of the vector
return(c(result, toss_broken_teeth(endless[idx], headless[idx])))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use devtag https://github.com/moodymudskipper/devtag

and have examples


# Partial inline math are math elements that are not entirely embedded in a
# single `<text>` element. There are two reasons for this:
#
Expand Down
2 changes: 1 addition & 1 deletion man/isolate_nodes.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/provision_isolation.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

40 changes: 13 additions & 27 deletions man/rmd-fragments/format-latex.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -19,31 +19,17 @@ Note, however, that there are a few caveats for this:
1. The dollar notation for inline math must be adjacent to the text. E.G.
`$\alpha$` is valid, but `$ \alpha$` and `$\alpha $` are not valid.
2. We do not currently have support for bracket notation
3. If you use a postfix dollar sign in your prose (e.g. BASIC commands or a Burroughs-Wheeler Transformation demonstration),
you must be sure to either use punctuation after the trailing dollar sign OR format the text as code.
(i.e. `` `INKEY$` `` is good, but `INKEY$` by itself is not good and will be interpreted as LaTeX code, throwing an error:
```{r, basic, error = TRUE}
path <- system.file("extdata", "basic-math.md", package = "tinkr")
math <- tinkr::yarn$new(path)
math$head(15) # malformed
math$protect_math() #error
```
4. **use of `$` as currency will still work**, but there is a caveat that mixing
this inline math broken across lines will cause problems:
```{r, split, error = TRUE}
# this will be mal-formed
bad <- "It's 5:45 and I've got $5.45 in my pocket.\nThe __area of a circle__ is $A =\n \\pi r^2$, where\n$\\pi$ is irrational when it hasn't had its coffee."
fails <- tinkr::yarn$new(textConnection(bad))
fails$show()
fails$
protect_math()$
show()
# This works
good <- "It's 5:45 and I've got $5.45 in my pocket.\nThe __area of a circle__ is $A = \\pi r^2$, where\n$\\pi$ is irrational when it hasn't had its coffee."
works <- tinkr::yarn$new(textConnection(good))
works$show()
works$
protect_math()$
show()
3. Use of `$` as currency will still work. There are a few caveats:
1. The search for math is greedy, so currency may be extracted along with
the math unless a markdown element separates them:
```{r}
# no separator, math is detected at the beginning of the line.
ex <- textConnection("$5 for a $\\pi$")
tinkr::yarn$new(ex)$protect_math()$get_protected()
# With an emph separator
ex <- textConnection("$5 for _any_ $\\pi$")
tinkr::yarn$new(ex)$protect_math()$get_protected()
```
2. Postfix dollar signs (e.g. using BASIC code demostrations) should be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
2. Postfix dollar signs (e.g. using BASIC code demostrations) should be
2. Postfix dollar signs (e.g. using BASIC code demonstrations) should be

protected with backtics, or it will cause an error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
protected with backtics, or it will cause an error.
protected with backticks, or it will cause an error.


```
68 changes: 62 additions & 6 deletions tests/testthat/_snaps/asis-nodes.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,72 @@
# mal-formed inline math throws an informative error
# (#121) money dollars mixed with broken math don't break

Code
show_user(dollar_math$show(), force = TRUE)
Output
---
title: "Dollar as currency mixed with math"
---

## currency and ambiguous math

I've got $5.45 and $A = `r runif(1)`$

## currency and math on the same line

Math that starts on the dollar: I've got $5.45 and $A = \pi r^2$.

Math that starts at the equation: I've got $5.45 *and* $A = \pi r^2$.

## broken math trailing currency

The area of a circle is $A =
\pi r^2$.

## Currency preceding inline broken math (due to emph)

It's 5:45 and I've got $5.45.

Below is an example from <https://github.com/ropensci/tinkr/issues/38>
$\frac{\sum _{i=N-n}^{N}Q_i} {\sum_{j=N-n}^{N}{(\frac{C_j+C_{j-1}}2)}}$

The area of circle is $A _i = \pi r_ i^2$.

$P =
NP$

## Currency with inline broken math (due to emph)

The following line is considered to *all* be math:
$199 and $\frac{\sum _{i=N-n}^{N}Q_i} {\sum_{j=N-n}^{N}{(\frac{C_j+C_{j-1}}2)}}$.

The following line has math starting at the correct place:
$199 *and* $\frac{\sum _{i=N-n}^{N}Q_i} {\sum_{j=N-n}^{N}{(\frac{C_j+C_{j-1}}2)}}$.

The following line correctly protects math because of the link
There is a $1 bet on [wikipedia](https://wikipedia.org) that the area of circle is $A _i = \pi r_ i^2$.

$99
$9

$P \ne
NP$

## Trailing currency does not affect the computation.

Dang a whopper and a 40 cost $10 now, but I've only got $5.45.


# postfix dollars throws an informative error

Inline math delimiters are not balanced.

HINT: If you are writing BASIC code, make sure you wrap variable
names and code in backtics like so: `INKEY$`.

Below are the pairs that were found:
start...end
-----...---
Give you $2 to ... me what INKEY$ means.
Give you $2 to ... 2$ verbally.
We write $2 but ...
start...end
-----...---
...INKEY$

# block math can be protected

Expand Down
53 changes: 53 additions & 0 deletions tests/testthat/examples/math-money-mix.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
---
title: "Dollar as currency mixed with math"
---


## currency and ambiguous math

I've got $5.45 and $A = `r runif(1)`$

## currency and math on the same line

Math that starts on the dollar: I've got $5.45 and $A = \pi r^2$.

Math that starts at the equation: I've got $5.45 _and_ $A = \pi r^2$.

## broken math trailing currency

The area of a circle is $A =
\pi r^2$.

## Currency preceding inline broken math (due to emph)

It's 5:45 and I've got $5.45.

Below is an example from https://github.com/ropensci/tinkr/issues/38
$\frac{\sum _{i=N-n}^{N}Q_i} {\sum_{j=N-n}^{N}{(\frac{C_j+C_{j-1}}2)}}$

The area of circle is $A _i = \pi r_ i^2$.

$P =
NP$

## Currency with inline broken math (due to emph)


The following line is considered to _all_ be math:
$199 and $\frac{\sum _{i=N-n}^{N}Q_i} {\sum_{j=N-n}^{N}{(\frac{C_j+C_{j-1}}2)}}$.

The following line has math starting at the correct place:
$199 _and_ $\frac{\sum _{i=N-n}^{N}Q_i} {\sum_{j=N-n}^{N}{(\frac{C_j+C_{j-1}}2)}}$.

The following line correctly protects math because of the link
There is a $1 bet on [wikipedia](https://wikipedia.org) that the area of circle is $A _i = \pi r_ i^2$.

$99
$9

$P \ne
NP$

## Trailing currency does not affect the computation.

Dang a whopper and a 40 cost $10 now, but I've only got $5.45.
15 changes: 15 additions & 0 deletions tests/testthat/test-asis-nodes.R
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,21 @@ test_that("(#121) single dollar lines dont throw errors", {
expect_equal(actual, expected)
})


test_that("(#121) money dollars mixed with broken math don't break", {
okay <- test_path("examples", "math-money-mix.md")
dollar_math <- yarn$new(okay, sourcepos = TRUE)
expect_no_error(dollar_math$protect_math())
expect_snapshot(show_user(dollar_math$show(), force = TRUE))
})

test_that("postfix dollars throws an informative error", {
expected <- "INKEY$\n"
math <- commonmark::markdown_xml(expected)
txt <- xml2::read_xml(math)
expect_snapshot_error(protxt <- protect_inline_math(txt, md_ns()))
})

test_that("(#124) french dollar lines dont throw errors", {
expected <- "I've only got 2$ in the bank. Feels bad, man. Feels bad to not have 2 $\n"
math <- commonmark::markdown_xml(expected)
Expand Down
Loading