Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FR: Keep data.table chaining #720

Closed
MichaelChirico opened this issue Jan 29, 2021 · 2 comments
Closed

FR: Keep data.table chaining #720

MichaelChirico opened this issue Jan 29, 2021 · 2 comments

Comments

@MichaelChirico
Copy link
Contributor

It is pretty canonical in data.table to write "query chains" like

DT[ , sum(x), by = .(a, b)
   ][ , median(V1), by = a]

But styler doesn't like these:

writeLines("
DT[ , sum(x), by = .(a, b)
   ][ , median(V1), by = a]
", tmp <- tempfile(fileext = ".R"))
styler::style_file(tmp)

writeLines(readLines(tmp))
# DT[, sum(x), by = .(a, b)][, median(V1), by = a]

Would a rule to allow such chaining be in scope for styler?

@lorenzwalthert
Copy link
Collaborator

lorenzwalthert commented Jan 29, 2021

I indeed consider this to be out of scope for styler, as we currently only support the tidyverse style guide and I don't see this formatting compatible with it. However, you could do something like this to create your own style guide.

data.table_style <- function(...) {
  sg <-tidyverse_style(...)
  new_set_line_break_before_closing_call <- function(pd, except_token_before = "COMMENT") {
    if (!is_function_call(pd) && !is_subset_expr(pd)) {
      return(pd)
    }
    npd <- nrow(pd)
    
    #<- this line changed: if token after closing is [ and there is any line break
    # assume it is multiline. in tidyverse_style(), we move the closing brace up
    is_multi_line <- any(pd$lag_newlines[seq2(3L, npd)] > 0) && last(pd$token_after) == "'['" 
    if (!is_multi_line) {
      exception <- which(pd$token_before %in% except_token_before)
      pd$lag_newlines[setdiff(npd, exception)] <- 0L
      return(pd)
    }
    pd$lag_newlines[npd] <- 1L
    pd
  }
  
  sg$line_break$set_line_break_before_closing_call <- new_set_line_break_before_closing_call
  # adapt this for the cache
  sg$style_guide_name <- "styler::tidyverse_style@https://github.com/r-lib" # your url, exported function
  sg$style_guide_version <- '1.3.2.9000' # use your pkg version here
  sg
}

style_text('DT[ , sum(x), by = .(a, b)
   ][ , median(V1), by = a]', style = data.table_style)
#> DT[, sum(x), by = .(a, b)
#> ][, median(V1), by = a] 

Maybe you want to export style_text() and friends too and make them default to data.table_style so you can omit the argument above.

The people from mlr(3) also have their own style guide documented here: https://github.com/mlr-org/mlr3/wiki/Style-Guide. Although I'd probably prefer now the name of the fork package not to be styler (so you can have two installations), but rather something like styler.mlr. I can assist you if you want to do something similar for data.table, but I think it's better to not clone all of styler for that, just make a very slick package with the code above and maybe add some unit tests (best case: all styler unit tests and adapt to the rules you change).

In addition, there is some (outdated) discussion of other style guides elsewhere, in particular in this comment: #340 (comment)

@wagathu
Copy link

wagathu commented Jan 19, 2024

When running the above code, I get:

Error in is_subset_expr(pd) : could not find function "is_subset_expr"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants