Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(render): Rerender also when child files changes #158

Closed
wants to merge 2 commits into from

Conversation

mutlusun
Copy link
Contributor

@mutlusun mutlusun commented Nov 25, 2023

Prework

Related GitHub issues and pull requests

  • Ref: None so far.

Summary

Please explain the purpose and scope of your contribution.

Currently, a knitr/rmarkdown report is only rerun if a) dependent targets change, b) output files in the e.g. main_files directory (created by knitr if the output file is called main), or c) the main (source) file of the report changes. However, if the text of a child document changes, a rerun is also desired.

Thus, the current commit adds the child files of a report to the return value and thus, checking these files for changes.

Remarks

  • Sorry for not opening an issue earlier. I hope we can discuss the problem here as well.
  • The added code does not look very pretty. It could be simplified using the stringr package but I didn't want to add an extra dependency for such a small change. I also used the |> and \(x) operators here. I don't mind to change these if you don't like them. I'm also happy about any changes and improvements you propose.
  • With the current approach, the report is rendered twice by knitr_lines what is probably not desired. However, I think the current changes were the simplest one even though an overhead is added.
  • With the current approach, source(file.R) commands are not detected.
  • In tar_render_paths the return files are sorted, output and source as well. However output and source should be of length 1. Thus, is it safe to omit the sort function?
  • I hope you had a good vacation :)

Thanks for your feedback!

Currently, a knitr/rmarkdown report is only rerun if a) dependent
targets change, b) output files in the e.g. main_files directory
(created by knitr), or c) the main file of the report changes. However,
if the text of a child document changes, a rerun is also desired.

Thus, the current commit adds the child files of a report to the return
value and thus, checking these files for changes.
@wlandau
Copy link
Member

wlandau commented Nov 29, 2023

I am open to including the sources of child documents, but as you have seen, it can be hard to detect them. I think it would be better to knitr::knit(tangle = TRUE) with a knit_hooks$set(chunk = ...) hook which can look at each set of chunk options and record the result of each child option. The function could look something like knitr_lines(), but with a hook and with a different intent.

Also, it would be great to try to avoid modifying the tests more than needed. I think we only need one child file to test this.

We would also need to make the same changes for tar_quarto() and tar_quarto_rep().

@mutlusun
Copy link
Contributor Author

mutlusun commented Dec 4, 2023

Thank you for your feedback! I'm sorry, it will take me some days to work again on this PR.

@wlandau
Copy link
Member

wlandau commented Dec 4, 2023

No worries.

In fact, I hope you didn't get too far ahead because I no longer think it is feasible for tarchetypes to become aware of literate programming child documents. There are two main reasons for this. One is that child documents are not always specified through code chunk options, they can also be run using knitr::knit_child(), and it becomes far less feasible to detect those sorts of calls (e.g. if knitr::knit_child() is nested within a user-defined function somewhere). And for Quarto projects, tarchetypes relies on quarto::quarto_inspect() to get all its information about input/output files. Literate programming file management is so messy and variable that it is not feasible to add on to what quarto::quarto_inspect() already provides. I think it is more feasible to claim in the docs that literate programming functionality does not explicitly support child documents. That way it is easier for users to have accurate expectations.

@wlandau wlandau closed this Dec 4, 2023
@mutlusun
Copy link
Contributor Author

mutlusun commented Dec 5, 2023

I agree with you that is hard for tarchetypes to detect child documents. I haven't finished yet, because I would need to handle the child chunk option, the knitr::knit_child() function call, and probably also any source() function calls. Doing all three things in a clean solutions needs some work. As you pointed out, nested inclusions in functions or dynamically loading child documents would not be detected.

However, I think that the detection of these three inclusions (child chunk option, knit_child(), source()) would handle almost all regular uses of child documents. Thus, I think it's still worthwile to try to detect the inclusions of child documents on this way (and warn the user that not all edge cases are covered). If you are still interested, I would be happy to implement this. Nonetheless, I can also understand if you don't want to include a solution that doesn't handle all cases.

As an alternative, it could be an option to manually add child documents as a dependency to the tar_render targets. Could this also be an option? Would you be interested in such an implementation?

In my analysis, I have always a main rmarkdown document that includes several child documents. Currently, it is very annoying to invalidate the tar_render target and rerun this target again (even though I automated this with a justfile/makefile).

Thanks again for your help and considering these options.

@wlandau
Copy link
Member

wlandau commented Dec 5, 2023

I think it would be best if the rmarkdown and quarto packages could transparently list all of the input and output files used. These packages are in the best position to accurately disclose their dependencies and outputs. quarto::quarto_inspect() tries to do this to some extent, but IIRC there are edge cases like this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants