-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The SAW Documentation Revolution (Part 1) #2202
base: master
Are you sure you want to change the base?
Conversation
A few of the Markdown files that sat in doc/ are not intended for public consumption (i.e. deployment to HTML/PDF versions of documentation). These include: - doc/extcore.md -> doc/internal/extcore.md This is documentation for the SAWCore dump format, which is (hopefully) not something typical users are having to mess with. - doc/limitations.md -> doc/internal/limitations.md A five-year-old attempt at documenting the limitations of SAW's use for verification. It should be updated or scrapped. - doc/RELEASING.md -> doc/internal/releasing.md A (partially?) outdated documentation of SAW's release process. This needs some updating as well.
None of the files in doc/manual/code were referenced or used anywhere in doc/manual, and at least one of these files made reference to something completely nonexistent (see #2185).
One was ill-formed (links based on headers are written with a '#' followed by a normalization of the header text), and another was completely non-existent / referred to a section that had been renamed at some point in the past.
This document is (in my opinion) fundamentally ill-formed, as there are multiple level-1 headings. Before fixing that, however, we might as well use the more-correct incorrect thing.
I'm not entirely sure where the syntax `{.c}`/`{.rs}`/etc. came from, but it was not recognized by the Sphinx Markdown engine while testing locally, so I've used the conventional syntax everywhere instead. This change doesn't break the existing PDF builds, either, so perhaps it has to do with how the old website pages were built?
This temporarily makes this Markdown violate the "thou shalt only have one <h1>" rule, however: Logically, we want this part of the SAW documentation to be split into chapters, similar to the LLVM/Java tutorial. Before, this part of the documentation rendered to PDF (via Sphinx) as one giant chapter.
Specifically, to 'Introduction'. Consistency is good!
Not in that order :) In an effort to start standardizing the Markdown (e.g. removing multiple top-level headers), this commit primarily restructures the LLVM/Java tutorial into multiple source files. This additionally makes individual parts of the tutorial easier to maintain. A similar change will be implemented for the manual and Rust tutorial for the same reason. This restructuring naturally broke the original PDF generation; fortunately, the eventual introduction of Sphinx will give us both HTML and PDF generation for free, and this generation strategy is no longer necessary to maintain. Note that this commit does _not_ perform any changes to the content of the tutorial or indeed any other structural property of the Markdown. Additional changes (e.g. standardizing the Markdown syntax used) will be the subject of future commits.
This is part of a larger effort to standardize the Markdown sources for SAW's various pieces of documentation.
Namely: Fenced with ``` ... ```, no extra space between the fence marker and the language specifier, and _always_ labeled with a language specifier.
Because we are using Sphinx/MyST, there is no need to use the docode.hs script to insert code fragments, as we have the literalinclude directive. tutorial/using-smt-lib-solvers: Repair broken code references. tutorial/compositional-proofs.md: Repair broken code references. tutorial/interactive-interpreter: Repair broken code references. tutorial/other-examples: Repair broken code references. tutorial: Remove docode.hs.
And delete the PNG version, as the PDF will suffice to add the Galois logo to PDF documentation.
A much more descriptive name :)
This prevents rendering issues, and is more consistent with the directive mechanism. It also allows for the use of 'admonitions', which may be helpful in improving the documentation.
See 0df6156 (this is analogous).
- Mismatched inline code elements - Code blocks without whitespace in between - Incorrect link (introduced by splitting up rust-tutorial.md)
- ::: fences - {literalinclude} with :lines: for code/ sources (docode.hs defunct) - {code-block} <lang> for everything else
- Invalid whitespace around headings / code blocks - Bare URLs - Incorrect emphasis notation (** instead of __) - Invalid links to other pages/sections
- Code blocks (::: fences) - __ for emphasis instead of ** - '-' instead of '*' for unordered lists - Unordered -> Ordered (and vice versa) where appropriate - Fix broken link(s)
For consistency with the actual title of this resource.
Analogous to dba1454 - more consistent with the title of the book whose sources are contained within.
These documents may not be intended for most users, but they ought to be available to read in a nicer format than Markdown files in the repository. Since we control which documents are rendered to PDF (i.e. when we `make latexpdf`) explicitly, these developer documents will only be rendered as part of the HTML (that will eventually be our GitHub Pages site for saw-script). This change reflects that these are documents related to the development of SAW (rather than generically 'internal'). Additional restructuring will follow in additional commits.
This maintains the original internal linking of the affected documents while using explicit _reference labels_ rather than links based on document names / headers. In addition to removing dependence on references generated from headers, which are subject to change during content editing (thereby breaking links), reference labels allow all documents (eventually managed by Sphinx) to straightforwardly cross-reference one another, and are one of the documentation engine's most compelling features. There is, however, a catch: For Sphinx to correctly render cross-referenced documents, reference labels must be unique _to the entirety of the documentation_. In other words, the SAW User Manual and Rust Verification with SAW tutorial could not both define a reference label `my-reference`. There is no local scoping mechanism for reference labels, so there is necessarily a tradeoff between creating a highly-connected, wiki-like documentation (where items are given reference labels more often than not), and having a massive namespace of reference labels to manage. This is something to consider for the future, but certainly not our biggest concern regarding documentation at the moment.
And normalize the Markdown formatting in the process. This is analogous to earlier commits relating to the manual and tutorials (but simpler, since there was no existing infrastructure around this document).
Sphinx will take care of it!
And by 'fix', I mean: Add a document containing a verbatim rendering of the 'real' `CHANGES.md` at the SAWScript repository root, and link to that. An alternative that was considered that would have resulted in the 'real' `CHANGES.md` being rendered directly was the `{include}` directive. Not only is it the case that "the include directive represents a potential security hole"[^1], Sphinx will attempt (and fail) to resolve the links within, which will fail fantastically since most of them are outside of the doc/ root. [^1]: https://docutils.sourceforge.io/docs/ref/rst/directives.html#include
It will be a convenient place to link to from the index of the SAWCore external format description in the developer materials.
To clarify that `make_docs.sh` handles setting up the Python environment, this commit adds a link to the section of the README about the `setup_env.sh` script. There is also a reiteration of the fact that a Python installation >=3.9 is required.
Since we're still working on things, there's no good reason to upstream this to `pygments`, and thusly no good reason to mention it in the README. Additionally, using Markdown's absurdity, I reworded the description of `saw-lexer` to properly refer to how code blocks need to be labeled for the SAWScript lexer to be applied for highlighting (the absurdity being the need for _four_ backticks on either side, in order to render three backticks within in the code block).
So that it's clearer that the exclusion of the index document means exclusion of its title, and that titles for PDFs are configured when adding a new entry to the list of PDFs to generate.
First, my comment was inaccurate -- there is a big table of builtins in the interpreter (thanks @sauclovian-g for pointing this out). As he rightfully noted, one day we will like to have these / their docstrings generate an appendix for the user manual. Second, other than looking a bit fancier in the render, there's not too much reason to give special treatment to the builtin functions and how they render; so, since we already plan to generate documentation of the builtins eventually anyway, we can let them be lexed as regular identifiers (and in the future add references to the appendix).
We're going to hide developer docs from Sphinx anyway, but this essentially puts things back to how they were before this PR (with the addition of an extra '..' to get to the right place. Essentially, this change reflects our desire for doc/developer/releasing.md to be readable on GH, and properly link to the change log at the repository root.
As originally planned, this commit hides the development/ directory from the Sphinx build, and removes all links to it from the document tree. See #2202 (comment) for relevant conversation on the matter of publishing developer documentation. TL;DR: For now, there is nothing we want to publish (extcore is not something intended for broader use than loading/storing SAW Terms, the release process is only relevant to developers, and the limitations document is old / needs review before being added to user-facing documentation, at which time it would likely be best added to the manual instead of as a separate resource).
And, in the doc/README, move the description of the directory to a more appropriate section of the file.
And, notably, we don't need bash, so use /bin/sh instead. This allows the script to be run from anywhere, as long as the requirements.txt file stays adjacent to it in the filesystem. This is documented in doc/README.
And move setup_env.sh, make_docs.sh, and requirements.txt, which cleans up the doc/ root significantly. Includes updated documentation of the file structure in doc/README.
The use of `make` is ubiquitous to Sphinx builds, and it was rather inappropriate to introduce scripts that invoked it to build the documentation.
This restores one functionality of the old make_docs.sh, namely the packaging of code example directories.
Thank you to @sauclovian-g for suggesting this rearrangement - not only is it "more correct", it resulted in smaller, more reusable scripts and a more intuitive build process. That's great! First, we explicitly list SUPPORTED_DOCS (that is, the targets Sphinx's old Makefile understood that we wish to continue supporting). This allows us to (1) always show up-to-date help text documenting what someone can run to build the docs, (2) easily add additional format support, and (3) replace the old catch-all target with a more helpful suggestion given these changes. We replace Sphinx's help with a series of messages listing available targets, but still leave Sphinx's original help available (this is mostly for developers looking to add support for additional Sphinx targets) via target sphinx-help. Targets setup-env and package-code are methods of executing the scripts in scripts/. These are also the dependencies of the document-rendering targets, and anything related to running Sphinx. mostlyclean and clean respectively clean up the documentation build (including package code examples) and all of that plus the Python environment. install-pdf copies rendered PDFs from the Sphinx output directory to doc/pdfs, building them if necessary.
To eventually validate that the 'correct' PDFs have been checked in, this commit adds a mechanism to control the reproduction of SAW documentation PDFs: The SOURCE_DATE_EPOCH environment variable (see https://reproducible-builds.org/docs/source-date-epoch/). We set this in a .env file, and use if in the Makefile on documentation builds so the last edit time is consistent. For now, this value is set to (GMT) Tuesday, April 1, 2025 12:00:00 AM, but for releases should be set to an epoch timestamp corresponding to the date of release (for example). Release instructions for the repository have been updated to reflect the need to update this variable at release time (and correspondingly, the documentation PDFs themselves).
In particular: - Explicitly note that `make` is required to build documentation - Replace the description of `make_docs.sh` with a description of the `Makefile` and its targets, as well as the `scripts/` directory and contents - Update troubleshooting recommendations to match - Update descriptions of the `doc/` directory contents: - In particular, remove note about removing PDFs from the repository. This was discussed out-of-band, and we aligned on leaving the PDFs around for developer benefit while browsing the repository - Remove the no-longer-necessary disclaimer about packaging code
This reduces the 'escaped' dollar signs and produces more sensible output (i.e. the output shows the result of expansion).
This Python package implements a new Sphinx role, download-dir, that can be used to indicate that a referenced directory should be archived and made downloadable. For example: If we write {download-dir}`/my/cool/dir`, the role attempts to create `$SOURCEDIR/my/cool/dir.tar.gz` from the contents of `$SOURCEDIR/my/cool/dir`, where $SOURCEDIR is the Sphinx project root (for us, `saw-script/doc/`). In short, this means that you can only create downloads out of directories somewhere under in `doc/`, including `doc/` itself. As is Sphinx's convention, then, paths starting with `/` indicate that the path is rooted at $SOURCEDIR, and relative paths are relative to the document where the role was used. The role also supports {download-dir}`links with explicit titles <thing/to/download>`; when no title is provided, the link will be the name of the archive (i.e. the stem of the target directory + ".tar.gz"). This is an improvement over our current solution, which is inflexible (requiring that only specific directories be used as download targets) and doesn't fully utilize Sphinx's/docutils's extensible role framework (instead using scripts).
And, remove a spurious comment. This makes sphinx-download-dir available in the build environment created by scripts/setup_env.sh.
This will keep workspaces clean when building locally, and encourages handling the archiving process via sphinx-download-dir. There is currently one exception in the Rust tutorial sample code, where a tar.gz of a large JSON file is provided. Of course, one can use git add -f to commit additional archives as necessary.
This replaces uses of MyST's <path:...> link format used to generate download links to individual files with uses of the role introduced by the sphinx-download-dir package.
Given that we now use sphinx-download-dir (which handles the archiving of code samples during the Sphinx build), we have no need for the packaging script / make target. Furthermore, we make the mostlyclean target interactive in its deletion of tar.gz archives, to prevent unintentionally deleting a checked-in archive that is part of a sample directory (the Rust tutorial is a case like this). This is mostly necessary since we no longer force the sample directory / archive to have a specific name, and simply attempt to clean up _all_ tar.gz archives in and under `doc/`.
Mostly focuses on the effects on users (namely, that the PDFs they get with releases have changed a bit).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I believe I addressed all of the review comments, and in addition I was able to further clean things up by replacing the somewhat hacked-together code sample packaging with a flexible Sphinx role.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more minutiae, and there's one leftover from the last batch, but in general looks good. Thanks for taking care of all this!
@@ -49,16 +49,6 @@ class SAWScriptLexer(RegexLexer): | |||
bygroups(Keyword.Type, Whitespace), | |||
), | |||
(r"(true|false|abc|z3)(\s+)", bygroups(Keyword.Constant, Whitespace)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not to kibitz further or anything 😸 but abc
and z3
should probably get dropped too.
(or add the other solvers, but the full list is pretty large... there's eight each with four variant calls, plus or minus, plus a few addenda)
I think what we should do in the long run is either have saw dump out info from the primitives table, or generate both this and the primitives table from an external definition file. But right now the primitives table doesn't have enough info to make this worthwhile.
(Also what I should probably do at some point is run pygments manually on a bunch of existing proof code and fiddle with the coloring until it looks nice. FUTURE...)
$(info Tidy everything, including Python venv: 'make clean') | ||
|
||
sphinx-help: setup-env | ||
@. .venv/bin/activate && $(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you wrap this (and the other one) in parens so it runs in a subshell?
That is,
@(. ./venv/bin/activate && ...)
The reason is that the lifetime of the shells spawned by make is unspecified, and while it's de facto ok (especially with gmake) ... the extra fork doesn't cost anything important, and accidentally leaking the venv into some other command that wasn't supposed to have it could end up extremely confusing.
@@ -0,0 +1 @@ | |||
SOURCE_DATE_EPOCH=1743465600 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you put a comment here indicating what it expands to?
(Also it might be nice to note that you can generate the stamp with date +%s -d 20250401
)
(although apparently this works on BSD and Linux but not MacOS)
(and it might be better to put this in scripts/epoch.mk or something like that)
This can be used to provide downloads in the HTML rendering of the | ||
documentation (see [below](#code-examples)). | ||
- `install-pdf`: Build and install PDF renderings to `doc/pdfs/`. | ||
- `mostlyclean`: Clear out all packaged code and Sphinx-generated files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preference for these two would be clean
and distclean
respectively, since the general convention is that clean
removes build products and distclean
also removes configuration. FWIW, may not matter...
@@ -2,4 +2,4 @@ __pycache__ | |||
*.egg-info | |||
|
|||
_build | |||
code.tar.gz | |||
*.tar.gz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears .venv didn't get in here
pdf: latexpdf | ||
|
||
install-pdf: latexpdf | ||
mkdir -p pdfs | ||
cp -f $(shell ls _build/latex/*.pdf | sed '/\/galois\.pdf$$/d') pdfs | ||
|
||
mostlyclean: | ||
rm -i $(shell find . -path ./$(BUILDDIR) -prune -o -name '*.tar.gz' -print) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is desirable (for one thing the CI ought to be able to run this; even if we don't need it now it's bound to come up eventually...)
But also, if you put *.tar.gz in .gitignore, git will get all upset and/or confused about having one checked in.
The one preexisting .tar.gz file contains a single file and could just be a .gz file instead. (Unless there's some specific reason to also use tar. @RyanGlScott?) Alternatively we could put the generated downloads somewhere else, e.g. in their own downloads/ subdir.
every Python under the sun). | ||
|
||
This is enough to generate the HTML versions of documentation, which is what is | ||
deployed to <https://galoisinc.github.io/saw-script>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this accidentally got dropped again; the current version of the README doesn't have it. Can you stick something like "By default the build will configure and download a Python virtual environment in .venv; to set this up explicitly beforehand, run make setup-env" at the bottom of ### Prerequisites
?
Overview
This is a pretty large PR, but is focused entirely on the contents and structure of
doc
.TL;DR for reviewers: The following commits/commit ranges correspond to relatively high-level changes to SAW's documentation structure:
Moves 'internal' documentation out of the way. A temporary measure modified later to include this documentation in the overall hierarchy.
Initial Markdown normalization and cleanup of
doc/
's stale files.Refactors the LLVM/Java tutorial into multiple Markdown sources, so maintenance is easier.
Similar to the above, but for the Rust tutorial
Ditto for the SAW User Manual.
The tutorials/manual are given more memorable names.
Rename 'internal' to 'development', and structure these documents to be included in the overall hierarchy.
Adds all of the necessary
index.md
files to tie the documentation together.Implements a
pygments
lexer for SAWScript (see full commit message for details - this can be removed eventually).Prepares
doc/
for a Sphinx installation.Add helper scripts to
doc/
to build the documentation.Add newly-rendered PDFs of existing resources.
Document the documents :)
TODOs not covered in this PR
Resolving pointers to
doc/
that exist elsewhere insaw-script
as appropriateMoving other existing docs to
doc/
(see the README for details)Deploying documentation to the GitHub pages site for
saw-script
in CIThe Cryptol setup to get this working with multiple versions is... complicated.
I think we can adapt this SO answer for our purposes - it's far simpler than what Cryptol does currently, and should probably be considered as a solution in that context as well.
Replacing in-repo PDFs with appropriate CI/Action artifacts
Related issues
java_symexec
in doc/manual/code #2185.