Add nodes that mark the content of string literals #155

MichaelHatherly
Contributor

This adds nodes within string literals and command literals that mark the actual contents of the string itself. This is useful for writing queries that need to act on just the contents, rather than the entire literal including the quotes, e.g. language injections in Neovim.

Previously you would have to add `#offset!` calls to the queries that perform the language injections. Now you can match against the `string_content` and `raw_string_content` nodes directly and avoid computing offsets manually.
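
For illustration only, here is a rough Python sketch of the kind of offset arithmetic that a dedicated content node makes unnecessary. `content_span` is a hypothetical helper, not part of tree-sitter or of this grammar, and it assumes a literal with an optional prefix and no suffix; escapes and interpolation are ignored.

```python
def content_span(literal: str) -> tuple[int, int]:
    """Return (start, end) offsets of the text between the delimiters.

    Hypothetical helper: handles an optional prefix (the `sql` in sql"...")
    and single or triple delimiters made of `"` or backticks.
    """
    # Everything before the first delimiter character is the prefix.
    start = next(i for i, ch in enumerate(literal) if ch in '"`')
    quote = literal[start]
    # A triple delimiter needs three opening and three closing characters.
    remaining = len(literal) - start
    width = 3 if literal.startswith(quote * 3, start) and remaining >= 6 else 1
    return start + width, len(literal) - width


src = 'sql"""SELECT 1"""'
begin, end = content_span(src)
print(src[begin:end])  # SELECT 1
```

With a content node in the tree, a query can capture that node and use its range as-is, so none of this delimiter counting has to be reproduced in the query.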


Some TODOs:

  • naming: I've gone with what seemed reasonable, `string_content` and `raw_string_content`. I couldn't come up with a nice way to unify the node types. Open to suggestions there.
  • must I commit a `tree-sitter generate` to this PR, or is that done separately before a tag is done?
  • even though this has touched a bunch of test files, I'll still add more tests for additional edge cases if the general approach taken in the implementation is suitable.

@savq
Collaborator

savq commented Oct 13, 2024

Thanks for the PR.

I worked on refactoring string parsing in 803b97f, mainly to fix a couple of bugs, but also to remove the serialization stuff from the scanner. The newer version would conflict somewhat with your suggested string_content rule.

I'm wondering if it'd be enough to alias all the `_content_*` rules to a visible rule, since most injections are for macro strings that don't have interpolations.

> must I commit a tree-sitter generate to this PR, or is that done separately before a tag is done?

Yes, usually each PR should include its generated parser.c (that's why some PRs are so big). Also, I haven't merged #153 because my homebrew install of tree-sitter got messed up somehow 😖 but I'll fix that next week.

@MichaelHatherly
Contributor Author

> The newer version would conflict somewhat with your suggested

Feel free then to just pull in what changes you would want from this branch 👍

> I'm wondering if it'd be enough to alias all the `_content_*` rules to a visible rule. Since most injections are for macro strings that don't have interpolations.

Yeah, that might be enough. So long as it allows some way to get at just the string contents, that'll be fine.

@savq savq mentioned this pull request Nov 5, 2024
@savq savq closed this in #153 Nov 5, 2024