feat: better structured headings #134
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,7 +4,9 @@ | |
// - Rule Order: Tree-sitter will prefer the token that appears earlier in the | ||
// grammar. | ||
// | ||
// https://tree-sitter.github.io/tree-sitter/creating-parsers | ||
// https://github.com/nvim-treesitter/nvim-treesitter/wiki/Parser-Development | ||
// - Visibility: Prefer JS regex (/\n/) over literals ('\n') unless it should be | ||
// exposed to queries as an anonymous node. | ||
// - Rules starting with underscore are hidden in the syntax tree. | ||
|
||
/// <reference types="tree-sitter-cli/dsl" /> | ||
|
@@ -16,6 +18,11 @@ const _li_token = /[-•][ ]+/; | |
module.exports = grammar({ | ||
name: 'vimdoc', | ||
|
||
conflicts: $ => [ | ||
[$._line_noli, $._column_heading], | ||
[$._column_heading], | ||
], | ||
|
||
extras: () => [/[\t ]/], | ||
|
||
// inline: ($) => [ | ||
|
@@ -135,14 +142,14 @@ module.exports = grammar({ | |
'>', | ||
choice( | ||
alias(token.immediate(/[a-z0-9]+\n/), $.language), | ||
token.immediate('\n')), | ||
token.immediate(/\n/)), | ||
alias(repeat1(alias($.line_code, $.line)), $.code), | ||
// Codeblock ends if a line starts with non-whitespace. | ||
// Terminating "<" is consumed in other rules. | ||
)), | ||
|
||
// Lines. | ||
_blank: () => field('blank', '\n'), | ||
_blank: () => field('blank', /\n/), | ||
line: ($) => choice( | ||
$.column_heading, | ||
$.h1, | ||
|
@@ -156,18 +163,18 @@ module.exports = grammar({ | |
optional(token.immediate('<')), // Treat codeblock-terminating "<" as whitespace. | ||
_li_token, | ||
choice( | ||
alias(seq(repeat1($._atom), '\n'), $.line), | ||
alias(seq(repeat1($._atom), /\n/), $.line), | ||
seq(alias(repeat1($._atom), $.line), $.codeblock), | ||
), | ||
repeat(alias($._line_noli, $.line)), | ||
)), | ||
// Codeblock lines: must be indented by at least 1 space/tab. | ||
// Line content (incl. whitespace) is captured as a single atom. | ||
line_code: () => choice('\n', /[\t ]+[^\n]+\n/), | ||
line_code: () => choice(/\n/, /[\t ]+[^\n]+\n/), | ||
_line_noli: ($) => seq( | ||
choice($._atom_noli, $._uppercase_words), | ||
repeat($._atom), | ||
choice($.codeblock, '\n') | ||
choice($.codeblock, /\n/) | ||
), | ||
|
||
// Modeline: must start with "vim:" (optionally preceded by whitespace) | ||
|
@@ -177,31 +184,38 @@ module.exports = grammar({ | |
// Intended for table column names per `:help help-writing`. | ||
// TODO: children should be $.word (plaintext), not $.atom. | ||
column_heading: ($) => seq( | ||
field('name', seq(choice($._atom_noli, $._uppercase_words), repeat($._atom))), | ||
'~', | ||
token.immediate('\n'), | ||
alias($._column_heading, $.heading), | ||
alias('~', $.delimiter), | ||
token.immediate(/\n/), | ||
), | ||
// aliasing a seq exposes every item separately: create hidden rule and alias that | ||
_column_heading: $ => prec.dynamic(1, seq( | ||
choice($._atom_noli, $._uppercase_words), | ||
repeat($._atom) | ||
)), | ||
|
||
h1: ($) => | ||
seq( | ||
token.immediate(field('delimiter', /============+[\t ]*\n/)), | ||
repeat1($._atom), | ||
'\n', | ||
), | ||
prec(1, seq( | ||
I believe TS considers declaration order as part of precedence. So possibly we could avoid
I could make it tighter, but some precedence is needed to resolve the conflict between heading and possible taglinks.
Only for terminals, e.g. string literals and regex patterns, and strings are higher than regex patterns by default.
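For reference, the tie-break order this thread touches on can be sketched in plain JavaScript. This is a minimal illustration, not tree-sitter's implementation: it assumes the order given in the tree-sitter "Conflicting Tokens" documentation (lexical precedence, then match length, then string literal over regex, then rule order) and leaves out context-aware lexing, which the docs list first. The function `pickToken` and the candidate fields (`precedence`, `length`, `isString`, `order`) are hypothetical names. It applies to terminals only; a `prec(...)` around a non-terminal such as `h1` resolves parse conflicts instead.

```javascript
// Sketch: choose one token among several that match at the same position.
// All names here are hypothetical; only the ordering follows the docs.
function pickToken(candidates) {
  return [...candidates].sort((a, b) =>
    (b.precedence - a.precedence) ||               // 1. higher lexical precedence
    (b.length - a.length) ||                       // 2. longer match
    (Number(b.isString) - Number(a.isString)) ||   // 3. string literal over regex
    (a.order - b.order)                            // 4. earlier in the grammar
  )[0];
}

// A literal '\n' and a regex /\n/ matching the same single character:
// the literal wins on specificity even though it is declared later.
const newlineWinner = pickToken([
  { name: 'regex_newline', precedence: 0, length: 1, isString: false, order: 0 },
  { name: 'literal_newline', precedence: 0, length: 1, isString: true, order: 1 },
]);

// Two regexes with equal precedence and length: declaration order decides.
const regexWinner = pickToken([
  { name: 'later_regex', precedence: 0, length: 3, isString: false, order: 5 },
  { name: 'earlier_regex', precedence: 0, length: 3, isString: false, order: 2 },
]);

console.log(newlineWinner.name, regexWinner.name);
```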
||
alias(token.immediate(/============+[\t ]*\n/), $.delimiter), | ||
alias(repeat1($._atom), $.heading), | ||
optional(seq($.tag, repeat($._atom))), | ||
Comment on lines +200 to +201
everything before the first tag is
Depends. If we use a common node (not name!) for
And yes, I tried to be consistent across headings (even column headings, which as usual is a massive headache).
||
/\n/, | ||
)), | ||
|
||
h2: ($) => | ||
seq( | ||
token.immediate(field('delimiter', /------------+[\t ]*\n/)), | ||
repeat1($._atom), | ||
'\n', | ||
), | ||
prec(1, seq( | ||
alias(token.immediate(/------------+[\t ]*\n/), $.delimiter), | ||
alias(repeat1($._atom), $.heading), | ||
optional(seq($.tag, repeat($._atom))), | ||
/\n/, | ||
)), | ||
|
||
// Heading 3: UPPERCASE NAME, followed by optional *tags*. | ||
h3: ($) => | ||
seq( | ||
field('name', $.uppercase_name), | ||
alias($.uppercase_name, $.heading), | ||
optional(seq($.tag, repeat($._atom))), | ||
'\n', | ||
/\n/, | ||
), | ||
|
||
tag: ($) => _word($, | ||
|
JS regexes are for parsing purposes; only use literals if you want to expose them as anonymous nodes to queries.
regex has lower priority than literals (oh but that's "only for terminals")
It's not about priority; it's what gets exposed to queries. "Hiding" stuff from queries is the primary way of keeping parser size down (and performance up).
Nice. Would be useful to add that tip here:
tree-sitter-vimdoc/grammar.js
Lines 2 to 5 in ce5ea84
Tips at home: https://github.com/nvim-treesitter/nvim-treesitter/wiki/Parser-Development
Oh, I see what you mean here. But here it's fine since the `token.immediate` takes care of it. (I tested it.)
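To make the literal-versus-regex tip from this thread concrete, here is a small sketch that lists the terminals of a rule by how they were written. It assumes the node shapes that `tree-sitter generate` writes to `src/grammar.json` (`STRING`, `PATTERN`, `SEQ`, `CHOICE`, and single-child wrappers with a `content` field); the helper `collectTerminals` and the two sample rules are hypothetical and hand-written, not taken from the generated file. Per the reviewer's tip, the entries under `literals` are the ones a query could match as anonymous nodes; the sketch makes no claim about terminals wrapped in `token.immediate`.

```javascript
// Sketch: split a rule's terminals into string literals and regex patterns.
function collectTerminals(rule, out = { literals: [], patterns: [] }) {
  switch (rule.type) {
    case 'STRING':
      out.literals.push(rule.value);
      break;
    case 'PATTERN':
      out.patterns.push(rule.value);
      break;
    case 'SEQ':
    case 'CHOICE':
      rule.members.forEach((member) => collectTerminals(member, out));
      break;
    default:
      // Wrappers such as REPEAT1, ALIAS, FIELD, PREC carry one child in `content`;
      // leaves such as SYMBOL have none and are skipped.
      if (rule.content) collectTerminals(rule.content, out);
  }
  return out;
}

// Hand-written equivalent of seq(repeat1($._atom), '\n') (before the change).
const before = {
  type: 'SEQ',
  members: [
    { type: 'REPEAT1', content: { type: 'SYMBOL', name: '_atom' } },
    { type: 'STRING', value: '\n' },
  ],
};

// Hand-written equivalent of seq(repeat1($._atom), /\n/) (after the change).
const after = {
  type: 'SEQ',
  members: [
    { type: 'REPEAT1', content: { type: 'SYMBOL', name: '_atom' } },
    { type: 'PATTERN', value: '\\n' },
  ],
};

const beforeTerminals = collectTerminals(before);
const afterTerminals = collectTerminals(after);
console.log(beforeTerminals, afterTerminals);
```

Running a helper like this over a whole grammar before and after a change gives a quick list of which terminals moved from literal to pattern form.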