Incremental parsers for node
npm install tree-sitter
First, you'll need a Tree-sitter grammar for the language you want to parse. There are many existing grammars such as tree-sitter-javascript and tree-sitter-go. You can also develop a new grammar using the Tree-sitter CLI.
Once you've got your grammar, create a parser with that grammar.
const Parser = require('tree-sitter');
const JavaScript = require('tree-sitter-javascript');
const parser = new Parser();
parser.setLanguage(JavaScript);
Then you can parse some source code,
const sourceCode = 'let x = 1; console.log(x);';
const tree = parser.parse(sourceCode);
and inspect the syntax tree.
console.log(tree.rootNode.toString());
// (program
// (lexical_declaration
// (variable_declarator (identifier) (number)))
// (expression_statement
// (call_expression
// (member_expression (identifier) (property_identifier))
// (arguments (identifier)))))
const callExpression = tree.rootNode.child(1).firstChild;
console.log(callExpression);
// { type: 'call_expression',
// startPosition: {row: 0, column: 16},
// endPosition: {row: 0, column: 30},
// startIndex: 0,
// endIndex: 30 }
If your source code changes, you can update the syntax tree. This will take less time than the first parse.
// Replace 'let' with 'const'
const newSourceCode = 'const x = 1; console.log(x);';
tree.edit({
startIndex: 0,
oldEndIndex: 3,
newEndIndex: 5,
startPosition: {row: 0, column: 0},
oldEndPosition: {row: 0, column: 3},
newEndPosition: {row: 0, column: 5},
});
const newTree = parser.parse(newSourceCode, tree);
If your text is stored in a data structure other than a single string, you can parse it by supplying a callback to parse
instead of a string:
const sourceLines = [
'let x = 1;',
'console.log(x);'
];
const tree = parser.parse((index, position) => {
let line = sourceLines[position.row];
if (line) {
return line.slice(position.column);
}
});