As far as I can tell, the tokens appear in the following ways:
- as constants on the `SymbolKind` struct
- wrapped in `SymbolKind` in the array `SymbolKind::VALUES_` (which seems to just emulate an enum…?)
- as constants on the `Lexer`
In the calculator example in the docs/test, the lexer and parser only needed to use the constants on Lexer, but more sophisticated projects might need tokens enumerated in the Value or Token types, which is another list.
I think there is a simpler way. Suppose we instead have a Symbol enumeration that has all of the tokens as variants as in the following:
```rust
// Use, e.g., the `enum-primitive-derive` crate for i32<->enum conversion.
#[derive(Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Debug, Primitive)]
#[repr(i32)]
pub enum Symbol {
    YYEmpty,
    YYEOF,
    YYerror,
    YYUNDEF,
    // ⋮ Whatever other "utility" variants are needed,
    // ⋮ so long as there are a statically known number of them.
    UserTerminalToken1,
    UserTerminalToken2,
    UserTerminalToken3,
    // ⋮ All other terminal tokens the user declared in the spec file.
    UserTerminalTokenN,
    UserNonterminalSymbol1,
    UserNonterminalSymbol2,
    UserNonterminalSymbol3,
    // ⋮ All other nonterminal symbols from the spec file.
    UserNonterminalSymbolM,
}
```
This enum is generated but can be used by the lexer or whatever other code might need it. Also, simple translation/conversion functions would be generated as in the following:
```rust
impl Symbol {
    pub fn yychar_value(&self) -> i32 {
        match *self {
            Symbol::YYEmpty => -2,
            Symbol::YYEOF => 0,
            // ⋮ Whatever other "special" values there are.
            Symbol::YYerror => 256,
            Symbol::YYUNDEF => 257,
            // This constant 258 should be statically known.
            // It is the first token value for yychar.
            other => (other as i32) - (Symbol::UserTerminalToken1 as i32) + 258,
        }
    }

    pub fn yytoken_value(&self) -> i32 {
        *self as i32
    }

    /// The inverse of the `Symbol::yychar_value()` function.
    pub fn from_yychar(yychar: i32) -> Symbol {
        match yychar {
            -2 => Symbol::YYEmpty,
            0 => Symbol::YYEOF,
            // ⋮ Whatever other "special" values there are.
            i if i < 256 => Symbol::YYUNDEF,
            256 => Symbol::YYerror,
            257 => Symbol::YYUNDEF,
            i if i <= 256 + YYNTOKENS_ => {
                Symbol::from_i32(i - 258 + (Symbol::UserTerminalToken1 as i32)).unwrap()
            }
            _ => Symbol::YYUNDEF,
        }
    }

    pub fn name(&self) -> &'static str {
        yynames_[(*self as i32) as usize]
    }
}
```
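To make the offset arithmetic concrete, here is a minimal self-contained sketch of the same round trip for a made-up grammar with two terminals and one nonterminal. The variants `Num`, `Plus`, and `Expr`, the constant `FIRST_USER_TOKEN`, and the hand-written `from_i32` (standing in for what a derive crate would provide) are all assumptions for illustration, not what the generator emits today. Mapping nonterminals to the undefined code is likewise just one possible choice.

```rust
/// Assumed first external token number for user tokens
/// (256 and 257 are taken by the error and undefined tokens).
const FIRST_USER_TOKEN: i32 = 258;

/// Hypothetical generated enum; discriminants are implicit, as in the proposal.
#[derive(Copy, Clone, Eq, PartialEq, Debug)]
#[repr(i32)]
pub enum Symbol {
    YYEmpty,
    YYEOF,
    YYerror,
    YYUNDEF,
    Num,  // first user terminal
    Plus, // last user terminal
    Expr, // first nonterminal
}

impl Symbol {
    /// Hand-written stand-in for a derived i32 -> enum conversion.
    fn from_i32(n: i32) -> Option<Symbol> {
        const ALL: [Symbol; 7] = [
            Symbol::YYEmpty,
            Symbol::YYEOF,
            Symbol::YYerror,
            Symbol::YYUNDEF,
            Symbol::Num,
            Symbol::Plus,
            Symbol::Expr,
        ];
        if n < 0 {
            None
        } else {
            ALL.get(n as usize).copied()
        }
    }

    /// External (yychar-style) number of a symbol.
    pub fn yychar_value(self) -> i32 {
        match self {
            Symbol::YYEmpty => -2,
            Symbol::YYEOF => 0,
            Symbol::YYerror => 256,
            Symbol::YYUNDEF => 257,
            // User terminals are offset from the first user terminal.
            other if other as i32 <= Symbol::Plus as i32 => {
                other as i32 - Symbol::Num as i32 + FIRST_USER_TOKEN
            }
            // Nonterminals have no external number in this sketch.
            _ => 257,
        }
    }

    /// Inverse of `yychar_value` for terminals and special symbols.
    pub fn from_yychar(yychar: i32) -> Symbol {
        let last_user_token = FIRST_USER_TOKEN + (Symbol::Plus as i32 - Symbol::Num as i32);
        match yychar {
            -2 => Symbol::YYEmpty,
            0 => Symbol::YYEOF,
            256 => Symbol::YYerror,
            i if i >= FIRST_USER_TOKEN && i <= last_user_token => {
                Symbol::from_i32(i - FIRST_USER_TOKEN + Symbol::Num as i32)
                    .unwrap_or(Symbol::YYUNDEF)
            }
            _ => Symbol::YYUNDEF,
        }
    }
}
```

With this layout, `Symbol::Num.yychar_value()` is 258 and `Symbol::from_yychar(259)` gives back `Symbol::Plus`, so the parser could store only the `Symbol` and compute the external number on demand.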
This has the advantages of:
- moving the consts in `Lexer` to a dedicated enum
- moving the consts in `SymbolKind` to a dedicated enum, eliminating the need for `SymbolKind` and all the `SymbolKind::get()` calls
- making the `yychar` variable redundant altogether, replacing each read of `yychar` with, for example, `yytoken.yychar_value()`
- making `yytranslate_()` and `yytranslate_table_` unnecessary
- eliminating `Lexer::TOKEN_NAMES` (which I think is redundant anyway...?)
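As a rough illustration of the first and last points, a lexer written against such an enum could return variants directly and take display names from the enum itself. Everything below is a hypothetical stand-in rather than the generated API: the cut-down `Symbol`, the `NAMES` table (playing the role of `yynames_`), and the `lex` function.

```rust
/// Hypothetical cut-down symbol enum with only terminals.
#[derive(Copy, Clone, Eq, PartialEq, Debug)]
#[repr(i32)]
pub enum Symbol {
    YYEOF,
    YYUNDEF,
    Num,
    Plus,
}

/// Stand-in for the generated name table, indexed by discriminant.
const NAMES: [&'static str; 4] = ["end of file", "invalid token", "number", "+"];

impl Symbol {
    pub fn name(self) -> &'static str {
        NAMES[self as usize]
    }
}

/// Toy lexer that yields `Symbol` values instead of integer constants.
pub fn lex(input: &str) -> Vec<Symbol> {
    let mut out = Vec::new();
    let mut chars = input.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_ascii_digit() {
            // Consume the whole run of digits as a single number token.
            while chars.peek().map_or(false, |d| d.is_ascii_digit()) {
                chars.next();
            }
            out.push(Symbol::Num);
        } else {
            chars.next();
            out.push(if c == '+' { Symbol::Plus } else { Symbol::YYUNDEF });
        }
    }
    out.push(Symbol::YYEOF);
    out
}
```

Here the lexer needs neither its own token constants nor a separate names list; both come from the one enum.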
I am not sure I have all the details correct in the code above, but it seems to me that something like this should work.