Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Css3 parser tryout #286

Closed
wants to merge 9 commits into from
Closed

Css3 parser tryout #286

wants to merge 9 commits into from

Conversation

jaytaph
Copy link
Member

@jaytaph jaytaph commented Nov 27, 2023

This PR is a css3 parser created as much as possible according the specs to gain more knowledge about the css specification. Main goal would be fill in the blanks in the current css3 parser that we current have (new_parser). This PR tries to follow the specifications as close as possible.

Main entrypoints are the parse_* functions. There are multiple depending on what you like to parse (i'm not 100% sure how this actually works).
The consume_* functions are the parts that will consume tokens and return css structures like Rules, QualifiedValues etc.

@jaytaph jaytaph marked this pull request as draft November 27, 2023 08:17
Copy link
Member

@Sharktheone Sharktheone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't say too much about the functionality and the nom part

src/css3.rs Outdated
Comment on lines 37 to 41
if let ComponentValue::PreservedToken(_) = v {
true
} else {
false
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if let ComponentValue::PreservedToken(_) = v {
true
} else {
false
}
matches!(v, ComponentValue::PreservedToken(_))

src/css3.rs Outdated
} else {
panic!("we should not be here");
}
}).collect::<Vec<Token>>().into();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, you can remove the into here

Suggested change
}).collect::<Vec<Token>>().into();
}).collect::<Vec<Token>>();

src/css3.rs Outdated
Comment on lines 59 to 58
if self.index >= self.tokens.len() {
return Token::Eof;
}

self.tokens[self.index].clone()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probably change this to

Suggested change
if self.index >= self.tokens.len() {
return Token::Eof;
}
self.tokens[self.index].clone()
self.tokens.get(self.index).unwrap_or(&Token::Eof).clone()

or if you don't want to clone the Token::Eof you can use

Suggested change
if self.index >= self.tokens.len() {
return Token::Eof;
}
self.tokens[self.index].clone()
if let Some(token) = self.tokens.get(self.index) {
token.clone()
} else {
Token::Eof
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(same for the others)

src/css3.rs Outdated
// =================================================================================================

pub struct CSS3ParserTng<'a> {
token_stream: &'a mut dyn TokenStreamer,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When i think about this, it's probably better to use Nico's suggestion. (with the lifetime in the Box) because then you can store the TokenStreamer actually in the struct. If it doesn't matters, you can leave it as it is.

src/css3.rs Outdated
tmp_input.push(token);
}
ComponentValue::Function(_function) => {
panic!("we should not have a function here");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume, you'll change this to a be a parser error in the future?

src/css3.rs Outdated
Comment on lines 818 to 820
pub fn display_tree(&self) {
self.display_tree_node(&self.tree, 0);
}

fn display_tree_node(&self, node: &Node, level: usize) {
println!("{}{}: {:?}", " ".repeat(level), node.name, node.attributes);
for child in &node.children {
self.display_tree_node(&child, level + 1);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn display_tree(&self) {
self.display_tree_node(&self.tree, 0);
}
fn display_tree_node(&self, node: &Node, level: usize) {
println!("{}{}: {:?}", " ".repeat(level), node.name, node.attributes);
for child in &node.children {
self.display_tree_node(&child, level + 1);
}
}
pub fn display_tree(&self) {
Self::display_tree_node(&self.tree, 0);
}
fn display_tree_node(node: &Node, level: usize) {
println!("{}{}: {:?}", " ".repeat(level), node.name, node.attributes);
for child in &node.children {
Self::display_tree_node(&child, level + 1);
}
}

src/css3.rs Outdated
Comment on lines 875 to 865
loop {
let t = input[idx].clone();
idx += 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you somewhere exiting this loop? Because else it can (and probably will) panic because of an overflow. Maybe I miss something. (and please add the semicolons after the push and maybe change the double quotes to single ones in line 880)

src/css3/ast.rs Outdated

impl InputTake for Span<'_> {
fn take(&self, count: usize) -> Self {
let tokens = &self.0[0..count];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you give it a too high count here, it will panic


impl TokenStreamer for Tokenizer<'_> {
fn current(&self) -> Token {
if self.position == 0 || self.position > self.tokens.len() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if self.position == 0 || self.position > self.tokens.len() {
if self.position == 0 || self.position >= self.tokens.len() {

panic-alert

Comment on lines 69 to 78
let node = match combinator {
Some(combinator) => {
let mut node = Node::new("Combinator");
node.children.push(combinator);
node.children.push(selector);
node
}
None => {
selector
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if let for idiomacy

fn slice_index(&self, _count: usize) -> Result<usize, nom::Needed> {
todo!()
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want a simple unit test to work against, e.g., here. I would not attempt this purely from first principles, myself.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind. I see you have a unit test down below.

Instead of filtering out componentvalues that are not tokens, we use all
component values. This means that the Span does not contain just tokens,
but componentvalues. This allows us to detect functions etc.
@jaytaph
Copy link
Member Author

jaytaph commented Dec 11, 2023

Old tryout. Closing in favor of a new setup

@jaytaph jaytaph closed this Dec 11, 2023
@jaytaph jaytaph deleted the css-tmp-parser branch February 29, 2024 13:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants