Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xml5ever: Bubble parser blocking scripts to the caller instead of the TreeSink #591

Merged

Conversation

simonwuelker
Copy link
Contributor

@simonwuelker simonwuelker commented Apr 1, 2025

This is the same approach used by html5ever, which will hopefully allow unifying their API a bit as part of the effort for encoding support.

This is a breaking change. The TreeSink::complete_script method (among other things) is removed.

@simonwuelker
Copy link
Contributor Author

The difference between the way html5ever and xml5ever block on script elements is best explained by looking at by the feed method for both parsers in servo:

The html parser pushes the input into the tokenizer and waits to see if any blocking scripts come back: https://github.com/servo/servo/blob/0d38d6239a4356928c2a7ec0a3cd53ecc32096b0/components/script/dom/servoparser/html.rs#L87-L94

The xml parser pushes the input into the tokenizer, waits until it stops and then checks whether the sink has received any blocking scripts: https://github.com/servo/servo/blob/0d38d6239a4356928c2a7ec0a3cd53ecc32096b0/components/script/dom/servoparser/xml.rs#L45-L51

@simonwuelker simonwuelker force-pushed the bubble-scripts-up-in-xml5ever branch 4 times, most recently from 76fa418 to c185ee9 Compare April 2, 2025 00:10
@simonwuelker simonwuelker marked this pull request as ready for review April 2, 2025 00:42
Comment on lines +66 to +67
// FIXME: Properly support </script> somehow.
let _ = self.tokenizer.feed(&self.input_buffer);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is okay, because it is exactly what html5ever does too:

self.input_buffer.push_back(t);
// FIXME: Properly support </script> somehow.
while let TokenizerResult::Script(_) = self.tokenizer.feed(&self.input_buffer) {}

Copy link
Member

@mrobinson mrobinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to me, though perhaps someone more familiar with html5ever should review.

… TreeSink

This is the same approach used by html5ever, which will hopefully allow
unifying their API a bit as part of the effort for encoding support.

This is a breaking change. The TreeSink::complete_script method is
removed.
@simonwuelker simonwuelker force-pushed the bubble-scripts-up-in-xml5ever branch from c185ee9 to 6ef6986 Compare April 2, 2025 10:31
Copy link
Member

@jdm jdm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Thanks for digging into this.

@simonwuelker simonwuelker added this pull request to the merge queue Apr 4, 2025
Merged via the queue into servo:main with commit 2aeedb7 Apr 4, 2025
6 checks passed
simonwuelker added a commit to simonwuelker/html5ever that referenced this pull request Apr 4, 2025
This should have been done as part of servo#591.
I forgot that servo depends on the crates.io version of html5ever, not
the git repository. So to upgrade in servo we need to release new
versions of all three crates.

Signed-off-by: Simon Wülker <[email protected]>
simonwuelker added a commit to simonwuelker/html5ever that referenced this pull request Apr 5, 2025
This should have been done as part of servo#591.
I forgot that servo depends on the crates.io version of html5ever, not
the git repository. So to upgrade in servo we need to release new
versions of all three crates.

Signed-off-by: Simon Wülker <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Apr 5, 2025
This should have been done as part of #591.
I forgot that servo depends on the crates.io version of html5ever, not
the git repository. So to upgrade in servo we need to release new
versions of all three crates.

Signed-off-by: Simon Wülker <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants