Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Script element mid-parse protection mechanism #525

Open
lukewarlow opened this issue Jun 12, 2024 · 10 comments
Open

Script element mid-parse protection mechanism #525

lukewarlow opened this issue Jun 12, 2024 · 10 comments
Labels
Milestone

Comments

@lukewarlow
Copy link
Member

This issue is here to track the specific spec mechanism for protecting against a script that's edited mid-parse.

To avoid needing to look all over here's a summary. As of #499 all parsed scripts are trusted. The problem with this approach is that a script elements contents can be changed from JS before it finishes parsing. This is case tested by the first subtest of https://wpt.live/trusted-types/HTMLScriptElement-internal-slot.html and https://wpt.live/trusted-types/SVGScriptElement-internal-slot.html.

Chromium and WebKit's approach is to store a boolean flag on the HTML/SVGScriptElement which gets set to true, if the children of the script element are changed by an API. This flag is then checked before setting the script text data.

We cannot spec this directly due to the DOM specs children changed steps not providing the information as to whether something is changed by an API or parser. whatwg/dom#1288 - has been raised to discuss changes to the DOM spec which would be needed for this.

@lukewarlow lukewarlow added this to the v1 milestone Jun 12, 2024
@lukewarlow
Copy link
Member Author

cc @mbrodesser-Igalia it would be good if you or someone else from mozilla could provide input on whether Gecko has a similar concept to WebKit and Chromium here. If they do then it makes the case for standardising this, and specifically this protection mechanism better.

@smaug----
Copy link
Collaborator

Gecko doesn't have such concept.

@lukewarlow
Copy link
Member Author

What would gecko do in this situation then? Do you know how gecko would protect from such a situation?

@mbrodesser-Igalia
Copy link
Collaborator

Gecko doesn't have such concept.

To be explicit: it seems (https://searchfox.org/mozilla-central/search?q=children.*changed&path=&case=false&regexp=true) Gecko doesn't implement the children changed steps.

@mbrodesser-Igalia
Copy link
Collaborator

Gecko doesn't have such concept.

To be explicit: it seems (https://searchfox.org/mozilla-central/search?q=children.*changed&path=&case=false&regexp=true) Gecko doesn't implement the children changed steps.

For the record: Gecko implements the children changed steps via nsIMutationObserver (https://searchfox.org/mozilla-central/rev/ece43b04e7baa4680dac46a06d5ad42b27b124f4/dom/base/nsIMutationObserver.h#110).

@domfarolino
Copy link
Member

The problem with this approach is that a script elements contents can be changed from JS before it finishes parsing.

So one scenario in which a parser-inserted script should be considered untrusted is when it edits itself mid-parse, which I think is what the document.write test case highlights that you pointed to. I think another scenario would be if a parser-inserted script that cannot immediately execute (maybe it has an invalid type attribute) has its children mutated via an API before its type becomes valid again. This is also a case that our special boolean flag (should we ever add one) would have to capture, right?

(I guess I'm just trying to clarify whether the "mid-parse" part of the title of this issue is possibly too narrow, and is not the only case we need to consider)

@lukewarlow
Copy link
Member Author

I don't think a script can edit itself midparse?

#533 - here's my latest attempt at a model for this. Uses children changed steps always, and has some internal boolean flags. The main question I guess is does children changed ever get fired by the parser? If it doesn't then that PR should be good to go. I was under the impression it did, but I'm not so sure now.

I think another scenario would be if a parser-inserted script that cannot immediately execute (maybe it has an invalid type attribute) has its children mutated via an API before its type becomes valid again. This is also a case that our special boolean flag (should we ever add one) would have to capture, right?

THis should be covered by the existing system, and the new proposed one linked above.

Currently, it'll finish parsing, set the internal slot to the parsed value. It'll then have its child content changed but the slot not updated to match. By the time it gets to being executed the slot wont match reality and it'll be untrusted.

In the new proposed system, it'll finish parsing and be considered trusted. The children changed steps will be fired by the API change and that will set is trusted to false. By the time it gets to being executed the is trusted flag will be false.

@mbrodesser-Igalia
Copy link
Collaborator

This is case tested by the first subtest of https://wpt.live/trusted-types/HTMLScriptElement-internal-slot.html

The test-case seems wrong. Compare https://jsfiddle.net/pdqjyob5/2/ and https://jsfiddle.net/pdqjyob5/3/. Observe the <\/script>.

@lukewarlow
Copy link
Member Author

lukewarlow commented Jul 25, 2024

image

So the first write intentionally leaves it open. As it then does appendChild as the "mid-parse manipulation". And then continues to document.write the close tag. document.write is covered by trusted types as many other parsing APIs. So I don't think this script protection model needs to worry about the contents of it much. It's more just being used as a mechanism for triggering the weird behaviour.

@mbrodesser-Igalia
Copy link
Collaborator

mbrodesser-Igalia commented Jul 25, 2024

image

So the first write intentionally leaves it open. As it then does appendChild as the "mid-parse manipulation". And then continues to document.write the close tag.

Right.

document.write is covered by trusted types as many other parsing APIs. So I don't think this script protection model needs to worry about the contents of it much. It's more just being used as a mechanism for triggering the weird behaviour.

Yes, seems so.

Above test could be simplified, though:
https://searchfox.org/mozilla-central/rev/d353cfa1fbd207e13dc974f30e5f88535a4303ae/testing/web-platform/tests/trusted-types/HTMLScriptElement-internal-slot.html#20 is never executed.
https://searchfox.org/mozilla-central/rev/d353cfa1fbd207e13dc974f30e5f88535a4303ae/testing/web-platform/tests/trusted-types/HTMLScriptElement-internal-slot.html#18 inserts the second script successfully, because it isn't manipulated.

Edit: never mind, discussing this elsewhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants