Script element mid-parse protection mechanism #525

lukewarlow · 2024-06-12T14:30:34Z

This issue is here to track the specific spec mechanism for protecting against a script that's edited mid-parse.

To avoid needing to look all over here's a summary. As of #499 all parsed scripts are trusted. The problem with this approach is that a script elements contents can be changed from JS before it finishes parsing. This is case tested by the first subtest of https://wpt.live/trusted-types/HTMLScriptElement-internal-slot.html and https://wpt.live/trusted-types/SVGScriptElement-internal-slot.html.

Chromium and WebKit's approach is to store a boolean flag on the HTML/SVGScriptElement which gets set to true, if the children of the script element are changed by an API. This flag is then checked before setting the script text data.

We cannot spec this directly due to the DOM specs children changed steps not providing the information as to whether something is changed by an API or parser. whatwg/dom#1288 - has been raised to discuss changes to the DOM spec which would be needed for this.

lukewarlow · 2024-06-12T14:31:36Z

cc @mbrodesser-Igalia it would be good if you or someone else from mozilla could provide input on whether Gecko has a similar concept to WebKit and Chromium here. If they do then it makes the case for standardising this, and specifically this protection mechanism better.

smaug---- · 2024-07-03T12:33:38Z

Gecko doesn't have such concept.

lukewarlow · 2024-07-03T12:56:38Z

What would gecko do in this situation then? Do you know how gecko would protect from such a situation?

mbrodesser-Igalia · 2024-07-03T14:14:06Z

Gecko doesn't have such concept.

To be explicit: it seems (https://searchfox.org/mozilla-central/search?q=children.*changed&path=&case=false&regexp=true) Gecko doesn't implement the children changed steps.

mbrodesser-Igalia · 2024-07-04T13:04:27Z

Gecko doesn't have such concept.

To be explicit: it seems (https://searchfox.org/mozilla-central/search?q=children.*changed&path=&case=false&regexp=true) Gecko doesn't implement the children changed steps.

For the record: Gecko implements the children changed steps via nsIMutationObserver (https://searchfox.org/mozilla-central/rev/ece43b04e7baa4680dac46a06d5ad42b27b124f4/dom/base/nsIMutationObserver.h#110).

domfarolino · 2024-07-23T22:22:43Z

The problem with this approach is that a script elements contents can be changed from JS before it finishes parsing.

So one scenario in which a parser-inserted script should be considered untrusted is when it edits itself mid-parse, which I think is what the document.write test case highlights that you pointed to. I think another scenario would be if a parser-inserted script that cannot immediately execute (maybe it has an invalid type attribute) has its children mutated via an API before its type becomes valid again. This is also a case that our special boolean flag (should we ever add one) would have to capture, right?

(I guess I'm just trying to clarify whether the "mid-parse" part of the title of this issue is possibly too narrow, and is not the only case we need to consider)

lukewarlow · 2024-07-24T13:03:01Z

I don't think a script can edit itself midparse?

#533 - here's my latest attempt at a model for this. Uses children changed steps always, and has some internal boolean flags. The main question I guess is does children changed ever get fired by the parser? If it doesn't then that PR should be good to go. I was under the impression it did, but I'm not so sure now.

I think another scenario would be if a parser-inserted script that cannot immediately execute (maybe it has an invalid type attribute) has its children mutated via an API before its type becomes valid again. This is also a case that our special boolean flag (should we ever add one) would have to capture, right?

THis should be covered by the existing system, and the new proposed one linked above.

Currently, it'll finish parsing, set the internal slot to the parsed value. It'll then have its child content changed but the slot not updated to match. By the time it gets to being executed the slot wont match reality and it'll be untrusted.

In the new proposed system, it'll finish parsing and be considered trusted. The children changed steps will be fired by the API change and that will set is trusted to false. By the time it gets to being executed the is trusted flag will be false.

mbrodesser-Igalia · 2024-07-25T08:45:18Z

This is case tested by the first subtest of https://wpt.live/trusted-types/HTMLScriptElement-internal-slot.html

The test-case seems wrong. Compare https://jsfiddle.net/pdqjyob5/2/ and https://jsfiddle.net/pdqjyob5/3/. Observe the <\/script>.

lukewarlow · 2024-07-25T09:10:05Z

So the first write intentionally leaves it open. As it then does appendChild as the "mid-parse manipulation". And then continues to document.write the close tag. document.write is covered by trusted types as many other parsing APIs. So I don't think this script protection model needs to worry about the contents of it much. It's more just being used as a mechanism for triggering the weird behaviour.

mbrodesser-Igalia · 2024-07-25T10:07:32Z

So the first write intentionally leaves it open. As it then does appendChild as the "mid-parse manipulation". And then continues to document.write the close tag.

Right.

document.write is covered by trusted types as many other parsing APIs. So I don't think this script protection model needs to worry about the contents of it much. It's more just being used as a mechanism for triggering the weird behaviour.

Yes, seems so.

Above test could be simplified, though:
https://searchfox.org/mozilla-central/rev/d353cfa1fbd207e13dc974f30e5f88535a4303ae/testing/web-platform/tests/trusted-types/HTMLScriptElement-internal-slot.html#20 is never executed.
https://searchfox.org/mozilla-central/rev/d353cfa1fbd207e13dc974f30e5f88535a4303ae/testing/web-platform/tests/trusted-types/HTMLScriptElement-internal-slot.html#18 inserts the second script successfully, because it isn't manipulated.

Edit: never mind, discussing this elsewhere.

lukewarlow added the spec label Jun 12, 2024

lukewarlow added this to the v1 milestone Jun 12, 2024

This was referenced Jun 12, 2024

Update HTML Parser steps for script element to set "script text" #499

Merged

Issue with script enforcement #437

Closed

Add script protection mechanisms to SVGScriptElement #524

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Script element mid-parse protection mechanism #525

Script element mid-parse protection mechanism #525

lukewarlow commented Jun 12, 2024

lukewarlow commented Jun 12, 2024

smaug---- commented Jul 3, 2024

lukewarlow commented Jul 3, 2024

mbrodesser-Igalia commented Jul 3, 2024

mbrodesser-Igalia commented Jul 4, 2024

domfarolino commented Jul 23, 2024

lukewarlow commented Jul 24, 2024

mbrodesser-Igalia commented Jul 25, 2024

lukewarlow commented Jul 25, 2024 •

edited

Loading

mbrodesser-Igalia commented Jul 25, 2024 •

edited

Loading

Script element mid-parse protection mechanism #525

Script element mid-parse protection mechanism #525

Comments

lukewarlow commented Jun 12, 2024

lukewarlow commented Jun 12, 2024

smaug---- commented Jul 3, 2024

lukewarlow commented Jul 3, 2024

mbrodesser-Igalia commented Jul 3, 2024

mbrodesser-Igalia commented Jul 4, 2024

domfarolino commented Jul 23, 2024

lukewarlow commented Jul 24, 2024

mbrodesser-Igalia commented Jul 25, 2024

lukewarlow commented Jul 25, 2024 • edited Loading

mbrodesser-Igalia commented Jul 25, 2024 • edited Loading

lukewarlow commented Jul 25, 2024 •

edited

Loading

mbrodesser-Igalia commented Jul 25, 2024 •

edited

Loading