Request for mechanism to determine if children changed steps are called as a result of script or parser #1288

lukewarlow · 2024-05-16T12:55:35Z

What problem are you trying to solve?

In TrustedTypes we have a need to distinguish between a script's children being changed by an API (e.g. script.appendChild) or changed by the parser.

This is specifically needed as we treat all parsed scripts as trusted by default. However, it's possible for a script to be manipulated before it's finished parsing, and in those cases it shouldn't be trusted.

What solutions exist today?

Both Chromium and WebKit implement a childrenChanged function on their ContainerNode type which I believe translates to the children changed steps concept defined in the DOM algorithm. However, unlike the spec Chromium and WebKit both provide an argument to the function with details, such as the type (what changed and how it changed), and importantly the source of the change (parser vs API).

We can (and do) use this to invalidate scripts in the above described situation.

How would you solve it?

It would be good if the DOM spec matched the reality of at least 2 implementations (unsure about Gecko), and provided details about the change, at the very least for this use case the source of the change.

Anything else?

See w3c/trusted-types#499 (comment) for a long discussion that lead to this issue being raised.

I'm open to any alternative spec mechanisms we can use to spec this behaviour, so if there's a way today or a simpler change we can use let me know.

annevk · 2024-05-16T14:02:02Z

cc @hsivonen @smaug---- @rniwa @domfarolino @noamr

lukewarlow · 2024-05-16T14:22:05Z

cc @otherdaniel as he did the Chromium implementation

domfarolino · 2024-06-14T14:26:18Z

I am interested in two things here.

The integrity of the "is changed by API" bit that Chromium maintains. Are we positive that it is accurate in the ways we expect it to be here? I did just a little bit of digging and I found that unconditionally in the ContainerNode::RemoveChildren() path, the ChildrendChangedSource is always kAPI. But this is a very broad method that covers a ton of cases, and I'm wondering if it sometimes covers parser-only cases that get treated as kAPI cases incorrectly. For example, I found that when an HTMLInputElement gets its type updated, possibly as a result of a parser(I think), we go through this flow: HTMLInputElement::ParseAttribute --> HTMLInputElement::UpdateType --> HTMLInputElement::SetInnerEditorValue() --> ReplaceChildrenWithText --> ContainerNode::RemoveChildren. This kind of seems like innocent parser changes can trigger the children changed flow with the kAPI flag set incorrectly. I'll be honest I haven't spent too much time debugging this case to know whether my speculation is right or not, but specifically I'm interested in understanding if all browsers (that implement this extra non-specified bit) match in all scenarios.
The point made by @annevk in Update HTML Parser steps for script element to set "script text" w3c/trusted-types#499 (comment) about whether we derive this bit of information based on (a) "invalidation" (I am actually not sure what this means. Is there a common dfn somewhere), or (b) comparing pre-insert and post-insert children contents to detect mid-parsing API intervention ¹.

One point I am not clear on is: is it possible for a node's children to be changed during parser as a result of a non-API action? Like, can the parser itself be responsible for under any scenario? ↩

annevk · 2024-06-17T08:08:08Z

By "invalidation" I meant having the script element maintain a boolean that gets set to false (invalidated) when non-parser manipulation has taken place. This seems similar to the "is changed by API" you describe in 1 above.

Such a setup seems vastly preferable over maintaining a complete separate copy of the contents of the script element, to the extent that is even viable to begin with. For instance, is there a clear difference between pre-insert and post-insert? There's a clear point when we hit </script>, but until that code points can be streaming in from various places.

lukewarlow · 2024-06-18T09:07:38Z

I'll take some more time to properly read through and respond but regarding how correct the "is changed by API" bit is.

Chromium used to (and right now WebKit still does) have at least 1 situation where a parser change is marked as API and so the script is untrusted. The XML parser would append nodes as if they were an API change. This is because a legacy deprecated SVG element (tref) is built on mutation events, and so it went down the path that fired mutation events. Chromium simply removed that element and fixed the parser, WebKit is slightly more complicated because it might be used in non-web places, BUT that's looking like the end result too.

This is the only (relevant) place that's wrong from my testing but it's possible there could be other mismatches.

I'm wondering if it sometimes covers parser-only cases that get treated as kAPI cases incorrectly.

The main thing is we need all API changes to show as such, if some parser steps pretends it's an API that's not the end of the world, especially because in it's current form it's just used during script element parse time.

One point I am not clear on is: is it possible for a node's children to be changed during parser as a result of a non-API action? Like, can the parser itself be responsible for under any scenario?

Yes, my understanding is that the parser itself will fire children changed steps during the course of parsing.

domfarolino · 2024-06-18T16:04:16Z

This is the only (relevant) place that's wrong from my testing but it's possible there could be other mismatches.

That is good to know. Web platform tests confirming this across the two implementations would be lovely to see though.

Yes, my understanding is that the parser itself will fire children changed steps during the course of parsing.

Right but I guess the question is: do these "children changed" steps in any specs actually modify the children of the node they are being called on? I don't think <script> processing uses the children changed steps today. After whatwg/html#10188 though, the "children changed steps" will be used for scripts (see this example), which will run scripts, but any further children modification done as a result of that I guess will be marked as an "API-initiated change" Anyways, I realize my initial question here was very dumb, since yes of course the parser can add children to nodes. That's like its only job when parsing a page.

lukewarlow · 2024-07-09T11:29:11Z

Following up on this I've opened a draft PR on Trusted Types which tries to advaced the script protection. I've moved to a new model after discussions with Anne and others which be simpler to understand.

The idea is that a script element has two flags, one to say it's trusted which is initially true (so parsed scripts are trusted), and one to say it's been modified by a trusted sink.

The model relies on the children changed steps and a new flag "parser change" for those steps which can be used to differentiate between API or Parser. See w3c/trusted-types#533 for the specifics.

So the big missing piece of the puzzle is how to get that new "parser change" flag specced. It seems that the children changed steps aren't neccessarily called directly from the parser so it might require piping the flag through multiple layers of indirection?

I also discussed with @smaug---- about how feasible this information is to get in Gecko and while the implementation is different from Webkit and Chromium, it's sounds like it should be easy enough to get the neccessary information.

lukewarlow added needs implementer interest Moving the issue forward requires implementers to express interest addition/proposal New features or enhancements labels May 16, 2024

lukewarlow mentioned this issue May 16, 2024

Update HTML Parser steps for script element to set "script text" w3c/trusted-types#499

Merged

annevk added topic: nodes needs concrete proposal Moving the issue forward requires someone to figure out a detailed plan and removed needs implementer interest Moving the issue forward requires implementers to express interest labels May 16, 2024

lukewarlow mentioned this issue Jun 12, 2024

Script element mid-parse protection mechanism w3c/trusted-types#525

Open

lukewarlow mentioned this issue Jul 4, 2024

Change Script Enforcement Mechanism to use flags w3c/trusted-types#533

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Request for mechanism to determine if children changed steps are called as a result of script or parser #1288

Request for mechanism to determine if children changed steps are called as a result of script or parser #1288

lukewarlow commented May 16, 2024

annevk commented May 16, 2024 •

edited

Loading

lukewarlow commented May 16, 2024

domfarolino commented Jun 14, 2024

annevk commented Jun 17, 2024

lukewarlow commented Jun 18, 2024

domfarolino commented Jun 18, 2024

lukewarlow commented Jul 9, 2024 •

edited

Loading

Request for mechanism to determine if children changed steps are called as a result of script or parser #1288

Request for mechanism to determine if children changed steps are called as a result of script or parser #1288

Comments

lukewarlow commented May 16, 2024

What problem are you trying to solve?

What solutions exist today?

How would you solve it?

Anything else?

annevk commented May 16, 2024 • edited Loading

lukewarlow commented May 16, 2024

domfarolino commented Jun 14, 2024

Footnotes

annevk commented Jun 17, 2024

lukewarlow commented Jun 18, 2024

domfarolino commented Jun 18, 2024

lukewarlow commented Jul 9, 2024 • edited Loading

annevk commented May 16, 2024 •

edited

Loading

lukewarlow commented Jul 9, 2024 •

edited

Loading