Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add proper canonicalization of domain names #255

Closed
wants to merge 23 commits into from

Conversation

rubycon
Copy link
Contributor

@rubycon rubycon commented Jan 30, 2025

The current specs only handle host canonicalization for IP addresses and opaque hostnames but if an URL has a special scheme its host should be normalized as a domain name (IDNA processing).

Cf. #220

  • Tests are written and can be reviewed and commented upon at:
    • At least 2 tests already exist in WPT: Pattern: [{ "hostname": "xn--caf-dma.com" }], inputs= [{ "hostname": "café.com" }]; Pattern: ["http://🚲.com/"], Expected: { "protocol": "http", "hostname": "xn--h78h.com", "pathname": "/" }.
  • The top of this comment includes a clear commit message to use.

(See WHATWG Working Mode: Changes for more details.)


Preview | Diff

Yves-Marie and others added 22 commits March 15, 2024 14:39
…#233)

Per https://whatwg.org/working-mode#anchors this is being done to ensure
anchors to these concepts continue to work into the future.

Resolves whatwg#231.
These examples form the only quick reference material in the spec. While MDN goes into more depth, brief explanations of what is happening in this syntax don't take up much room and may make it easier to tell what's going on, especially with regexp parts.

The example explaining pattern strings is also expanded to be explicit that regular expressions are enclosed in parentheses.

Resolves whatwg#229.
Per discussion on whatwg#182 some text explaining this would be useful.

This mostly consists of advice since the useful algorithms are already exposed.
This PR fixes pathname processing for inputs that have opaque pathnames.
This PR fixes pathname processing for inputs that have opaque pathnames.

(Amended by editor to apply to the correct draft.)
Reverting the review draft change caused a commit that the deploy script
doesn't like. This one doesn't change that file so should succeed.
Use the basic URL parser when parsing URLs

Blob handling is not required, since the resulting URL is not stored in
any way that would make the blob handling visible. This is consistent
with what the URL constructor and similar uses do.

This change should not be observable.

Partially addresses whatwg#242.
Editorial: Correct "set" to "let"

These uses introduce the variable, so must be "set".
If the URL's host is an IP address, the string representation is
required.

This corrects/clarifies existing behavior.
The current specs only handle host canonicalization for IP addresses and opaque hostnames but if an URL has a special scheme its host should be normalized as a domain name (IDNA processing).

Cf. whatwg#220
@annevk
Copy link
Member

annevk commented Mar 24, 2025

I think this will be fixed by #252. The dummy URL will have a special scheme.

@rubycon
Copy link
Contributor Author

rubycon commented Mar 27, 2025

Fully resolved by #269

@rubycon rubycon closed this Mar 27, 2025
@annevk
Copy link
Member

annevk commented Mar 27, 2025

I think supporting opaque hosts might still be a good change.

@rubycon
Copy link
Contributor Author

rubycon commented Mar 28, 2025

@annevk Then I might resubmit another PR to treat all non-empty non-special scheme as having opaque hosts. Does that sound like a good default behavior ? Some additional tests for WPT might be needed too.

@annevk
Copy link
Member

annevk commented Mar 31, 2025

Yeah, I think that would be reasonable.

@sisidovski
Copy link
Collaborator

Agree, the current spec and tests don't consider opaque hosts, but ideally we should support it.

non-empty non-special scheme as having opaque hosts

It sounds a good direction!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants