-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid some unnecessary redirects #1953
base: gh-pages
Are you sure you want to change the base?
Conversation
The official `git-scm.com` website is currently built by Hugo with `--baseURL http://git-scm.com/`. This commit allows changing the URL scheme to `https`, in order to avoid unnecessary redirects between HTTPS and unencrypted HTTP. Such redirects occur when one of the following URLs is requested: * URLs in `layouts/alias.html`, e.g. in <https://git-scm.com/docs/> * Image URLs in <https://git-scm.com/application.min.css> * Endpoint URLs in <https://git-scm.com/sitemap.xml> Since these URLs have the `http` scheme, the client is supposed to send the new request unencrypted, only to be told by the server to use HTTPS again. Modern web browsers tend to stick with HTTPS in certain cases, so the downgrade may not always happen, but it's better to eliminate the redirects nevertheless, for security and performance reasons. Instead of trying to use relative URLs everywhere, this commit takes a simpler approach by using the `https` scheme in the base URL. One way to do this is to enable the "Enforce HTTPS" option in the GitHub Pages settings, but it's infeasible because the `git-scm.com` domain points to Cloudflare instead of GitHub. Therefore, this commit introduces a GitHub Actions variable, `EXTERNAL_HTTPS`, which can be set to true if HTTPS is provided by a third party, so that the URL scheme can be safely overridden. This also generalizes the special case of `git-scm.com` for any domain with a similar setup, allowing tests to be run more reliably in a uniform way. See-also: c22a1a5 (deploy(playwright): work around externally-enforced HTTPS, 2024-10-07) See-also: d4f88c1 (deploy: be more careful when auto-upgrading from HTTP -> HTTPS, 2024-10-07)
Since the download pages now live in `/downloads`, update the existing links to save an extra redirect.
I fear that this part is somewhat orthogonal to the purpose of this PR, and is caused by a misconfiguration of the site. When I direct my browser to https://github.com/git/git-scm.com/settings/pages, I see: The most important bit to which I would like to draw your attention, in particular, is this part:
That troubleshooting link has a couple of hints how to diagnose and fix this, but I have no access to the DNS records (neither do I want to, heck, I thought I was done over here in git-scm.com). Unless I am mistaken, this means that people who do control the domain git-scm.com, i.e. @peff and @ttaylorr, have to have a look at (and ideally fix) this part so that HTTPS is enforced on the GitHub side and the base URL is configured correctly to use https:// instead of http:// (currently, you can see that https://git-scm.com/ incorrectly links to http://git-scm.com/images/bg/body.jpg, for example, which is a consequence of this misconfiguration). |
Hmm. I initially thought that this GitHub hadn't rechecked GitHub did re-check those records, but the end result was the same. Looking through the rest of the troubleshooting documentation, it looks like this line is important:
While GitHub makes an exception for the the apex "www" (i.e. that having So unfortunately I don't think that it is currently possible to enforce HTTPS based on my understanding of those troubleshooting docs. 😞 |
Would GitHub's "enforce https" flag do anything anyway? Users visiting git-scm.com are terminating at Cloudflare, which is then talking to GitHub (well, GitHub's CDN) on the backend and proxying/caching. And Cloudflare does issue a redirect from http to https. So from the user's perspective, everything is always going to be https. And any links generated on the site should prefer https. I don't know anything about the Hugo setup, but naively I'd think that means the baseurl should be https (most links will just be relative, of course, but it looks like all of the scss stuff uses |
I also wonder if using Cloudflare is doing much these days. GitHub Pages are served off a CDN, too. So we are just stacking caches in front of caches (whereas when we originally started using Cloudflare, it was sitting in front of a Heroku dyno that was getting sorely hammered). |
I wondered the same when @dscho and I were redirecting the Cloudflare configuration to point at the new GitHub Pages site. TBH I think that we could probably drop it for similar reasons as the ones you point out, but it seemed like an extra change amid the already-large rewrite, so I punted on it then. It might be worth reevaluating, I dunno. |
Yes. It is responsible for As a consequence, as I mentioned before, resources like |
Thank you all for looking into this. While removing the Cloudflare caches is indeed a fix for the official site, I do want to mention that GitHub’s CDN (Fastly) can be less performant than Cloudflare in certain regions (which is also my personal experience), so there is a potential risk of inadvertently slowing down the site for some of the visitors.
I’m aware of this option, and this PR is an intentional workaround for it, in the belief that not everyone is able, or willing, to fix the “misconfiguration” and let GitHub handle their domain. Being able to configure |
Changes
EXTERNAL_HTTPS
will need to be (manually) set totrue
on this repository, either in the environment scope or in the repository scope. Forks of this repository may do the same based on instructions inARCHITECTURE.md
.--baseURL
option of Hugo for the officialgit-scm.com
site will be changed fromhttp://git-scm.com/
tohttps://git-scm.com/
(with only a different scheme).git-scm.com
.git-scm.com
will be tested with HTTPS, while other domains may be tested with HTTP if they don’t support HTTPS. This differs from the current strategy of using HTTP forgit-scm.com
and HTTPS for others.Context
Currently, clicking the most prominent download button on
https://git-scm.com/
causes three navigations:https://git-scm.com/download/win
, the hyperlink destination (assuming a Windows system), which is an alias to…http://git-scm.com/downloads/win
, with anhttp
scheme, resulting in a301 Moved Permanently
to…https://git-scm.com/downloads/win
, which is the canonical URL.This PR avoids both types of redundant redirects above (HTTPS to HTTP and
download
todownloads
) so the button (and some other links mentioned in the commit message) opens the canonical URL directly. Tests are also updated to accomondate for this change.In addition, this PR might serve as a better fix for #1898, which was supposed to be solved by #1899, by always running Playwright tests with HTTPS. However, that fix requires serveral conditions to work:
git-scm.com
was resolved to the GitHub Pages server in/etc/hosts
, other domains might be tested against a third‐party gateway, which needs to be able to speak HTTPS.https://git-scm.example/book
to redirect tohttps://git-scm.example/book/en/v2
, but this redirection depends on the base URL, so at least one of the following must be true:a. The base URL already has an
https
scheme.b. The tested server enforces HTTPS, so the final URL always says
https
even if the base URL doesn’t.The official domain (
git-scm.com
) didn’t meet condition 2, thus d4f88c1. The fork domain in #1898 (ttaylorr.com
), however, met all of 1, 2 and 3b, so it was fixed, but the issue remains for other domains.Conveniently, this PR lifts all the requirements above (respectively) by:
git-scm.com
domain.EXTERNAL_HTTPS=true
).In fact, the last change alone is sufficient to fix #1898, while the first two are kept for the original purpose of bypassing caches. This also removes the need to special‐case
git-scm.com
in the workflow, thereby simplifying the deployment logic.