Commit 8639616

Native git support: lsRefs(), sparseCheckout(), GitPathControl (#1764)
## Motivation

Related to #1787

Adds a set of TypeScript functions that support the native git protocol and can power a sparse checkout feature. This is the basis for a faster, more user-friendly git integration. No more guessing repository paths. Just provide the repo URL, browse the files, and tell Playground which directories are plugins, themes, etc.

Technically, this PR performs [git sparse checkout using just JavaScript](https://adamadam.blog/2024/06/21/cloning-a-git-repository-from-a-web-browser-using-fetch/page/1) and a generic CORS proxy.

**This PR doesn't provide any user-facing feature yet.** However, it paves the way to features like:

* Checkout any git repo, even non-GitHub ones, without going through the OAuth flow
* Retrieve a subset of the files directly from the repo and without going through zipballs.
* Provide a visual git repo browser (instead of asking the user to manually type the path)
* Introduce a new Blueprint resource type: git repo
* Fetch the names of all the repository branches (or just the branches with the specified prefix)
* (future) commit and push to any git repo, even non-GitHub ones

## Notable points of this PR

* Exposes the `sparseCheckout()`, `lsRefs()`, and `listFiles()` functions from the `@wp-playground/storage` package. I'm not yet sure whether we need a dedicated `@wp-playground/git` package or not.
* Ships basic unit test coverage for those functions.
* Silences a few warnings in the CORS proxy. CC @brandonpayton we may not want to do that in the production release.
* Adds `isomorphic-git` as a git submodule in the `/isomorphic-git` path. We can't rely on the published npm package because it doesn't export the internal APIs we need to use here.
* Adds a bunch of WIP components in `@wp-playground/components`. They're not used anywhere on the website yet and I'd rather keep them moving with the project than isolate them in a PR until they're perfect. We'll need some accessibility and mobile testing before using them in the webapp, though.

## How does it even work?

Let me quote [my own article](https://adamadam.blog/2024/06/21/cloning-a-git-repository-from-a-web-browser-using-fetch/):

### Running a Git Client in the browser

The good news was [isomorphic-git](https://github.com/isomorphic-git/isomorphic-git), [wasm-git](https://github.com/petersalomonsen/wasm-git), and a few other projects were already running Git in the browser.

The bad news was none of them supported fetching a subset of files via [sparse checkout](https://git-scm.com/docs/git-sparse-checkout). You’d still have to download 20MB of data even if you only wanted 100KB.

However, everything the desktop Git client does, including sparse checkouts, can be done via [HTTP](https://git-scm.com/docs/http-protocol/2.5.6) by requesting URLs like [https://github.com/WordPress/wordpress-playground.git](https://github.com/isomorphic-git/isomorphic-git.git).

Git [documentation](https://git-scm.com/) was… less than helpful, but eventually it worked! A few hours later I was running Git commands by sending GET and POST requests to the repository URLs.

### Fetching a hash of the branch

The first command I needed was ls-refs to get the SHA1 hash of the right git branch.
Here’s how you can get it with fetch() for the HEAD branch of the WordPress/gutenberg repo:

```ts
const response = await fetch(
  'https://github.com/WordPress/gutenberg.git/git-upload-pack',
  {
    method: 'POST',
    headers: {
      'Accept': 'application/x-git-upload-pack-advertisement',
      'content-type': 'application/x-git-upload-pack-request',
      'Git-Protocol': 'version=2'
    },
    body: [
      `0014command=ls-refs\n`,
      // ^^^^ line length in hex
      `0015agent=git/2.37.3\n`,
      `0017object-format=sha1\n`,
      '0001',
      // ^^^^ command separator
      // Filter the results to only contain the HEAD branch,
      // otherwise it will return all the branches and
      // tags which may require downloading many
      // megabytes of data:
      `0009peel\n`,
      `0014ref-prefix HEAD\n`,
      '0000',
      // ^^^^ end of request
    ].join(""),
  }
);
```

I won’t go into details of the Git protocol – the point is with a few special headers and lines you can be a Git client. If you paste that fetch() in your devtools while on GitHub.com, it would return a response similar to this:

```
0032950f5c8239b6e78e9051ec5e845bac5aa863c4cb HEAD
0000
```

Good! That’s our commit hash.

### Fetching a list of objects at a specific commit

With this, we can fetch [the list of objects](https://git-scm.com/book/en/v2/Git-Internals-Git-Objects) in that branch:

```ts
const response = await fetch("https://github.com/wordpress/gutenberg/git-upload-pack", {
  "headers": {
    "accept": "application/x-git-upload-pack-advertisement",
    "content-type": "application/x-git-upload-pack-request",
  },
  "referrer": "http://localhost:8000/",
  "referrerPolicy": "strict-origin-when-cross-origin",
  "body": [
    `0088want 950f5c8239b6e78e9051ec5e845bac5aa863c4cb multi_ack_detailed no-done side-band-64k thin-pack ofs-delta agent=git/2.37.3 filter \n`,
    `0015filter blob:none\n`,
    // ^ sparse checkout secret sauce.
    //   only fetches a list of objects without
    //   their content
    `0035shallow 950f5c8239b6e78e9051ec5e845bac5aa863c4cb\n`,
    `000ddeepen 1\n`,
    `0000`,
    `0009done\n`,
    `0009done\n`,
  ].join(""),
  "method": "POST"
});
```

And here’s the response:

```
00000008NAK
0026�Enumerating objects: 2189, done.
0025�Counting objects: 0% (1/2189)
...
0032�Compressing objects: 100% (1568/1568), done.
2004�PACK��(binary data)
0040 Total 2189 (delta 1), reused 1550 (delta 0), pack-reused 0
0006��0000
```

The binary data after PACK is a compressed list of all objects the repository had at commit `950f5c8239b6e78e9051ec5e845bac5aa863c4cb`. It is not a list of files that were committed in `950f5c`. It’s all files.

The [pack format](https://git-scm.com/docs/pack-format) is a binary blob. It’s similar to [ZIP](https://en.wikipedia.org/wiki/ZIP_(file_format)) in that it consists of a series of objects, each encoded as a binary header followed by binary data. Here’s an approximate visual to help grok the idea:

```
PACK format – inaccurate explanation,

Pack consists of the string "PACK" and binary data
structured roughly as follows:
 ___________________________________
|                                   |
| ASCII string "PACK"               |
| Binary data starts                |
| Pack Header                       |
|___________________________________|
|                                   |
| Offset 0x0010                     |
| Object 1 Header                   | (Object type, hash,
|                                   |  data length, etc.)
|  ________________                 |
| |                |                |
| | Object 1 Data  |                | (Gzipped data)
| |________________|                |
|                                   |
| Offset 0x0050                     |
| Object 2 Header                   |
|                                   |
|  ________________                 |
| |                |                |
| | Object 2 Data  |                | (Gzipped data)
| |________________|                |
|___________________________________|
|                                   |
| Pack Footer                       |
| Binary data ends                  |
|___________________________________|
```

The decoding is tedious so I used [the decoder](https://github.com/isomorphic-git/isomorphic-git/blob/main/src/models/GitPackIndex.js) provided by the isomorphic-git package:

```ts
const iterator = streamToIterator(await response.body);
const parsed = await parseUploadPackResponse(iterator);
const packfile = Buffer.from(await collect(parsed.packfile));
const index = await GitPackIndex.fromPack({ pack: packfile });
```

The parsed index object provides information about all the objects encoded in the received packfile. Let’s peek inside:

```
{
  // ...
  "hashes": [
    "5f4f0a5367476fdb7c98ffa5fa35300ec4c3f48b",
    "950f5c8239b6e78e9051ec5e845bac5aa863c4cb",
    // ...
  ],
  "offsets": {
    "5f4f0a5367476fdb7c98ffa5fa35300ec4c3f48b": 12,
    "950f5c8239b6e78e9051ec5e845bac5aa863c4cb": 181,
    // ...
  },
  "offsetCache": {
    "12": {
      "type": "tree",
      "object": "100644 async-http-download.php\u0000��p4��\u0014�g\u0015i��\u0004��\\���100644 async-http.php\u0000�\n�8K�RT������F\u001b8�� (more binary data)"
    },
    // ...
  },
  "readDepth": 4,
  "externalReadDepth": 0
}
```

Each object has a type and some data. The decoder stored some objects in the offsetCache, and kept track of others in the form of a hash => offset in packfile mapping.
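As a side note on the pack layout sketched above: the fixed-size header can be read by hand. The snippet below is a minimal sketch, not code from this PR or from isomorphic-git. It assumes the layout described in the Git pack-format documentation (the 4-byte `PACK` signature, a 4-byte big-endian version, then a 4-byte big-endian object count), and `parsePackHeader` is a hypothetical helper name. That 12-byte header would also explain why the first object in the index dump sits at offset 12.

```typescript
interface PackHeader {
  version: number;
  objectCount: number;
}

// Reads a 32-bit big-endian (network byte order) unsigned integer.
function readUint32BE(bytes: Uint8Array, offset: number): number {
  return (
    ((bytes[offset] << 24) |
      (bytes[offset + 1] << 16) |
      (bytes[offset + 2] << 8) |
      bytes[offset + 3]) >>> 0
  );
}

// Hypothetical helper: parses only the 12-byte header, not the objects.
function parsePackHeader(pack: Uint8Array): PackHeader {
  if (pack.length < 12) {
    throw new Error('Packfile is shorter than the 12-byte header');
  }
  const signature = String.fromCharCode(pack[0], pack[1], pack[2], pack[3]);
  if (signature !== 'PACK') {
    throw new Error(`Unexpected pack signature: ${signature}`);
  }
  return {
    version: readUint32BE(pack, 4),
    objectCount: readUint32BE(pack, 8),
  };
}

// Hand-built header for illustration: "PACK", version 2, 2189 objects.
const sampleHeader = new Uint8Array([
  0x50, 0x41, 0x43, 0x4b, // "PACK"
  0x00, 0x00, 0x00, 0x02, // version
  0x00, 0x00, 0x08, 0x8d, // object count (0x088d = 2189)
]);
console.log(parsePackHeader(sampleHeader));
```

In practice the isomorphic-git decoder shown above handles this, along with the much harder job of inflating and resolving the objects that follow the header.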
Let’s read the details of the commit from our parsed index:

```ts
> const commit = await index.read({ oid: '950f5c8239b6e78e9051ec5e845bac5aa863c4cb' });
{
  "type": "commit",
  "object": "tree c7b8440c83b8c987895f9a1949650eb60bccd2ec\nparent b6132f2d381865353e09edf88aa64a0dd042811a\nauthor Adam Zieliński <[email protected]> 1717689108 +0200\ncommitter Adam Zieliński <[email protected]> 1717689108 +0200\n\nUpdate rebuild workflow\n"
}
```

It’s the object type, the hash, and the uncompressed object bytes which, in this case, provide us commit details in a specific microformat. From here, we can get the tree hash and look for its details in the same index we’ve already downloaded:

```ts
> const tree = await index.read({ oid: "c7b8440c83b8c987895f9a1949650eb60bccd2ec" })
{
  "type": "tree",
  "object": "40000 .github\u0000_O\nSgGo�|����50\u000e���40000 (... binary data ...)"
}
```

The contents of the tree object are a list of files in the repository. Just like with the commit, tree details are encoded in their own microformat. Luckily, isomorphic-git ships relevant decoders:

```ts
> GitTree.from(tree.object).entries()
[
  {
    "mode": "040000",
    "path": ".github",
    "oid": "ece277ec006eb517d5c5399d7a5c00b7e61018f1",
    "type": "tree"
  },
  {
    "mode": "100644",
    "path": "readme.txt",
    "oid": "3fe6e3aaf1dc4df204be575041383fc8e2e1e070",
    "type": "blob"
  },
  {
    "mode": "040000",
    "path": "src",
    "oid": "dbc84f20ee64fbd924617b41ee0e66128c9a8d97",
    "type": "tree"
  },
  // ...
]
```

Yay! That’s the list of files and directories in the repository root with their hashes! From here we can recursively retrieve the ones relevant for our sparse checkout.

### Fetching full files from specific paths

We’re finally ready to checkout a few particular paths.
Let’s ask for a blob at readme.txt and a tree at docs/tools:

```ts
const response = await fetch("https://github.com/wordpress/gutenberg/git-upload-pack", {
  "headers": {
    "accept": "application/x-git-upload-pack-advertisement",
    "content-type": "application/x-git-upload-pack-request",
  },
  "body": [
    `0080want 28facb763312f40c9ab3251fb91edb87c8476cf9 multi_ack_detailed no-done side-band-64k thin-pack ofs-delta agent=git/2.37.3\n`,
    `0080want 3fe6e3aaf1dc4df204be575041383fc8e2e1e070 multi_ack_detailed no-done side-band-64k thin-pack ofs-delta agent=git/2.37.3\n`,
    `0000`,
    `0009done\n`
  ].join(""),
  "method": "POST"
});
```

The response is another index, but this time each blob comes with binary contents. Some decoding and recursive processing later, we finally get this:

```ts
{
  "readme.txt": "=== Gutenberg ===\nContri (...)",
  "docs/tools": {
    "index.js": "/**\n * External depe (...)",
    "manifest.js": "/* eslint no-console (...)"
  }
}
```

Yay! It took some effort, but it was worth it!

### CORS proxy and other notes

You’ll still need to run a CORS proxy. The fetch() examples above will work if you try them in devtools on github.com, but you won’t be able to just use them on your site. Git API typically does not expose the Access-Control-* headers required by the browser to run these requests.

So we need a server after all. Was this a failure, then? No! A CORS proxy is cheaper, simpler, and safer to maintain than a Git service. Also, it can fetch all the files in 3 fetch() requests instead of two requests per file like the GitHub REST API requires.
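The request bodies in the fetch() calls above are built from pkt-lines: each line starts with four hex digits giving its total length, and that length includes the four digits themselves. Rather than counting by hand, the prefix can be computed. This is a minimal sketch and not the code shipped in this PR; `pktLine`, `FLUSH_PKT`, and `DELIM_PKT` are hypothetical names chosen for illustration.

```typescript
// Special packets that carry no payload.
const FLUSH_PKT = '0000'; // end of request
const DELIM_PKT = '0001'; // protocol v2 command separator

// Hypothetical helper: prefixes a payload with its pkt-line length.
function pktLine(payload: string): string {
  // The length counts bytes, not UTF-16 code units, plus the 4-digit prefix.
  const byteLength = new TextEncoder().encode(payload).length + 4;
  if (byteLength > 65520) {
    throw new Error('pkt-line exceeds the maximum length of 65520 bytes');
  }
  return byteLength.toString(16).padStart(4, '0') + payload;
}

// Rebuilds the ls-refs request body from the first example.
const lsRefsBody = [
  pktLine('command=ls-refs\n'),
  pktLine('agent=git/2.37.3\n'),
  pktLine('object-format=sha1\n'),
  DELIM_PKT,
  pktLine('peel\n'),
  pktLine('ref-prefix HEAD\n'),
  FLUSH_PKT,
].join('');

console.log(lsRefsBody);
```

Computing the prefix this way avoids off-by-one mistakes when a capability or a trailing space is added to a `want` line.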
#### Try it yourself

I’ve shared a functional demo that includes a CORS proxy in this repository on GitHub:

https://github.com/adamziel/git-sparse-checkout-in-js

## Testing instructions

* Start two terminals
* Run `nx dev playground-components` in the first one
* Run `nx start playground-php-cors-proxy` in the second one to start the PHP CORS proxy
* Go to http://localhost:5173/ and play with the UI
* Play with an early demo of git repository browser shipped in this PR: https://github.com/user-attachments/assets/731b2a89-8004-4d0b-8c6f-8646d4840a29
1 parent 1035a25 commit 8639616

48 files changed, +1733 -453 lines changed

.github/actions/prepare-playground/action.yml

+1 -1
@@ -5,7 +5,7 @@ runs:
   steps:
     - name: Fetch trunk
       shell: bash
-      run: git fetch origin trunk --depth=1
+      run: git fetch origin trunk --depth=1 --recurse-submodules
     - uses: actions/setup-node@v4
       with:
         node-version: 18

.github/workflows/build-website.yml

+3 -1
@@ -29,7 +29,9 @@ jobs:
     environment:
       name: playground-wordpress-net-wp-cloud
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
+        with:
+          submodules: true
       - uses: ./.github/actions/prepare-playground
       - run: npm run build
       - run: tar -czf wasm-wordpress-net.tar.gz dist/packages/playground/wasm-wordpress-net

.github/workflows/ci.yml

+14
@@ -18,6 +18,8 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
+        with:
+          submodules: true
       - uses: ./.github/actions/prepare-playground
       - run: npx nx affected --target=lint
       - run: npx nx affected --target=typecheck
@@ -26,6 +28,8 @@ jobs:
     needs: [lint-and-typecheck]
     steps:
       - uses: actions/checkout@v4
+        with:
+          submodules: true
       - uses: ./.github/actions/prepare-playground
       - run: node --expose-gc node_modules/nx/bin/nx affected --target=test --configuration=ci
   test-e2e:
@@ -34,6 +38,8 @@ jobs:
     # Run as root to allow node to bind to port 80
     steps:
       - uses: actions/checkout@v4
+        with:
+          submodules: true
       - uses: ./.github/actions/prepare-playground
       - run: sudo ./node_modules/.bin/cypress install --force
       - run: sudo CYPRESS_CI=1 npx nx e2e playground-website --configuration=ci --verbose
@@ -49,6 +55,8 @@ jobs:
     needs: [lint-and-typecheck]
     steps:
       - uses: actions/checkout@v4
+        with:
+          submodules: true
       - uses: ./.github/actions/prepare-playground
       - name: Install Playwright Browsers
         run: sudo npx playwright install --with-deps
@@ -70,6 +78,8 @@ jobs:
         part: ['chromium', 'firefox', 'webkit']
     steps:
       - uses: actions/checkout@v4
+        with:
+          submodules: true
       - uses: ./.github/actions/prepare-playground
       - name: Download dist
         uses: actions/download-artifact@v4
@@ -104,6 +114,8 @@ jobs:
     needs: [lint-and-typecheck]
     steps:
       - uses: actions/checkout@v4
+        with:
+          submodules: true
       - uses: ./.github/actions/prepare-playground
       - run: npx nx affected --target=build --parallel=3 --verbose
 
@@ -128,6 +140,8 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
+        with:
+          submodules: true
       - uses: ./.github/actions/prepare-playground
       - run: npm run build:docs
       - uses: actions/upload-pages-artifact@v1

.github/workflows/publish-npm-packages.yml

+1
@@ -40,6 +40,7 @@ jobs:
           ref: ${{ github.event.pull_request.head.ref }}
           clean: true
           persist-credentials: false
+          submodules: true
       - name: Config git user
         run: |
           git config --global user.name "deployment_bot"

.github/workflows/refresh-sqlite-integration.yml

+2 -1
@@ -23,11 +23,12 @@ jobs:
     concurrency:
       group: check-version-and-run-build
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
         with:
           ref: ${{ github.event.pull_request.head.ref }}
           clean: true
           persist-credentials: false
+          submodules: true
       - uses: ./.github/actions/prepare-playground
       - name: 'Refresh the SQLite bundle'
         shell: bash

.github/workflows/refresh-wordpress-major-and-beta.yml

+2 -1
@@ -28,11 +28,12 @@ jobs:
     concurrency:
       group: check-version-and-run-build
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
         with:
           ref: ${{ github.event.pull_request.head.ref }}
           clean: true
           persist-credentials: false
+          submodules: true
       - name: 'Install bun'
         run: |
           curl -fsSL https://bun.sh/install | bash

.github/workflows/refresh-wordpress-nightly.yml

+2 -1
@@ -21,11 +21,12 @@ jobs:
     environment:
       name: wordpress-assets
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
         with:
           ref: ${{ github.event.pull_request.head.ref }}
           clean: true
           persist-credentials: false
+          submodules: true
       - uses: ./.github/actions/prepare-playground
       - name: 'Install bun'
         run: |

.github/workflows/update-changelog.yml

+3 -2
@@ -31,14 +31,15 @@ jobs:
     env:
       GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
         with:
+          submodules: true
           ref: trunk
           clean: true
           persist-credentials: false
       - name: Fetch trunk
         shell: bash
-        run: git fetch origin trunk --depth=1
+        run: git fetch origin trunk --depth=1 --recurse-submodules
       - name: 'Install bun (for the changelog)'
         run: |
           curl -fsSL https://bun.sh/install | bash

.gitmodules

+3
@@ -0,0 +1,3 @@
+[submodule "isomorphic-git"]
+    path="isomorphic-git"
+    url=git@github.com:adamziel/isomorphic-git.git

README.md

+2 -2
@@ -84,15 +84,15 @@ The vanilla `git clone` command will take ages. Here's a faster alternative that
 only pull the latest revision of the trunk branch:
 
 ```
-git clone -b trunk --single-branch --depth 1 git@github.com:WordPress/wordpress-playground.git
+git clone -b trunk --single-branch --depth 1 --recurse-submodules git@github.com:WordPress/wordpress-playground.git
 ```
 
 ## Running WordPress Playground locally
 
 You also can run WordPress Playground locally as follows:
 
 ```bash
-git clone -b trunk --single-branch --depth 1 git@github.com:WordPress/wordpress-playground.git
+git clone -b trunk --single-branch --depth 1 --recurse-submodules git@github.com:WordPress/wordpress-playground.git
 cd wordpress-playground
 npm install
 npm run dev

isomorphic-git

Submodule isomorphic-git added at cdca7e5

package-lock.json

+80 (generated file; diff not rendered)

package.json

+6
@@ -61,21 +61,27 @@
     "@types/react-transition-group": "4.4.11",
     "@types/wicg-file-system-access": "2023.10.5",
     "ajv": "8.12.0",
+    "async-lock": "1.4.1",
     "axios": "1.6.1",
     "classnames": "^2.3.2",
     "comlink": "^4.4.1",
+    "crc-32": "1.2.2",
+    "diff3": "0.0.4",
     "express": "4.19.2",
     "file-saver": "^2.0.5",
     "fs-extra": "11.1.1",
     "ini": "4.1.2",
     "octokit": "3.1.1",
     "octokit-plugin-create-pull-request": "5.1.1",
+    "pako": "1.0.10",
     "react": "^18.2.25",
     "react-dom": "^18.2.25",
     "react-hook-form": "7.53.0",
     "react-modal": "^3.16.1",
     "react-redux": "8.1.3",
     "react-transition-group": "4.4.5",
+    "sha.js": "2.4.11",
+    "sha1": "1.1.1",
     "unzipper": "0.10.11",
     "vite-plugin-api": "1.0.4",
     "wouter": "3.3.5",

packages/docs/site/docs/developers/23-architecture/18-host-your-own-playground.md

+1 -1
@@ -58,7 +58,7 @@ The most flexible and customizable method is to build the site locally.
 Create a shallow clone of the Playground repository, or your own fork.
 
 ```sh
-git clone -b trunk --single-branch --depth 1 git@github.com:WordPress/wordpress-playground.git
+git clone -b trunk --single-branch --depth 1 --recurse-submodules git@github.com:WordPress/wordpress-playground.git
 ```
 
 Enter the `wordpress-playground` directory.

packages/docs/site/docs/main/contributing/code.md

+1 -1
@@ -26,7 +26,7 @@ Be sure to review the following resources before you begin:
 [Fork the Playground repository](https://github.com/WordPress/wordpress-playground/fork) and clone it to your local machine. To do that, copy and paste these commands into your terminal:
 
 ```bash
-git clone -b trunk --single-branch --depth 1
+git clone -b trunk --single-branch --depth 1 --recurse-submodules
 
 # replace `YOUR-GITHUB-USERNAME` with your GitHub username:
 git@github.com:YOUR-GITHUB-USERNAME/wordpress-playground.git

packages/playground/components/index.html

+1 -1
@@ -7,6 +7,6 @@
   </head>
   <body>
     <div id="root"></div>
-    <script type="module" src="./src/demos.tsx"></script>
+    <script type="module" src="./src/demos/index.tsx"></script>
   </body>
 </html>
