Scaffold IPC-based API #711

Open
wants to merge 66 commits into
base: main

@andrewbranch andrewbranch commented Mar 25, 2025

Important

Until libsyncrpc is set up to publish to npm, this PR takes a git dependency on it, which will build the binary from source during npm install. You need Rust 1.85 or higher to have a successful npm install in typescript-go.

Note

Takeaways from design meeting:

  • Investigate some kind of query system for composing batched requests. Muffled screams of “Not GraphQL!” and “I want GraphQL!” could be heard on the other end of the line.
  • FFI via napi-go still on the table to investigate as an additional option (not replacing IPC), but may need a different API surface or refactoring since the one included here returns shallow serializable objects.
  • No consensus on whether remoting ASTs means we don’t need to provide a client-side parser. The performance of remoting is pretty good, but requires loading the large tsgo binary, and the perf maybe isn’t good enough for linters that need to remote thousands of ASTs. There are other JS parsers though, and not having to maintain one with identical behavior to the Go-based parser is seen as a real win.

This PR is the start of a JavaScript API client and Go API server that communicate over STDIO. Only a few methods are implemented; the aim of this PR is to be the basis for discussions around the general architecture, then additional functionality can be filled in.

Same backend, different clients

This PR includes a synchronous JavaScript client for Node.js. It uses libsyncrpc to block during IPC calls to the server. Relatively small changes to the client could produce an asynchronous variant without Node.js-specific native bindings that could work in Deno or Bun. I don’t want to make specific promises about WASM without doing those experiments, but using the same async client with an adapter for calling WASM exports seems possible. I’m imagining that eventually we’ll publish the Node.js-specific sync client as a standalone library for those who need a sync API, and an async version adaptable to other use cases, ideally codegen’d from the same source. The same backend is intended to be used with any out-of-process client.

Client structure

This PR creates two JavaScript packages, @typescript/ast and @typescript/api (which may make more sense as @typescript/api-sync or @typescript/api-node eventually). The former contains a copy of TS 5.9’s AST node definitions, related enums, and node tests (e.g. isIdentifier()), with the minor changes that TS 7 has made to those definitions applied. The latter contains the implementation of the Node.js API client. It currently takes a path to the tsgo executable and spawns it as a child process. (I imagine eventually, the TypeScript 7.0+ compiler npm package will be a peerDependency of the API client, and resolution of the executable can happen automatically.)

Backend structure

tsgo api starts the API server communicating over STDIO. The server initializes the api.API struct which is responsible for handling requests and managing state, like a stripped-down project.Service. In fact, it uses the other components of the project system, storing documents and projects the same way. (As the project service gets built out with things like file watchers and optimizations for find-all-references, it would get increasingly unwieldy to use directly as an API service, but a future refactor might extract the basic project and document storage to a shared component.)

The API already has methods that return projects, symbols, and types. These are returned as IDs plus bits of easily serializable info, like name and flags. When one of these objects is requested, the API server stores it with its ID so follow-up requests can be made against those IDs. This does create some memory management challenges, which I’ll discuss a bit later.
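
As a rough illustration of the ID-plus-serializable-info pattern described above, here is a minimal sketch of a handle table. It is written in TypeScript only for consistency with the client examples; the real server is Go, and `HandleTable`, `SymbolResponse`, `ServerSymbol`, and `toResponse` are hypothetical names, not taken from this PR.

```typescript
// Hypothetical sketch: none of these names come from the PR.
// Shape of what crosses the wire: an ID plus cheap, serializable fields.
interface SymbolResponse {
    id: string;
    name: string;
    flags: number;
}

class HandleTable<T extends object> {
    private nextId = 1;
    private readonly byId = new Map<string, T>();
    private readonly idOf = new Map<T, string>();

    // Return the existing ID for an object, or mint a new one on first sight.
    register(value: T): string {
        let id = this.idOf.get(value);
        if (id === undefined) {
            id = String(this.nextId++);
            this.idOf.set(value, id);
            this.byId.set(id, value);
        }
        return id;
    }

    // Look up the object for an ID sent back in a follow-up request.
    resolve(id: string): T {
        const value = this.byId.get(id);
        if (value === undefined) throw new Error(`unknown handle: ${id}`);
        return value;
    }

    // Forget an object so it no longer pins server memory.
    release(id: string): void {
        const value = this.byId.get(id);
        if (value !== undefined) {
            this.byId.delete(id);
            this.idOf.delete(value);
        }
    }
}

// A stand-in for a rich server-side symbol that is not itself serialized.
interface ServerSymbol {
    name: string;
    flags: number;
    declarations: object[];
}

const symbolTable = new HandleTable<ServerSymbol>();

function toResponse(symbol: ServerSymbol): SymbolResponse {
    return { id: symbolTable.register(symbol), name: symbol.name, flags: symbol.flags };
}
```

Registering the same object twice yields the same ID, which is what lets a client recognize that two responses refer to the same symbol.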

Implemented functionality

Here’s a selection of the API client type definitions that shows what methods exist as of this PR:

export interface APIOptions {
    tsserverPath: string;
    cwd?: string;
    logFile?: string;
    fs?: FileSystem;
}

export interface FileSystem {
    directoryExists?: (directoryName: string) => boolean | undefined;
    fileExists?: (fileName: string) => boolean | undefined;
    getAccessibleEntries?: (directoryName: string) => FileSystemEntries | undefined;
    readFile?: (fileName: string) => string | null | undefined;
    realpath?: (path: string) => string | undefined;
}

export declare class API {
    constructor(options: APIOptions);
    parseConfigFile(fileName: string): ConfigResponse;
    loadProject(configFileName: string): Project;
}

export interface ConfigResponse {
    options: Record<string, unknown>;
    fileNames: string[];
}

export declare class Project {
    configFileName: string;
    compilerOptions: Record<string, unknown>;
    rootFiles: readonly string[];

    reload(): void;
    getSourceFile(fileName: string): SourceFile | undefined;
    getSymbolAtLocation(node: Node): Symbol | undefined;
    getSymbolAtLocation(nodes: readonly Node[]): (Symbol | undefined)[];
    getSymbolAtPosition(fileName: string, position: number): Symbol | undefined;
    getSymbolAtPosition(fileName: string, positions: readonly number[]): (Symbol | undefined)[];
    getTypeOfSymbol(symbol: Symbol): Type | undefined;
    getTypeOfSymbol(symbols: readonly Symbol[]): (Type | undefined)[];
}

export interface Node {
    readonly id: number;
    readonly pos: number;
    readonly end: number;
    readonly kind: SyntaxKind;
    readonly parent: Node;
    forEachChild<T>(visitor: (node: Node) => T): T | undefined;
    getSourceFile(): SourceFile;
}

export interface SourceFile extends Node {
    readonly kind: SyntaxKind.SourceFile;
    // Node types are basically the same as in Strada, without additional methods
    readonly statements: NodeArray<Statement>;
    readonly text: string;
}

export declare class Symbol {
    id: string;
    name: string;
    flags: SymbolFlags;
    checkFlags: number;
}

export declare class Type {
    flags: TypeFlags;
}

Here’s some example usage from benchmarks:

import { API } from "@typescript/api";
import { SyntaxKind } from "@typescript/ast";

const api = new API({
    cwd: new URL("../../../", import.meta.url).pathname,
    tsserverPath: new URL("../../../built/local/tsgo", import.meta.url).pathname,
});

const project = api.loadProject("_submodules/TypeScript/src/compiler/tsconfig.json");
const file = project.getSourceFile("program.ts")!;

file.forEachChild(function visit(node) {
  if (node.kind === SyntaxKind.Identifier) {
    const symbol = project.getSymbolAtPosition("program.ts", node.pos);
    // ...
  }
  node.forEachChild(visit);
});
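
The loop above makes one IPC round trip per identifier. The position-array overload of getSymbolAtPosition in the type definitions suggests an alternative: collect positions first, then ask once. The following is a sketch under assumptions: `NodeLike` and `ProjectLike` are structural stand-ins for the client types, and `getIdentifierSymbols` and `fakeNode` are hypothetical helpers, with a tiny in-memory fake so the snippet runs without the real packages.

```typescript
// Structural stand-ins for the client types, so this compiles on its own.
interface NodeLike {
    readonly pos: number;
    readonly kind: number;
    forEachChild<T>(visitor: (node: NodeLike) => T): T | undefined;
}

interface ProjectLike<S> {
    getSymbolAtPosition(fileName: string, positions: readonly number[]): (S | undefined)[];
}

// Hypothetical helper: walk the tree locally, then make a single batched request.
function getIdentifierSymbols<S>(
    project: ProjectLike<S>,
    fileName: string,
    root: NodeLike,
    identifierKind: number,
): (S | undefined)[] {
    const positions: number[] = [];
    root.forEachChild(function visit(node: NodeLike): void {
        if (node.kind === identifierKind) positions.push(node.pos);
        node.forEachChild(visit);
    });
    return project.getSymbolAtPosition(fileName, positions);
}

// In-memory fake tree node, only for trying the helper out.
function fakeNode(kind: number, pos: number, children: NodeLike[] = []): NodeLike {
    return {
        kind,
        pos,
        forEachChild<T>(visitor: (node: NodeLike) => T): T | undefined {
            for (const child of children) {
                const result = visitor(child);
                if (result) return result;
            }
            return undefined;
        },
    };
}

const IDENTIFIER = 1; // placeholder for SyntaxKind.Identifier
const OTHER = 0;
let roundTrips = 0;
const fakeProject: ProjectLike<string> = {
    getSymbolAtPosition(_fileName, positions) {
        roundTrips++; // each call stands in for one IPC request
        return positions.map(pos => `symbol@${pos}`);
    },
};

const tree = fakeNode(OTHER, 0, [
    fakeNode(IDENTIFIER, 4),
    fakeNode(OTHER, 8, [fakeNode(IDENTIFIER, 10)]),
]);
const identifierSymbols = getIdentifierSymbols(fakeProject, "program.ts", tree, IDENTIFIER);
```

With the real client, `project` and the source file from the example above would take the place of the fakes, and `SyntaxKind.Identifier` would replace the placeholder constant.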

Client-side virtual file systems are also supported. There’s a helper for making a very simple one from a record:

import { API } from "@typescript/api";
import { createVirtualFileSystem } from "@typescript/api/fs";
import { SyntaxKind } from "@typescript/ast";

const api = new API({
    cwd: new URL("../../../", import.meta.url).pathname,
    tsserverPath: new URL("../../../built/local/tsgo", import.meta.url).pathname,
    fs: createVirtualFileSystem({
        "/tsconfig.json": "{}",
        "/src/index.ts": `import { foo } from './foo';`,
        "/src/foo.ts": `export const foo = 42;`,
    }),
});
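
For readers who need more than the record helper, a custom object satisfying the FileSystem interface could look roughly like the sketch below. This is not the PR's createVirtualFileSystem implementation: `createRecordFileSystem` is a hypothetical name, the shape of `FileSystemEntries` is an assumption (it is not shown in this description), and returning `undefined` for a missing file is a placeholder choice, since the description does not say what `undefined` versus `null` means to the server.

```typescript
// Assumed shape; FileSystemEntries is referenced above but not defined in this description.
interface FileSystemEntries {
    files: string[];
    directories: string[];
}

// Copied from the client type definitions above.
interface FileSystem {
    directoryExists?: (directoryName: string) => boolean | undefined;
    fileExists?: (fileName: string) => boolean | undefined;
    getAccessibleEntries?: (directoryName: string) => FileSystemEntries | undefined;
    readFile?: (fileName: string) => string | null | undefined;
    realpath?: (path: string) => string | undefined;
}

// Hypothetical helper: serve files from an in-memory record keyed by absolute path.
function createRecordFileSystem(files: Record<string, string>): FileSystem {
    const paths = Object.keys(files);
    const has = (name: string) => Object.prototype.hasOwnProperty.call(files, name);
    const withSlash = (dir: string) => (dir.endsWith("/") ? dir : dir + "/");

    return {
        fileExists: fileName => has(fileName),
        // Placeholder choice: undefined for a missing file.
        readFile: fileName => (has(fileName) ? files[fileName] : undefined),
        // A directory exists if any file path lives underneath it.
        directoryExists: directoryName => {
            const prefix = withSlash(directoryName);
            return paths.some(p => p.startsWith(prefix));
        },
        // List the immediate children of a directory.
        getAccessibleEntries: directoryName => {
            const prefix = withSlash(directoryName);
            const fileNames = new Set<string>();
            const directoryNames = new Set<string>();
            for (const p of paths) {
                if (!p.startsWith(prefix)) continue;
                const rest = p.slice(prefix.length);
                const slash = rest.indexOf("/");
                if (slash === -1) fileNames.add(rest);
                else directoryNames.add(rest.slice(0, slash));
            }
            return { files: Array.from(fileNames), directories: Array.from(directoryNames) };
        },
        // No symlinks in memory, so every path is already its own real path.
        realpath: path => path,
    };
}

const recordFs = createRecordFileSystem({
    "/tsconfig.json": "{}",
    "/src/index.ts": `import { foo } from './foo';`,
    "/src/foo.ts": `export const foo = 42;`,
});
```

Every member of FileSystem is optional, so a partial implementation (for example, only readFile and fileExists) would also type-check.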

Performance

These are the results of the included benchmarks on my M2 Mac. Note that IPC is very fast on Apple Silicon, and Windows seems to see significantly more overhead per call. Tasks prefixed TS - refer to the rough equivalent with the TypeScript 5.9 API. The getSymbolAtPosition tasks are operating on TypeScript’s program.ts, which has 10893 identifiers.

┌─────────┬─────────────────────────────────────────────────────┬─────────────────────┬───────────────────────────┬────────────────────────┬────────────────────────┬─────────┐
│ (index) │ Task name                                           │ Latency avg (ns)    │ Latency med (ns)          │ Throughput avg (ops/s) │ Throughput med (ops/s) │ Samples │
├─────────┼─────────────────────────────────────────────────────┼─────────────────────┼───────────────────────────┼────────────────────────┼────────────────────────┼─────────┤
│ 0       │ 'spawn API'                                         │ '3811417 ± 2.32%'   │ '3562750 ± 137292.00'     │ '268 ± 1.51%'          │ '281 ± 11'             │ 263     │
│ 1       │ 'echo (small string)'                               │ '10145 ± 0.30%'     │ '8792.0 ± 2375.00'        │ '116283 ± 0.24%'       │ '113740 ± 32663'       │ 98570   │
│ 2       │ 'echo (large string)'                               │ '802872 ± 1.27%'    │ '783375 ± 82458.00'       │ '1285 ± 0.86%'         │ '1277 ± 133'           │ 1246    │
│ 3       │ 'echo (small Uint8Array)'                           │ '11170 ± 0.40%'     │ '9750.0 ± 2458.00'        │ '104702 ± 0.24%'       │ '102564 ± 27325'       │ 89529   │
│ 4       │ 'echo (large Uint8Array)'                           │ '540871 ± 1.95%'    │ '498542 ± 71500.00'       │ '1989 ± 0.99%'         │ '2006 ± 285'           │ 1849    │
│ 5       │ 'load project'                                      │ '8275640 ± 19.35%'  │ '7099958 ± 337374.00'     │ '136 ± 2.83%'          │ '141 ± 7'              │ 121     │
│ 6       │ 'load project (client FS)'                          │ '95038294 ± 5.86%'  │ '93161042 ± 7466791.50'   │ '11 ± 3.38%'           │ '11 ± 1'               │ 64      │
│ 7       │ 'TS - load project'                                 │ '380236148 ± 1.55%' │ '375040917 ± 14878416.50' │ '3 ± 1.47%'            │ '3 ± 0'                │ 64      │
│ 8       │ 'transfer debug.ts'                                 │ '732232 ± 3.04%'    │ '681083 ± 12334.00'       │ '1434 ± 0.56%'         │ '1468 ± 27'            │ 1366    │
│ 9       │ 'transfer program.ts'                               │ '2660346 ± 4.04%'   │ '2431708 ± 53875.00'      │ '395 ± 1.33%'          │ '411 ± 9'              │ 377     │
│ 10      │ 'transfer checker.ts'                               │ '27890882 ± 2.28%'  │ '26820333 ± 433416.50'    │ '36 ± 1.92%'           │ '37 ± 1'               │ 64      │
│ 11      │ 'materialize program.ts'                            │ '1418572 ± 2.33%'   │ '1376500 ± 10417.00'      │ '715 ± 0.46%'          │ '726 ± 5'              │ 705     │
│ 12      │ 'materialize checker.ts'                            │ '23262525 ± 18.32%' │ '21435625 ± 1205000.00'   │ '47 ± 3.34%'           │ '47 ± 3'               │ 64      │
│ 13      │ 'getSymbolAtPosition - one location'                │ '11805 ± 1.84%'     │ '11042 ± 959.00'          │ '88571 ± 0.11%'        │ '90563 ± 8202'         │ 84712   │
│ 14      │ 'TS - getSymbolAtPosition - one location'           │ '918.37 ± 0.36%'    │ '917.00 ± 1.00'           │ '1098668 ± 0.01%'      │ '1090513 ± 1191'       │ 1088885 │
│ 15      │ 'getSymbolAtPosition - 10893 identifiers'           │ '140520504 ± 1.44%' │ '138652312 ± 2446271.00'  │ '7 ± 1.13%'            │ '7 ± 0'                │ 64      │
│ 16      │ 'getSymbolAtPosition - 10893 identifiers (batched)' │ '27321398 ± 4.48%'  │ '26289875 ± 672916.50'    │ '37 ± 2.24%'           │ '38 ± 1'               │ 64      │
│ 17      │ 'getSymbolAtLocation - 10893 identifiers'           │ '130045149 ± 1.26%' │ '128063583 ± 2391709.00'  │ '8 ± 1.09%'            │ '8 ± 0'                │ 64      │
│ 18      │ 'getSymbolAtLocation - 10893 identifiers (batched)' │ '20973507 ± 6.26%'  │ '19680167 ± 432749.50'    │ '49 ± 2.82%'           │ '51 ± 1'               │ 64      │
│ 19      │ 'TS - getSymbolAtLocation - 10893 identifiers'      │ '12942349 ± 27.49%' │ '11076708 ± 130521.00'    │ '89 ± 2.62%'           │ '90 ± 1'               │ 78      │
└─────────┴─────────────────────────────────────────────────────┴─────────────────────┴───────────────────────────┴────────────────────────┴────────────────────────┴─────────┘

To editorialize these numbers a bit: in absolute terms, this is pretty fast, even transferring large payloads like a binary-encoded checker.ts (10). On the order of tens, hundreds, or thousands of API calls, most applications probably wouldn’t notice a per-call regression over using the TypeScript 5.9 API, and may speed up if program creation / parsing multiple files is a significant portion of their API consumption today (5–7). However, the IPC overhead is pretty noticeable when looking at hundreds of thousands of back-to-back calls on an operation that would be essentially free in a native JavaScript API, like getting the symbol for every identifier in a large file (15, 17). For that reason, we’ll be very open to including bulk/batch/composite API methods that reduce the number of round trips needed to retrieve lots of information for common scenarios (16, 18).
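
To make the per-call overhead concrete, the average latencies in the table can be divided by the 10893 identifiers. The snippet below only does that arithmetic on numbers copied from the table; the variable names are made up for illustration.

```typescript
// Number of identifiers in program.ts, as stated above.
const IDENTIFIER_COUNT = 10893;

// Average latencies in nanoseconds, copied from the benchmark table.
const avgLatencyNs = {
    getSymbolAtPosition: 140520504,        // unbatched
    getSymbolAtPositionBatched: 27321398,  // batched
    getSymbolAtLocation: 130045149,        // unbatched
    getSymbolAtLocationBatched: 20973507,  // batched
    tsGetSymbolAtLocation: 12942349,       // TypeScript 5.9 API
};

const perIdentifierNs = (totalNs: number): number => totalNs / IDENTIFIER_COUNT;

// How much faster the batched variants are than one request per identifier.
const positionSpeedup =
    avgLatencyNs.getSymbolAtPosition / avgLatencyNs.getSymbolAtPositionBatched;
const locationSpeedup =
    avgLatencyNs.getSymbolAtLocation / avgLatencyNs.getSymbolAtLocationBatched;

console.log({
    unbatchedNs: Math.round(perIdentifierNs(avgLatencyNs.getSymbolAtPosition)),
    batchedNs: Math.round(perIdentifierNs(avgLatencyNs.getSymbolAtPositionBatched)),
    tsNs: Math.round(perIdentifierNs(avgLatencyNs.tsGetSymbolAtLocation)),
    positionSpeedup: positionSpeedup.toFixed(1),
    locationSpeedup: locationSpeedup.toFixed(1),
});
```

That works out to roughly 12.9 µs per identifier unbatched and about 2.5 µs batched for getSymbolAtPosition, against about 1.2 µs per identifier for the TypeScript 5.9 getSymbolAtLocation run. Batching is about 5.1x faster for getSymbolAtPosition and about 6.2x faster for getSymbolAtLocation on this machine.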

Memory management

The current API design uses opaque IDs for objects like symbols and types, so the client can receive a handle to one of these objects and then query for additional information about it. For example, implemented in this PR is getTypeOfSymbol, which takes a symbol ID. The server has to store the symbol in a map so it can be quickly retrieved when the client asks for its type. This client/server split presents two main challenges:

  1. When the client makes two calls that result in the same symbol or same type, the client should return the strict-equal same object, while allowing garbage collection to work on those objects.
  2. When one of those client objects goes out of scope, it should eventually be released from the server, so server memory doesn’t grow indefinitely.

To accomplish this, there is a client-side object registry that stores objects by their IDs. API users will need to explicitly dispose those objects to release them both from the client-side store and from the server. (Server objects may be automatically released in response to program updates, and making additional queries against them will result in an error.) This can be done with the .dispose() method:

{
  const symbol = project.getSymbolAtPosition("program.ts", 0);
  if (symbol) {
    // ...
  }
  symbol?.dispose();
}

or with explicit resource management:

{
  using symbol = project.getSymbolAtPosition("program.ts", 0);
  if (symbol) {
    // ...
  }
}
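
The registry behavior described in this section (same ID yields the strict-equal same object, and disposing releases it on both sides) could be sketched as follows. This is an assumption-laden illustration, not the contents of objectRegistry.ts: `ObjectRegistry`, `RemoteObject`, and the `releaseOnServer` callback are hypothetical names, and the callback stands in for whatever release request the client actually sends.

```typescript
// Hypothetical sketch of a client-side registry keyed by server-issued IDs.
class RemoteObject {
    readonly id: string;
    private readonly onDispose: (id: string) => void;

    constructor(id: string, onDispose: (id: string) => void) {
        this.id = id;
        this.onDispose = onDispose;
    }

    // Drop this object from the registry and ask the server to release it.
    dispose(): void {
        this.onDispose(this.id);
    }
}

class ObjectRegistry {
    private readonly objects = new Map<string, RemoteObject>();
    private readonly releaseOnServer: (id: string) => void;

    constructor(releaseOnServer: (id: string) => void) {
        this.releaseOnServer = releaseOnServer;
    }

    // Two responses carrying the same ID resolve to the same client object.
    getOrCreate(id: string): RemoteObject {
        let object = this.objects.get(id);
        if (object === undefined) {
            object = new RemoteObject(id, releasedId => this.release(releasedId));
            this.objects.set(id, object);
        }
        return object;
    }

    // Idempotent: a second dispose of the same object sends nothing further.
    private release(id: string): void {
        if (this.objects.delete(id)) this.releaseOnServer(id);
    }
}

// Usage: record release requests instead of sending them over IPC.
const releasedIds: string[] = [];
const registry = new ObjectRegistry(id => {
    releasedIds.push(id);
});
const first = registry.getOrCreate("symbol-1");
const second = registry.getOrCreate("symbol-1");
console.log(first === second); // identity is preserved while the object is registered
first.dispose();
```

Because the map holds a strong reference, an object stays alive until it is disposed, which matches the requirement above that API users dispose explicitly.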

@andrewbranch andrewbranch marked this pull request as ready for review April 2, 2025 21:52
@andrewbranch andrewbranch requested a review from Copilot April 2, 2025 21:52
@Copilot Copilot AI left a comment

Pull Request Overview

This PR scaffolds an IPC-based API by introducing a Node.js API client and a Go API server along with test cases, benchmarks, and support for virtual file systems. Key changes include the implementation of new JavaScript packages for AST definitions and API functionality, integration with libsyncrpc for synchronous IPC, and the addition of extensive benchmarks and CI workflow updates.

Reviewed Changes

Copilot reviewed 54 out of 59 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| _packages/api/test/api.test.ts | New API tests for configuration parsing, symbol resolution, and disposal. |
| _packages/api/src/typeFlags.ts | Added TypeFlags definition with runtime enum-like functionality. |
| _packages/api/src/typeFlags.enum.ts | Introduced TypeFlags as a TypeScript enum. |
| _packages/api/src/symbolFlags.ts | Added SymbolFlags with runtime enum implementation. |
| _packages/api/src/symbolFlags.enum.ts | Introduced SymbolFlags as a TypeScript enum. |
| _packages/api/src/proto.ts | Defined the protocol interfaces for transmitting API responses. |
| _packages/api/src/path.ts | Implemented path utilities for processing file paths and URLs. |
| _packages/api/src/objectRegistry.ts | Provides object caching and release handling for projects, symbols, and types. |
| _packages/api/src/fs.ts | Added a virtual file system implementation with directory/file operations. |
| _packages/api/src/client.ts | Implemented a client for synchronous IPC communication via libsyncrpc. |
| _packages/api/src/api.ts | Implemented the API client including overloaded methods for symbol and type retrieval. |
| _packages/api/bench/api.bench.ts | Added benchmarks to compare API performance with TypeScript APIs. |
| README.md | Updated build instructions to require Rust along with Go and Node.js. |
| Herebyfile.mjs | Updated tasks to include API test and build steps. |
| .github/workflows/ci.yml | Updated CI workflows with Rust toolchain usage and new environment flags. |

Files not reviewed (5)
  • _packages/api/bench/tsconfig.json: Language not supported
  • _packages/api/package.json: Language not supported
  • _packages/api/test/tsconfig.json: Language not supported
  • _packages/api/tsconfig.json: Language not supported
  • _packages/ast/package.json: Language not supported

@jakebailey jakebailey left a comment

This all works locally for me, nice. It's a shame npm doesn't tell you what's taking so long to build on install, but oh well.

Going to go test this on Windows where I don't have rust installed to see what happens there.

@jakebailey

At this point I think this looks good; I'd only want to make sure everyone knows they have to install rustup (or msrustup?) to keep things working, unless we're able to find a way to make that build lazy somehow.

import fs from "node:fs";
import path from "node:path";
import { fileURLToPath } from "node:url";
import { Bench } from "tinybench";

In a way I wish we were just using vitest here (since it has tinybench built-in), but it's all pretty similar.


@jakebailey jakebailey left a comment

I don't think I have any notes, I'm just excited to get it in and start playing with it.

@johnnyreilly

This is amazing work @andrewbranch!

Myself and @acutmore are working on a blog post about some of the wider implications of the Go port. (@jakebailey has been kind enough to eyeball a draft I believe.) In that we talk about ts-loader and ts-blank-space. From offline chats it sounds like ts-loader would probably not work with the new Go API in type checking mode, as ts-loader goes deep into the guts of the TypeScript APIs. However, transpileOnly mode is probably fine as it really only uses transpileModule.

I was digging through this PR and didn't spot a transpileModule API; is that because it's not implemented as yet? Or do I have the wrong end of the stick?
