Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support dynamic requires #2310

Open
wants to merge 27 commits into
base: master
Choose a base branch
from
Open

support dynamic requires #2310

wants to merge 27 commits into from

Conversation

boneskull
Copy link
Contributor

@boneskull boneskull commented Jun 7, 2024

Description

  • This PR adds support for dynamic requires via loadFromMap() and link() (in import-lite.js and link.js, respectively). importLocation()'s signature has also been modified for support.

    To use this feature in either function, the following must be true:

    1. The moduleTransforms option (in the appropriate options parameter) must not be present. These are asynchronous module transforms, which cannot be used by dynamic require.
    2. The ReadPowers param must be a proper ReadPowers object (not just a ReadFn) and must contain both the new maybeReadSync, new isAbsolute, and fileURLToPath functions. The new type representing this is SyncReadPowers.

    If all of the above are true, then a compartment will be allowed to dynamically require something. If that thing still cannot be found in the compartment map, the sync "fallback" exit hook (see next item) will be executed.

  • The new importHookNow property can be provided via options, which is a synchronous exit module import hook.

  • About SyncReadPowers:

    • SyncReadPowers.maybeReadSync() is necessary to read files synchronously which is necessary to load them synchronously. This fails gracefully if the file is not found; the implementation is platform-dependent.
    • SyncReadPowers.isAbsolute() is necessary to determine if the module specifier of a dynamic require is absolute. If it is, it could be just about anything, and must be loaded via the user-provided importNowHook (the sync exit module import hook).
    • SyncReadPowers.fileURLToPath() is needed for __filename and __dirname, which is often used to create absolute paths for requiring dynamically.
  • As an alternative to moduleTransforms, synchronous module transforms may be provided via the new syncModuleTransforms object. In a non-dynamic-require use-case, if present, syncModuleTransforms are combined with the moduleTransforms option; all sync module transforms are module transforms, but not all module transforms are sync module transforms.

  • All builtin parsers are now synchronous. User-defined parsers can be async, but this will disable dynamic require support.

  • @endo/evasive-transform now exports evadeCensorSync() in addition to evadeCensor(). This is possible because I've swapped the async-only source-map with source-map-js, which is a fork of the former before it went async-only. source-map-js claims comparable performance.

Security Considerations

Dynamically requiring an exit module (e.g., a Node.js builtin) requires a user-defined hook, which has the same security considerations as a user-defined exit module hook.

Swapping out a dependency (source-map-js for source-map) incurs risk.

Scaling Considerations

n/a

Documentation Considerations

Should be announced as a user-facing feature

Testing Considerations

I've added some fixtures and tested around the conditionals I've added, but am open to any suggestions for additional coverage.

Compatibility Considerations

This increases ecosystem compatibility considerably; use of dynamic require in the ecosystem is not rare.

For example, most packages which ship a native module will be using dynamic require, because the filepath of the build artifact is dependent upon the platform and architecture.

Upgrade Considerations

Everything else should be backwards-compatible, as long as source-map-js does as it says on the tin.

Users of @endo/evasive-transform may note that native modules are neither downloaded/compiled (due to the switch from source-map to source-map-js).


UPDATE: Aug 18 2024

This PR now targets the boneskull/supertramp branch, which is PR #2263

@boneskull boneskull self-assigned this Jun 18, 2024
@boneskull boneskull marked this pull request as ready for review June 18, 2024 20:57
@boneskull
Copy link
Contributor Author

boneskull commented Jun 18, 2024

If anyone can point me in the direction of why the tests are failing, that'd be helpful; otherwise I'll just plug at it. Plugged.

compartmentDescriptor.modules = moduleDescriptors;

let { policy } = compartmentDescriptor;
policy = policy || Object.create(null);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is for typescript

Comment on lines 483 to 655
if ('packages' in policy && typeof policy.packages === 'object') {
for (const [pkgName, policyItem] of entries(policy.packages)) {
if (
!(pkgName in compartmentDescriptor.modules) &&
pkgName in compartmentDescriptor.scopes &&
policyItem
) {
compartmentDescriptor.modules[pkgName] =
compartmentDescriptor.scopes[pkgName];
}
}
}
Copy link
Contributor Author

@boneskull boneskull Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure if this is correct, but it creates links between compartment descriptors based only on policy--where Endo would not have detected them otherwise.

Checking if dynamic: true is in policy at this point causes many tests to fail, because this ends up being a code path which many tests take. Why? Because the behavior is opt-out.

With regards to that, in addition to the policy, perhaps we should add a dynamic option to ImportLocationOptions or whathaveyou, so you can explicitly opt-in to using importNowHook. Right now, it's an opt-out based on the lack of a moduleTransforms (async module transforms) option and the shape of readPowers. It's not until later that we take the policy into account. Another way to put it: we're creating an importNowHook because a) we have the tools to do so, and b) something might dynamically require something else.

If we did that, I'd be able to delete some LoC, and we'd reduce the risk of introducing backwards-incompatible changes. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

less code and better compatibility. Sounds good.
Also, explicit opt-in would allow an early failure if ReadPowers or transforms don't match the intent to enable dynamic require.

Comment on lines 507 to 602
// Collate candidate locations for the moduleSpecifier,
// to support Node.js conventions and similar.
const candidates = [moduleSpecifier];
for (const candidateSuffix of searchSuffixes) {
candidates.push(`${moduleSpecifier}${candidateSuffix}`);
}

for (const candidateSpecifier of candidates) {
const candidateModuleDescriptor = moduleDescriptors[candidateSpecifier];
if (candidateModuleDescriptor !== undefined) {
const { compartment: candidateCompartmentName = packageLocation } =
candidateModuleDescriptor;
const candidateCompartment = compartments[candidateCompartmentName];
if (candidateCompartment === undefined) {
throw Error(
`compartment missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`,
);
}
// modify compartmentMap to include this redirect
const candidateCompartmentDescriptor =
compartmentDescriptors[candidateCompartmentName];
if (candidateCompartmentDescriptor === undefined) {
throw Error(
`compartmentDescriptor missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`,
);
}
candidateCompartmentDescriptor.modules[moduleSpecifier] =
candidateModuleDescriptor;
// return a redirect
/** @type {RedirectStaticModuleInterface} */
const record = {
specifier: candidateSpecifier,
compartment: candidateCompartment,
};
return record;
}

// Using a specifier as a location.
// This is not always valid.
// But, for Node.js, when the specifier is relative and not a directory
// name, they are usable as URL's.
const moduleLocation = resolveLocation(
candidateSpecifier,
packageLocation,
);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copypasta

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot of duplication. I think this is worth trying to bounce on a trampoline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. This is a good idea, but is also unfortunate because it will take me longer to land this.

Comment on lines 554 to 613
let moduleBytes;
try {
moduleBytes = readSync(moduleLocation);
} catch (err) {
if (err && err.code === 'ENOENT') {
// might be an exit module. use the fallback `exitModuleImportNowHook` to import it
// eslint-disable-next-line no-continue
continue;
}
throw err;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do we know something is an exit module? maybe we should check the compartment descriptor, and only use the fallback if the thing doesn't exist there?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not arrived at a hard decision for this question. Falling through the bottom of the import hook should be enough, but might not be. An option we did not have when I first wrote this: we can identify “host modules” by the presence of a prefix: like node: or endo:. These would be guaranteed to escape the Node.js style mappings.

In the interest of keeping coupling to a specific specifier resolution strategy low, I am leaning heavily toward “pass through if nothing in the compartment map matches.”

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The prefix option would be really unfortunate for ecosystem compatibility. I don't think we'll ever get all packages to switch form fs to node:fs.

For the sake of completion: if importHook for exit modules is not provided or returns nothing we still need to fail with an import error.

Comment on lines +707 to +747
/**
* @typedef FsPromisesApi
* @property {(filepath: string) => Promise<string>} realpath
* @property {WriteFn} writeFile
* @property {ReadFn} readFile
*/

/**
* @typedef FsAPI
* @property {FsPromisesApi} promises
* @property {ReadSyncFn} readFileSync
*/

/**
* @typedef UrlAPI
* @property {(location: string | URL) => string} fileURLToPath
* @property {(path: string) => URL} pathToFileURL
*/

/**
* @typedef CryptoAPI
* @property {typeof import('crypto').createHash} createHash
*/
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not strictly necessary, but I found it helpful. YMMV

Copy link
Member

@kriskowal kriskowal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preliminary feedback.

packages/compartment-mapper/src/import-hook.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/import-hook.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/import-hook.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/import-hook.js Outdated Show resolved Hide resolved
Comment on lines 507 to 602
// Collate candidate locations for the moduleSpecifier,
// to support Node.js conventions and similar.
const candidates = [moduleSpecifier];
for (const candidateSuffix of searchSuffixes) {
candidates.push(`${moduleSpecifier}${candidateSuffix}`);
}

for (const candidateSpecifier of candidates) {
const candidateModuleDescriptor = moduleDescriptors[candidateSpecifier];
if (candidateModuleDescriptor !== undefined) {
const { compartment: candidateCompartmentName = packageLocation } =
candidateModuleDescriptor;
const candidateCompartment = compartments[candidateCompartmentName];
if (candidateCompartment === undefined) {
throw Error(
`compartment missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`,
);
}
// modify compartmentMap to include this redirect
const candidateCompartmentDescriptor =
compartmentDescriptors[candidateCompartmentName];
if (candidateCompartmentDescriptor === undefined) {
throw Error(
`compartmentDescriptor missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`,
);
}
candidateCompartmentDescriptor.modules[moduleSpecifier] =
candidateModuleDescriptor;
// return a redirect
/** @type {RedirectStaticModuleInterface} */
const record = {
specifier: candidateSpecifier,
compartment: candidateCompartment,
};
return record;
}

// Using a specifier as a location.
// This is not always valid.
// But, for Node.js, when the specifier is relative and not a directory
// name, they are usable as URL's.
const moduleLocation = resolveLocation(
candidateSpecifier,
packageLocation,
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot of duplication. I think this is worth trying to bounce on a trampoline.

packages/compartment-mapper/src/import-lite.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/import-lite.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/import-lite.js Outdated Show resolved Hide resolved
packages/evasive-transform/src/location-unmapper.js Outdated Show resolved Hide resolved
@boneskull
Copy link
Contributor Author

Going to extract the changes to @endo/evasive-transform into a separate PR.

@boneskull
Copy link
Contributor Author

Ref: #2332

cc @kriskowal

@boneskull
Copy link
Contributor Author

Once #2332 is merged, I can rebase this onto master, which should eliminate the commit from this PR's history. So let's sit on it until then.

@boneskull boneskull force-pushed the boneskull/dynamic branch 3 times, most recently from 04f003b to b09a0cd Compare August 15, 2024 23:07
@boneskull boneskull changed the base branch from master to boneskull/supertramp August 15, 2024 23:08
@naugtur
Copy link
Member

naugtur commented Aug 19, 2024

note on source-map-js

I don't see where the source-map-js is added. The documentation says it's a fork of source-map but with some optimizations added. One of the optimizations listed in the article they cite as source of the optimizations is using a Function constructor to create variants of sorting function. That's a potential concern (less of a security concern, more of a compatibility concern if Endo code is supposed to run correctly under lockdown)

boneskull and others added 24 commits September 19, 2024 12:17
Is this a breaking change?  I don't know.

This doesn't mean that parsers _cannot_ be async--just rather that the ones we have are not.
This adds a new prop to `ReadPowers`: `isAbsolute()`. This function is used to load such a module specifier as an exit module in `importNowHook`, since it could be pointing to anything.
- improve error messages
- better tests to support the `node-gyp-build` use-case
- create `makeSyncReadPowers()` as not to befoul things depending on what `makeReadPowers()` does
- remove `dynamic` flag from policy; if we need it, we can use `options`
This decouples platform-dependent error handling.

Note that the `ReadSyncFn` _type_ is still in use in the `FsAPI` type.
This removes the misbegotten `mapParsersSync` (and its vile spawn) and moves `mapParsers` and `makeExtensionParser` into `map-parser.js`.

`mapParsers` was refactored to use a trampoline to run in both sync and async contexts.
…own paths

This changes the logic so that this particular class of dynamic requires is disallowed.

If the specifier is not accessible per policy, the usual policy-enforcement error message applies.  If the specifier is not accessible because it's nowhere in the exit module, the error will mention that it can't load an unknown module.

Also fixes:

- dynamic requires of exit modules (e.g., Node.js builtins)
- dynamic requires made from the entry compartment descriptor
- some peer review comments
Comment on lines +229 to +233
// if we have nothing in the moduleTransforms object, then we can use the syncParser.
if (keys(moduleTransforms).length === 0) {
transforms = syncModuleTransforms;
return syncParser;
}
Copy link
Contributor Author

@boneskull boneskull Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect.

Just because moduleTransforms is empty doesn't mean that the parsers are sync. We don't have enough information in mapParsers() to make that determination.

We do, however, have the information in link() (which calls mapParsers()). I'm not sure what's the best way to refactor this though. Send mapParsers() a sync bit? idk

@naugtur

UPDATE: link() doesn't even have it, as written. The presence of a makeImportNowHook is the only way link() could know whether to pursue a synchronous strategy. As written, this information is lost. But if we change it, and if that is sufficient, we could use it to influence the behavior of mapParsers()... or rather something like what I suggest here.

Comment on lines +191 to +196
// sync module transforms are allowed, because they are "compatible"
// with async module transforms (not vice-versa)
const moduleTransforms = /** @type {ModuleTransforms} */ ({
...behavior.options.syncModuleTransforms,
...behavior.options.moduleTransforms,
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be removed. right @naugtur?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this was supposed to happen in map-parser.js only, not sure how it got here

Comment on lines +170 to +178
makeImportNowHook = makeImportNowHookMaker(
/** @type {SyncReadPowers} */ (readPowers),
entryCompartmentName,
{
compartmentDescriptors: compartmentMap.compartments,
searchSuffixes,
exitModuleImportNowHook,
},
);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the only difference between the "sync" and "async" calls to link(); the presence of a makeImportNowHook.

If it's not the conditional we're using here, it's a conditional we're using somewhere else. I'm pretty sure it should live in here, because we have access to the options and readPowers and we need both to determine whether we need to create an makeImportNowHook.

We need some information that says "enable import now hooks", and that information needs to get from here, then through link(), and finally into mapParsers(). Adding a sync flag seems bogus, but maybe a makeLink() and makeMapParsers() functions could be a more elegant way to express what we need to do.

@naugtur

Copy link
Contributor Author

@boneskull boneskull Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A makeLink() could be used by creating a "default" link() which is cached and accessible to everything calling link(). This default link() would have been created by makeLink(mapParsers), where mapParsers is the (cached) "default" mapParsers() implementation, created by makeMapParsers().

And then if all the bits and bobs are in place to pursue a synchronous strategy, we could call makeLink(syncMapParsers) (this call would be in loadFromMap() here) with a syncMapParsers we created from makeMapParsers().

¯\(ツ)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of makeMapParsers that accepts hints about sync/async situation and returns the util to pass down to link dependency injection style.
That is if we won't find a simple correction of the current implementation. I'll look into it more tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants