Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --sandbox flag to prevent dynamic loading of other files/data. #3092

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

jemc
Copy link

@jemc jemc commented Apr 12, 2024

Some use cases for jq may involve accepting untrusted input. See discussion in #1361 for some security considerations that may be relevant for those use cases.

This commit adds a --sandbox flag which is meant to mitigate one category of security issue with untrusted input: features of jq which are meant to let the jq filter access files/data other than the direct input data given to the CLI.

Specifically, the new --sandbox flag blocks the implicit loading of $HOME/.jq, and also blocks the use of import and include for loading other jq files.

If other features are added to jq in the future which allow for reading files/data as part of the filter syntax, it is intended that the --sandbox flag would also gate access to those.

Some use cases for `jq` may involve accepting untrusted input.
See discussion in jqlang#1361 for some security considerations that
may be relevant for those use cases.

This commit adds a `--sandbox` flag which is meant to mitigate
one category of security issue with untrusted input:
features of jq which are meant to let the jq filter access
files/data other than the direct input data given to the CLI.

Specifically, the new `--sandbox` flag blocks the implicit
loading of `$HOME/.jq`, and also blocks the use of
`import` and `include` for loading other `jq` files.

If other features are added to `jq` in the future which allow
for reading files/data as part of the filter syntax, it is
intended that the `--sandbox` flag would also gate access to those.
@itchyny
Copy link
Contributor

itchyny commented Apr 12, 2024

Should this flag block use of env?

@wader
Copy link
Member

wader commented Apr 13, 2024

Should this flag block use of env?

Good point, block i think. Would env and $ENV just be empty objects, null or not defined at all?

Any special consideration around the input* functions?

@jemc
Copy link
Author

jemc commented Apr 16, 2024

I will have a crack at updating the PR to make env and $ENV be empty objects when the sandbox flag is present.

Regarding the input* functions, my thinking was that because these related to letting the jq filter interact with the inputs that were explicitly given to the filter by the invoker, I saw this as being "within the sandbox".

@jemc
Copy link
Author

jemc commented Apr 16, 2024

I've updated the PR to clear the $ENV value and env builtin, including tests and documentation changes to match. Thanks for the feedback!

Let me know if there is anything else.

tests/shtest Outdated Show resolved Hide resolved
@itchyny
Copy link
Contributor

itchyny commented Apr 16, 2024

The input, inputs are safe, but I think wader is talking about input_filename and input_line_number.

In a security-sensitive environment where the `--sandbox` flag
can be used to mitigate some categories of threats from untrusted
filter code and/or untrusted JSON data, it is also desirable
to prevent leaking environment variable values (which often
can include secrets in some environments).

This commit does so by updating the behavior of `--sandbox` to
also clear the environment variables seen by the jq filter code
in the `$ENV` value and `env` builtin.
@jemc
Copy link
Author

jemc commented Apr 19, 2024

I've pushed an amended commit that fixes the copy/paste mistake in the test echo strings noticed by @emanuele6.

Regarding input_filename and input_line_number:

There's possibly an argument for saying that input_filename should hide the real filename(s) of the input, but I personally don't find it that compelling. However, if someone here wants to specify a special behavior for that case, I will happily implement it as specified. For reference, note that input_filename is the string "<stdin>" when STDIN is used as the input.

For input_line_number I don't think it's leaking any unexpected security-sensitive information. It lets you examine an aspect of the input which is outside of the scope of JSON data structures, but it's still part of the input, so I think it's definitely fair game for a sandboxed filter. Please note that it also works properly when used with STDIN input, so it's not a feature that needs to peek at a real filesystem file to work properly - it's only looking at the string of the input.

@jemc
Copy link
Author

jemc commented May 1, 2024

@itchyny @wader @emanuele6 - can any of you give this PR a deeper review and approve it (or give further comments on things to be improved)?

@jemc jemc requested a review from emanuele6 May 1, 2024 13:19
@emanuele6 emanuele6 removed their request for review May 1, 2024 13:20
@emanuele6
Copy link
Member

I am not personally particularly interested in this feature as it is implemented, so I can't comment much on it.

@itchyny
Copy link
Contributor

itchyny commented May 1, 2024

Sorry for waiting, but I think we need more discussion on this topic. Using seccomp(2) could be better, or simple chroot(2) might be enough. Yes, platform-independent block like this PR could be the best. For cli usage we already provide Docker image so that might be enough. Anyway I think we need to collect more opinions, we are unpaid volunteers with the fear of being blamed for security issues.

@wader
Copy link
Member

wader commented May 3, 2024

Good points @itchyny. It would be great to hear what @nicowilliams thinks and how it would affect other future sandbox efforts.

One thing might be to very clearly document what --sandbox actually means and also mention other methods to use if stronger isolation is needed like wasm, chroot, seccomp etc.

@jemc
Copy link
Author

jemc commented May 6, 2024

If there's a fear that --sandbox "implies too much" about the security properties of the feature, or otherwise create confusion, I could rename it to more specifically pinpoint what it actually does.

Naming is hard, but perhaps something like --disable-external-file-access would be less objectionable?

The main point here is that jq has some features which allow for external file access, and at minimum I just want a way to turn those features off.

@pkoppstein
Copy link
Contributor

In the interests of making some progress on this feature, which in some way or another has been discussed since at least March 2017, I'd like to propose that --sandbox be added as an experimental feature, or at least one with explicit indications that certain details are subject to change.

A related thought is that, since there are so many different considerations and criteria regarding "security" and "sandbox", the "sandbox" flag could refer to a JSON object that specifies which "security flags" are to be enabled. E.g. {"now": false, "env": false} would mean that access to now and env are disabled.

@jemc
Copy link
Author

jemc commented May 9, 2024

A related thought is that, since there are so many different considerations and criteria regarding "security" and "sandbox", the "sandbox" flag could refer to a JSON object that specifies which "security flags" are to be enabled. E.g. {"now": false, "env": false} would mean that access to now and env are disabled.

Are there any other existing examples in jq of a CLI option that takes a JSON object as its value? I didn't find one.

Seems more idiomatic to have separate flags that control separate features.

Again, I'm only interested in disabling the features that allow for arbitrary file access from within the filter, so if I need to rename this flag to only mention the file restriction, I'm happy to rename it. Other flags could be added for controlling other features.

I just need guidance from an authoritative maintainer on what the flag name and desired behavior should be, and I can adapt the PR to match.

@jemc
Copy link
Author

jemc commented Oct 11, 2024

It's been about 6 months, so I'm checking in again.

Any authoritative maintainer available to give direction on what changes are needed to get this across the finish line?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants