Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: trusted files redesign #578

Open
pwmarcz opened this issue May 11, 2022 · 8 comments
Open

RFC: trusted files redesign #578

pwmarcz opened this issue May 11, 2022 · 8 comments
Labels
enhancement New feature or request P: 2

Comments

@pwmarcz
Copy link
Contributor

pwmarcz commented May 11, 2022

NOTE: I will not be implementing this; this is me leaving notes/plans for possible next implementer.

Summary

Make trusted files configurable per mount, and move handling to LibOS, similar to what I did for protected files (#371).

In addition, trusted files will be renamed to hashed files.

This has the following benefits:

  • Architecture: PAL will be simpler (it will only handle regular files), and the feature will be usable (and testable) in all PALs.
  • UX: More consistent syntax for the user. The user will select one of three mounting modes (passthrough, hashed, encrypted).

Proposed changes

Old syntax:

fs.mounts = [
   { path = "/usr/lib", uri = "file:/usr/lib" },
]

sgx.allowed_files = [
   "file:/usr/lib/allowed"
]

sgx.trusted_files = [
   "file:/usr/lib",
   "file:/usr/lib/libpython.so",
   { uri = "file:/usr/lib/libpython.so", sha256 = "..." },
]

New syntax:

fs.mounts = [
   { type = "hashed", path = "/usr/lib", uri = "file:/usr/lib" },
   { type = "passthrough", path = "/usr/lib/allowed", uri = "file:/usr/lib/allowed" },
]

fs.hashed_files = [
   "file:/usr/lib",
   "file:/usr/lib/libpython.so",
   { uri = "file:/usr/lib/libpython.so", sha256 = "..." },
]
  • sgx.trusted_files is simply renamed to fs.hashed_files (during the deprecation period, we respect both settings).

  • gramine-manifest computes the hashes when building *.manifest (not only *.manifest.sgx). After a deprecation period, *.manifest.sgx will not be necessary!

  • hashed becomes a separate mount type (internally called chroot_hashed, same as chroot_encrypted). All files under a hashed mount have to be on the fs.hashed_files list. If a host file is not on that list, or has a different hash, it's not visible on guest. (Alternatively, a different hash might produce -EACCES).

  • To allow accessing a file directly, mount it with type = "passthrough". This replaces sgx.allowed_files.

  • PAL still needs to load LibOS, which needs to be hashed. So instead of trusted files, we will supply a hash for it directly in manifest:

loader.entrypoint = "file:.../libsysdb.so"
loader.entrypoint_sha256 = "..."
  • There is no sgx.file_check_policy = "allow_all_but_log".

  • The default mount type (chroot) is deprecated, and it works in a sort of "compatibility mode":

    • if the PAL type is Linux-SGX, it respects sgx.trusted_files and sgx.allowed_files,
    • otherwise, it functions the same as passthrough.
  • The root mount is still type = "chroot" by default, but a warning is displayed, encouraging the user to set fs.root.type = "hashed".

Implementation

I think the implementation can follow the encrypted files: there should be two layers, filesystem (handling inodes, handles etc.) and library (handling the actual logic of verifying trusted files).

FS layer:

  • Create a separate filesystem (chroot/hashed.c).
  • Note that this filesystem is read-only: it handles read and stat, but not creat/chmod/truncate/unlink.
  • Handle directories same as "normal" filesystem.
  • In inodes, store a struct shim_hashed_file object. On lookup, use the library (see below) to create this object and verify file hash.
  • File handles contain a PAL handle, same as in normal chroot filesystem, but reading is done through the library (see below).

Library:

  • This can be a module (shim_fs_hashed.c) similar to shim_fs_encrypted.c.
  • This module doesn't use the rest of LibOS, just interfaces with PAL to load files.
  • Suggested API:
/* `shim_fs_hashed.c` */

/* 256-bit hash of the whole file (SHA256) */
typedef ... file_hash_t;

/* 128-bit hashes of individual chunk (SHA256 truncated to 128 bits) */
typedef ... chunk_hash_t;

/* See TRUSTED_CHUNK_SIZE in current PAL */
#define CHUNK_SIZE (1 << 14)

/*
 * Stores information about a single hashed file. Once created, this structure
 * is immutable, so there's no need for locking when accessing it.
 *
 * See also `enclave_tf_structs.h` in current PAL.
 */
struct shim_hashed_file {
   const char* uri;
   uint64_t size;
   
   /* Hash of the whole file */
   file_hash_t file_hash;
   
   /* Hashes of individual chunks */
   chunk_hash_t* chunk_hashes;
}

/* 
 * Read `fs.hashed_files` into a lookup table (uthash?)
 */
int init_hashed_files(void);

/*
 * Initializes a hashed file:
 *
 * - verifies the hash of the entire file,
 * - computes chunk hashes,
 * - creates and populates the `shim_hashed_file` object.
 */
int new_hashed_file(const char* uri, PAL_HANDLE handle, struct shim_hashed_file** out_hf);

void delete_hashed_file(struct shim_hashed_file* hf);

/*
 * Reads data from file, verifying chunk hashes. `count` and `offset` should be aligned to `CHUNK_SIZE`.
 */
int read_hashed_file(struct shim_hashed_file* hf, PAL_HANDLE handle, buf, size_t count, file_off_t offset);

Note that this is somewhat inefficient:

  • on file lookup (e.g. stat), it has to read the whole file in order to know its size,
  • on file read, it reads a whole chunk of data, even if the amount requested by user is smaller.

However, AFAIK this matches the current implementation in PAL. We can add caching if it ever becomes an issue.

Implementation plan

  1. Implement the hashed filesystem as a separate feature (that looks at fs.hashed_files). Add support to gramine-manifest tool: note that it will compute the file hashes when building *.manifest, NOT *.manifest.sgx.

  2. Implement loader.entrypoint_sha256 for loading LibOS by PAL.

  3. Remove trusted files from PAL, route sgx.trusted_files etc. to the new code. Add deprecation notices.

Issues

  • The new design assumes sgx.file_check_policy = "allow_all_but_log" is not necessary. Is it?

  • The target implementation is, IMO, simple, but the compatibility code (during deprecation period) will be more complicated, in order to match the current behavior.

  • The chunk hashes are SHA256 truncated to 128 bits, presumably to save space. Maybe we should just use 256-bit chunk hashes? (If the host can find a collision for the chunk hash function, it can replace the hashes after Gramine verifies the file)

@dimakuv
Copy link

dimakuv commented May 11, 2022

The new design assumes sgx.file_check_policy = "allow_all_but_log" is not necessary. Is it?

This is necessary. People use it quite often.

@pwmarcz
Copy link
Contributor Author

pwmarcz commented May 11, 2022

I see. Do you think people use it for writable files as well?

@pwmarcz
Copy link
Contributor Author

pwmarcz commented May 11, 2022

Anyway, I'd suggest to add it as a mount parameter:

fs.mounts = [
   { type = "hashed", path = "/usr/lib", uri = "file:/usr/lib", policy = "allow_all_but_log" },
]

And, if we can afford it, this will treat unknown files as openable but still read-only.

@lejunzhu
Copy link
Contributor

I see. Do you think people use it for writable files as well?

I do use them a lot, especially when people want to quickly evaluate an existing app with Gramine and they can't give us the whole environment. I usually suggest them to use a very permissive manifest, let them run the app successfully in their environment. Then we'll go through the runtime log and come up with a more restrictive manifest for production.

Without the ability to read/write any file, the app will usually stop at the first violation, and after going back and forth several times, new beginners will fell Gramine is difficult to use, while it really isn't.

@dimakuv
Copy link

dimakuv commented May 12, 2022

And, if we can afford it, this will treat unknown files as openable but still read-only.

Yes, I think this is reasonable. People can always just use type = "chroot" which will become a pass-through with this proposal anyway. And when they are ready, they will change the type to the correct type = "hashed".

@lejunzhu
Copy link
Contributor

Yes, I think this is reasonable. People can always just use type = "chroot" which will become a pass-through with this proposal anyway. And when they are ready, they will change the type to the correct type = "hashed".

With normal passthrough, we won't be able to see which file is accessed from the runtime log. Perhaps we can also add policy = "allow_all_but_log" to passthrough mounts as well, to log first (read or write) access to each file?

@dimakuv
Copy link

dimakuv commented May 12, 2022

With normal passthrough, we won't be able to see which file is accessed from the runtime log. Perhaps we can also add policy = "allow_all_but_log" to passthrough mounts as well, to log first (read or write) access to each file?

That's a very good point. Ok, yes, maybe we should add this allow_all_bug_log to passthrough mounts as well.

@dimakuv
Copy link

dimakuv commented Jun 9, 2022

Just a note before I forget it again: after this redesign is done, and the documentation https://gramine.readthedocs.io/en/latest/manifest-syntax.html is updated, we must move the syntax descriptions from under SGX syntax to common syntax.

For example, currently Encrypted Files (https://gramine.readthedocs.io/en/latest/manifest-syntax.html#encrypted-files) are kept under "SGX syntax" section. This is not correct anymore -- they are applicable to any PAL (backend), so they should be under "Common syntax" section.

Same will be with Trusted Files after the redesign.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request P: 2
Projects
None yet
Development

No branches or pull requests

3 participants