[LibOS,common] Add file recovery support for encrypted files #2082

Open
wants to merge 3 commits into
base: master
17 changes: 13 additions & 4 deletions Documentation/devel/encfiles.rst
@@ -508,10 +508,19 @@ Additional details
least one process writes to the file), the file may become corrupted or
inaccessible to one of the processes.

- There is no support for file recovery. If the file was only partially written
to storage when the app abruptly terminated, Gramine will treat this file as
corrupted and will return an ``-EACCES`` error. (This is in contrast to Intel
SGX SDK which supports file recovery.)
- File recovery: Gramine supports recovery of encrypted files, enabled via the
``enable_recovery`` mount parameter in the Gramine manifest. Recovery restores
a file that was only partially written to storage because of a fatal error
(e.g., abrupt app termination) and is therefore in a corrupted state (e.g.,
with mismatching GMACs and/or encryption keys). Similar to the recovery
mechanism of Intel SGX SDK, Gramine uses a "shadow" recovery file and a
``has_pending_write`` flag in the metadata node to manage write transactions.
During a file flush, the cached blocks that are about to change are first
saved to the recovery file. If an encrypted file is opened with the flag set,
the recovery process reverts the partial changes using the recovery file,
restoring the last known good state. The "shadow" recovery file is cleaned up
on file close. Note that enabling this feature can degrade performance because
of the additional writes to the shadow file on each flush.

- There is no key rotation scheme. The application must perform key rotation of
the KDK by itself (by overwriting the ``/dev/attestation/keys/``
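
The write transaction described in the file-recovery bullet above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Gramine's implementation: every name here (`toy_file_t`, `toy_recovery_file_t`, `toy_flush`, `toy_open`, the `crash_after` parameter) is hypothetical, the node size is a toy value, and encryption, GMACs, and real file I/O are left out. It shows only the ordering: save the old nodes, set the flag, overwrite, clear the flag, and roll back on the next open if the flag is still set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NODE_SIZE  8 /* toy value chosen for the sketch */
#define TOY_NODE_COUNT 4

/* Hypothetical stand-in for the main file: a pending-write flag plus a few nodes. */
typedef struct {
    bool has_pending_write;
    uint8_t nodes[TOY_NODE_COUNT][TOY_NODE_SIZE];
} toy_file_t;

/* One saved node: its position in the main file and its previous contents. */
typedef struct {
    size_t node_number;
    uint8_t bytes[TOY_NODE_SIZE];
} toy_recovery_node_t;

/* Hypothetical stand-in for the "shadow" recovery file. */
typedef struct {
    size_t count;
    toy_recovery_node_t entries[TOY_NODE_COUNT];
} toy_recovery_file_t;

static void toy_init(toy_file_t* f, toy_recovery_file_t* r) {
    f->has_pending_write = false;
    memset(f->nodes, 'A', sizeof(f->nodes));
    r->count = 0;
}

/* Fills the `n` nodes listed in `dirty` with the byte `fill`. If `crash_after` is non-negative,
 * the flush stops after that many node writes, simulating abrupt termination. */
static bool toy_flush(toy_file_t* f, toy_recovery_file_t* r, const size_t* dirty, size_t n,
                      uint8_t fill, int crash_after) {
    assert(n <= TOY_NODE_COUNT);

    /* Step 1: save the current contents of every node that is about to change. */
    r->count = 0;
    for (size_t i = 0; i < n; i++) {
        assert(dirty[i] < TOY_NODE_COUNT);
        r->entries[r->count].node_number = dirty[i];
        memcpy(r->entries[r->count].bytes, f->nodes[dirty[i]], TOY_NODE_SIZE);
        r->count++;
    }

    /* Step 2: mark the transaction as open. */
    f->has_pending_write = true;

    /* Step 3: overwrite the nodes, possibly "crashing" part-way through. */
    for (size_t i = 0; i < n; i++) {
        if (crash_after >= 0 && i == (size_t)crash_after)
            return false; /* the flag stays set, the file is half-written */
        memset(f->nodes[dirty[i]], fill, TOY_NODE_SIZE);
    }

    /* Step 4: commit by clearing the flag. */
    f->has_pending_write = false;
    return true;
}

/* On open, rolls back a half-finished flush by restoring the saved nodes. */
static bool toy_open(toy_file_t* f, const toy_recovery_file_t* r) {
    if (!f->has_pending_write)
        return true; /* nothing to recover */
    if (r->count == 0)
        return false; /* recovery needed, but nothing was saved */
    for (size_t i = 0; i < r->count; i++)
        memcpy(f->nodes[r->entries[i].node_number], r->entries[i].bytes, TOY_NODE_SIZE);
    f->has_pending_write = false;
    return true;
}
```

In this model, a flush of nodes 1 and 2 that is interrupted after one node write leaves node 1 changed, node 2 unchanged, and the flag set; a subsequent `toy_open` restores both nodes to their previous contents and clears the flag.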
1,074 changes: 1,073 additions & 1 deletion Documentation/img/encfiles/02_encfiles_representation.svg
1,244 changes: 1,243 additions & 1 deletion Documentation/img/encfiles/04_encfiles_write_less3k.svg
1,242 changes: 1,241 additions & 1 deletion Documentation/img/encfiles/05_encfiles_read_less3k.svg
1,357 changes: 1,356 additions & 1 deletion Documentation/img/encfiles/06_encfiles_write_greater3k.svg
1,321 changes: 1,320 additions & 1 deletion Documentation/img/encfiles/08_encfiles_read_greater3k.svg
8 changes: 7 additions & 1 deletion Documentation/manifest-syntax.rst
@@ -1088,7 +1088,7 @@ Encrypted files
::

fs.mounts = [
{ type = "encrypted", path = "[PATH]", uri = "[URI]", key_name = "[KEY_NAME]" },
{ type = "encrypted", path = "[PATH]", uri = "[URI]", key_name = "[KEY_NAME]", enable_recovery = [true|false] },
]

fs.insecure__keys.[KEY_NAME] = "[32-character hex value]"
@@ -1154,6 +1154,12 @@ Gramine:
in the application is insecure. If you need to derive encryption keys from
such a "doubly-used" key, you must apply a KDF.

The ``enable_recovery`` mount parameter (default: ``false``) determines whether
file recovery is enabled for the mount. Because it is set per mount, recovery
can be enabled or disabled selectively for different mounted files or
directories. Note that enabling this feature can degrade performance, because
each flush additionally writes to a second ("shadow") file that is used for
later recovery.

.. _untrusted-shared-memory:

Untrusted shared memory
209 changes: 195 additions & 14 deletions common/src/protected_files/protected_files.c
@@ -56,7 +56,7 @@ static const char* g_pf_error_list[] = {
[-PF_STATUS_NOT_IMPLEMENTED] = "Functionality not implemented",
[-PF_STATUS_CALLBACK_FAILED] = "Callback failed",
[-PF_STATUS_PATH_TOO_LONG] = "Path is too long",
[-PF_STATUS_RECOVERY_NEEDED] = "File recovery needed",
[-PF_STATUS_RECOVERY_NEEDED] = "File recovery needed but failed",
[-PF_STATUS_FLUSH_ERROR] = "Flush error",
[-PF_STATUS_CRYPTO_ERROR] = "Crypto error",
[-PF_STATUS_CORRUPTED] = "File is corrupted",
@@ -429,36 +429,134 @@ static bool ipf_update_metadata_node(pf_context_t* pf) {
return true;
}

static bool ipf_write_recovery_node(pf_context_t* pf, uint64_t physical_node_number,
const void* buffer, uint64_t offset) {
assert(pf->host_recovery_file_handle);

recovery_node_t recovery_node = { .physical_node_number = physical_node_number };
memcpy(recovery_node.bytes, buffer, sizeof(recovery_node.bytes));

pf_status_t status = g_cb_write(pf->host_recovery_file_handle, (void*)&recovery_node, offset,
sizeof(recovery_node));
if (PF_FAILURE(status)) {
pf->last_error = status;
return false;
}

return true;
}

static bool ipf_dump_dirty_cache_to_recovery_file(pf_context_t* pf) {
assert(pf->host_recovery_file_handle);

pf_status_t status = g_cb_truncate(pf->host_recovery_file_handle, 0);
if (PF_FAILURE(status)) {
pf->last_error = status;
return false;
}

void* node;
uint64_t offset = 0;
for (node = lruc_get_first(pf->cache); node != NULL; node = lruc_get_next(pf->cache)) {
file_node_t* file_node = (file_node_t*)node;
if (!file_node->need_writing)
continue;

if (!ipf_write_recovery_node(pf, file_node->physical_node_number, &file_node->encrypted,
offset))
return false;

offset += sizeof(recovery_node_t);
}

if (!ipf_write_recovery_node(pf, /*physical_node_number=*/1, &pf->root_mht_node.encrypted,
offset))
return false;

offset += sizeof(recovery_node_t);

if (!ipf_write_recovery_node(pf, /*physical_node_number=*/0, &pf->metadata_node, offset))
return false;

return true;
}

static bool ipf_set_pending_write(pf_context_t* pf) {
pf->metadata_node.plaintext_part.has_pending_write = 1;
bool ret = ipf_write_node(pf, /*physical_node_number=*/0, &pf->metadata_node);

/* Unset the `has_pending_write` flag in memory only; the flag on disk is cleared at the end of
* the flush, when the metadata node is written to disk. */
pf->metadata_node.plaintext_part.has_pending_write = 0;

return ret;
}

static bool ipf_clear_pending_write(pf_context_t* pf) {
assert(pf->metadata_node.plaintext_part.has_pending_write == 0);

if (!ipf_write_node(pf, /*physical_node_number=*/0, &pf->metadata_node))
return false;

pf_status_t status = g_cb_fsync(pf->host_file_handle);
if (PF_FAILURE(status)) {
pf->last_error = status;
return false;
}

return true;
}

static bool ipf_internal_flush(pf_context_t* pf) {
if (!pf->need_writing) {
DEBUG_PF("no need to write");
return true;
}

if (pf->metadata_decrypted.file_size > MD_USER_DATA_SIZE && pf->root_mht_node.need_writing) {
if (pf->host_recovery_file_handle) {
if (!ipf_dump_dirty_cache_to_recovery_file(pf)) {
pf->file_status = PF_STATUS_FLUSH_ERROR;
DEBUG_PF("failed to write changes to the recovery file");
goto recoverable_error;
}

if (!ipf_set_pending_write(pf)) {
pf->file_status = PF_STATUS_FLUSH_ERROR;
DEBUG_PF("failed to set the pending write flag");
goto recoverable_error;
}
}

if (!ipf_update_all_data_and_mht_nodes(pf)) {
// this is something that shouldn't happen, can't fix this...
pf->file_status = PF_STATUS_CRYPTO_ERROR;
DEBUG_PF("failed to update data and MHT nodes");
return false;
goto unrecoverable_error;
}
}

if (!ipf_update_metadata_node(pf)) {
// this is something that shouldn't happen, can't fix this...
pf->file_status = PF_STATUS_CRYPTO_ERROR;
DEBUG_PF("failed to update metadata node");
return false;
goto unrecoverable_error;
}

if (!ipf_write_all_changes_to_disk(pf)) {
pf->file_status = PF_STATUS_WRITE_TO_DISK_FAILED;
DEBUG_PF("failed to write changes to disk");
return false;
goto recoverable_error;
}

pf->need_writing = false;
return true;

unrecoverable_error:
if (pf->host_recovery_file_handle)
(void)ipf_clear_pending_write(pf);
recoverable_error:
return false;
}

static file_node_t* ipf_get_mht_node(pf_context_t* pf, uint64_t offset) {
@@ -750,16 +848,77 @@ static bool ipf_init_fields(pf_context_t* pf) {

ipf_init_root_mht(&pf->root_mht_node);

pf->host_file_handle = NULL;
pf->need_writing = false;
pf->file_status = PF_STATUS_UNINITIALIZED;
pf->last_error = PF_STATUS_SUCCESS;
pf->host_file_handle = NULL;
pf->host_recovery_file_handle = NULL;
pf->need_writing = false;
pf->file_status = PF_STATUS_UNINITIALIZED;
pf->last_error = PF_STATUS_SUCCESS;

pf->cache = lruc_create();
return true;
}

static bool ipf_init_existing_file(pf_context_t* pf, const char* path) {
/* Reads each recovery node from the recovery file and applies the embedded pf node
* (recovery_node.bytes) to the corresponding offset (recovery_node.physical_node_number) in the
* main file. After applying all nodes, re-reads the metadata node to ensure that no pending write
* remains. */
static bool ipf_recover(pf_context_t* pf, uint64_t recovery_file_size) {
pf_status_t status;

if (!pf->host_recovery_file_handle) {
DEBUG_PF("file recovery needed but recovery file handle not set; please consider setting "
"'enable_recovery = true' for the mount");
pf->last_error = PF_STATUS_RECOVERY_NEEDED;
return false;
}

if (recovery_file_size == 0 || recovery_file_size % sizeof(recovery_node_t) != 0) {
DEBUG_PF("recovery file size is not right [%lu]", recovery_file_size);
pf->last_error = PF_STATUS_RECOVERY_NEEDED;
return false;
}

uint64_t recovery_nodes_count = recovery_file_size / sizeof(recovery_node_t);

for (uint64_t i = 0; i < recovery_nodes_count; i++) {
recovery_node_t recovery_node;

status = g_cb_read(pf->host_recovery_file_handle, &recovery_node,
i * sizeof(recovery_node_t), sizeof(recovery_node_t));
if (PF_FAILURE(status)) {
pf->last_error = status;
return false;
}

uint64_t untrusted_offset = recovery_node.physical_node_number;
status = g_cb_write(pf->host_file_handle, recovery_node.bytes,
untrusted_offset * sizeof(recovery_node.bytes),
sizeof(recovery_node.bytes));
if (PF_FAILURE(status)) {
pf->last_error = status;
return false;
}
}

status = g_cb_fsync(pf->host_file_handle);
if (PF_FAILURE(status)) {
pf->last_error = status;
return false;
}

/* re-check after recovery */
if (!ipf_read_node(pf, /*physical_node_number=*/0, (uint8_t*)&pf->metadata_node))
return false;

if (pf->metadata_node.plaintext_part.has_pending_write == 1) {
pf->last_error = PF_STATUS_RECOVERY_NEEDED;
return false;
}

return true;
}

static bool ipf_init_existing_file(pf_context_t* pf, const char* path, uint64_t recovery_file_size,
bool try_recover) {
pf_status_t status;

// read metadata node
@@ -778,6 +937,20 @@ static bool ipf_init_existing_file(pf_context_t* pf, const char* path) {
return false;
}

if (try_recover && pf->metadata_node.plaintext_part.has_pending_write == 1) {
DEBUG_PF("%s: starting file recovery", path);

if (!ipf_recover(pf, recovery_file_size)) {
DEBUG_PF("%s: file recovery attempted but failed", path);
return false;
}

DEBUG_PF("%s: file recovery completed", path);
}

/* Ensure the `has_pending_write` flag is cleared in the in-memory copy. */
pf->metadata_node.plaintext_part.has_pending_write = 0;

pf_key_t key;
if (!ipf_recreate_metadata_key(pf, &key))
return false;
@@ -852,7 +1025,9 @@ static void ipf_try_clear_error(pf_context_t* pf) {
}

static pf_context_t* ipf_open(const char* path, pf_file_mode_t mode, bool create, pf_handle_t file,
uint64_t real_size, const pf_key_t* kdk_key, pf_status_t* status) {
uint64_t real_size, const pf_key_t* kdk_key,
pf_handle_t recovery_file_handle, uint64_t recovery_file_size,
bool try_recover, pf_status_t* status) {
*status = PF_STATUS_NO_MEMORY;
pf_context_t* pf = calloc(1, sizeof(*pf));

@@ -892,10 +1067,11 @@ static pf_context_t* ipf_open(const char* path, pf_file_mode_t mode, bool create
pf->host_file_handle = file;
pf->mode = mode;

pf->host_recovery_file_handle = recovery_file_handle;

if (!create) {
if (!ipf_init_existing_file(pf, path))
if (!ipf_init_existing_file(pf, path, recovery_file_size, try_recover))
goto out;

} else {
if (!ipf_init_new_file(pf, path))
goto out;
@@ -1126,12 +1302,17 @@ void pf_set_callbacks(pf_read_f read_f, pf_write_f write_f, pf_fsync_f fsync_f,
}

pf_status_t pf_open(pf_handle_t handle, const char* path, uint64_t underlying_size,
pf_file_mode_t mode, bool create, const pf_key_t* key, pf_context_t** context) {
pf_file_mode_t mode, bool create, const pf_key_t* key,
pf_handle_t recovery_file_handle, uint64_t recovery_file_size,
bool try_recover, pf_context_t** context) {
assert((recovery_file_handle != NULL) || (recovery_file_size == 0));

if (!g_initialized)
return PF_STATUS_UNINITIALIZED;

pf_status_t status;
*context = ipf_open(path, mode, create, handle, underlying_size, key, &status);
*context = ipf_open(path, mode, create, handle, underlying_size, key, recovery_file_handle,
recovery_file_size, try_recover, &status);
return status;
}
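
The replay loop in `ipf_recover` above can be exercised in isolation with plain memory buffers. The following is a minimal sketch, assuming a 4096-byte node size and a recovery node laid out as a node number followed by the node bytes. The names `sketch_recovery_node_t`, `sketch_recovery_size_ok`, and `sketch_replay` are hypothetical and not part of this PR. The bounds check on the node number exists only because the sketch writes into a fixed buffer; it is not taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NODE_SIZE 4096 /* assumed node size for this sketch */

/* Assumed layout of one recovery-file entry: target node number, then the node contents. */
typedef struct {
    uint64_t physical_node_number;
    uint8_t bytes[SKETCH_NODE_SIZE];
} sketch_recovery_node_t;

/* A usable recovery file is non-empty and holds a whole number of entries. */
static bool sketch_recovery_size_ok(uint64_t size) {
    return size != 0 && size % sizeof(sketch_recovery_node_t) == 0;
}

/* Copies every saved node from `recovery` back to its position in `main_file`. */
static bool sketch_replay(const uint8_t* recovery, uint64_t recovery_size, uint8_t* main_file,
                          uint64_t main_size) {
    assert(recovery != NULL && main_file != NULL);

    if (!sketch_recovery_size_ok(recovery_size))
        return false;

    uint64_t count = recovery_size / sizeof(sketch_recovery_node_t);
    for (uint64_t i = 0; i < count; i++) {
        sketch_recovery_node_t node;
        memcpy(&node, recovery + i * sizeof(node), sizeof(node));

        /* Sketch-only safeguard: refuse entries that point outside the buffer. */
        if (node.physical_node_number >= main_size / SKETCH_NODE_SIZE)
            return false;

        memcpy(main_file + node.physical_node_number * SKETCH_NODE_SIZE, node.bytes,
               SKETCH_NODE_SIZE);
    }
    return true;
}
```

The function in the diff additionally syncs the main file after the loop and re-reads the metadata node to confirm that `has_pending_write` is no longer set; the sketch omits both steps.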

28 changes: 17 additions & 11 deletions common/src/protected_files/protected_files.h
@@ -210,21 +210,27 @@ void pf_set_callbacks(pf_read_f read_f, pf_write_f write_f, pf_fsync_f fsync_f,
const char* pf_strerror(int err);

/*!
* \brief Open a protected file.
*
* \param handle Open underlying file handle.
* \param path Path to the file. If NULL and \p create is false, don't check path
* for validity.
* \param underlying_size Underlying file size.
* \param mode Access mode.
* \param create Overwrite file contents if true.
* \param key Wrap key.
* \param[out] context PF context for later calls.
* \brief Open a protected file, optionally checking for and performing file recovery.
*
* \param handle Open underlying file handle.
* \param path Path to the file. If NULL and \p create is false, don't check path
* for validity.
* \param underlying_size Underlying file size.
* \param mode Access mode.
* \param create Overwrite file contents if true.
* \param key Wrap key.
* \param recovery_file_handle (optional) Underlying recovery file handle.
* \param recovery_file_size (optional) Recovery file size. Must be 0 if
* \p recovery_file_handle is not set.
* \param try_recover Whether to check for and perform file recovery if needed.
* \param[out] context PF context for later calls.
*
* \returns PF status.
*/
pf_status_t pf_open(pf_handle_t handle, const char* path, uint64_t underlying_size,
pf_file_mode_t mode, bool create, const pf_key_t* key, pf_context_t** context);
pf_file_mode_t mode, bool create, const pf_key_t* key,
pf_handle_t recovery_file_handle, uint64_t recovery_file_size,
bool try_recover, pf_context_t** context);

/*!
* \brief Close a protected file and commit all changes to disk.