-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: adds support for encryption #109
base: main
Are you sure you want to change the base?
Conversation
only for customer supplied key Signed-off-by: Bruno Calza <[email protected]>
@@ -116,6 +116,9 @@ struct BucketPutArgs { | |||
broadcast_mode: BroadcastMode, | |||
#[command(flatten)] | |||
tx_args: TxArgs, | |||
/// Encrypt/Decrypt objects using 256-bit encryption keys. Formats: RawBase64. | |||
#[arg(long)] | |||
enc_c: Option<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A new option to put
comand is added (--enc-c
). It's through this option that the client provides the encryption key. It must be a 256-bit base64-encoded key
@@ -171,6 +174,9 @@ struct BucketGetArgs { | |||
/// or a specific block height, e.g., "123". | |||
#[arg(long, value_parser = parse_query_height, default_value = "committed")] | |||
height: FvmQueryHeight, | |||
/// Encrypt/Decrypt objects using 256-bit encryption keys. Formats: RawBase64. | |||
#[arg(long)] | |||
enc_c: Option<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A new option to get
comand is added (--enc-c
). It's through this option that the client provides the encryption key to decrypt the object. It must be the same used for encryption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--enc-c
sounds a bit confusing name for an argument to me. is that a convention for encryption CLIs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just using the same name used by minio's client. I don't think there's a convention, aws cli uses --sse-c
and google cloud cli uses --encryption-key
} | ||
|
||
#[pin_project] | ||
pub struct DecryptWriter<W: AsyncWrite + Unpin> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This object is probably very hard to understand. And it took me a while to get it right (mostly because of my lack of understanding of Pin).
In essence, it decorates a writer
that implements AsyncWrite
. The idea is that as the data is written to this writer
we intercept it, decrypt it, and fill the inner writer.
This is the function that does the wrapping logic.
You can pass a Filter
to this object in case of a Range query. So, if you want to apply a range query on encrypted data you use with_filter
, if not you use new
The implementation follows a state machine with the following states:
ReadingHeader
: the state where we need to grab the next 16B of the payload and decode the headerDecrypting
: the state where grab the rest of the package 65 KiB (ciphertext) + 16B (tag), decrypt it and fill thedecrypted
bufferWriting
: the state where we asynchronously write to the inner writer`
The Filter
struct means:
offset
: any byte less than offset should be ignored and not written to the innerlength
: How many bytes should we include from theoffset
? Anything more thanoffset
+length
should be ignoredconsumed
: This is just to indicate how many bytes we have already consumed (or peeked, or considered) from the beginning
} | ||
} | ||
|
||
fn filter_bytes<'a>(&mut self, plaintext: &'a [u8]) -> &'a [u8] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of a complicated method, but in essence, we are trying to figure out how much of the plaintext
chunk (decrypted content chunk) falls within offset
+ length
.
consumed
is used as a pointer (not in a programming language sense), and we do math based if we have reached offset
or if we have passed offset
+ length
. We use remaining
to indicate how many bytes we still need to write. After remaining
reaches 0, we ignore everything
I would review this last, also take a look at the test
.map_err(|err| Error::new(ErrorKind::InvalidData, err.to_string()))?; | ||
|
||
#[allow(clippy::drop_non_drop)] | ||
drop(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is ugly, but I cannot call the method filter_bytes
on this
or self
for some reason. The borrow checker complains. There's something about it I still don't understand, so I came up with this hack.
self.domain.clone() | ||
} | ||
|
||
pub fn unseal(&self, kek: String, object_path: &str) -> anyhow::Result<ObjectKey> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how we decrypt ObjectKey
fn size_decrypted(&self) -> u64; | ||
} | ||
|
||
impl EncryptedObjectExt for Object { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're extending Hoku Object to add some functionality:
- Is this object encrypted?
- Which kind of encryption was used (client or kms) note: kms is not really implemented
- Give me the sealed object key
- Give me the size of the decrypted object
This makes the client code cleaner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice
let iv = sealed_object_key.iv_as_string()?; | ||
let algorithm = sealed_object_key.algorithm(); | ||
|
||
let mut metadata = HashMap::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to store these metadata for decryption
use std::collections::HashMap; | ||
use tokio::io::AsyncRead; | ||
|
||
pub fn encrypt_reader<R: AsyncRead>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the encryption logic
- derives object key
- wraps reader
- encrypts object key
- stores metadata
end: Option<u64>, | ||
} | ||
|
||
impl Range { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This object is responsible for range logic
For an unencrypted object, there's no logic. We pass to Hoku whatever the user sent. For encrypted, we need to transform to a new range because encrpyted content is bigger, and also i need to retrieve the entire package to decrypt even if part of it is not withing the range. Check this test to understand this
} | ||
// Read the first 10 bytes of your downloaded 100 bytes | ||
let mut read_file = tokio::fs::File::open(&obj_path).await?; | ||
let mut contents = vec![0; 10]; | ||
read_file.read(&mut contents).await?; | ||
let _ = read_file.read(&mut contents).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you may not need let _
, if you just unwrap
here. probably okay to unwrap
in the example code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow @brunocalza. This is a serious PR. I've done two passes now. I'm not 100% confident I understand all that is going on, but it looks correct from a high-level. I'm marking approved, so as not to slow things down. But I will likely want to do another pass over this to double-check nonce use etc.
} | ||
|
||
// Encrypt the chunk and store it in the buffer | ||
// TODO: handle error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to do this before initial merge? Or can we wait? It seems reasonable to panic here... but you might just have to wrap this error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better handling it now
fn size_decrypted(&self) -> u64; | ||
} | ||
|
||
impl EncryptedObjectExt for Object { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice
Who do we need to do this for @brunocalza? |
@brunocalza: added the deploy key and secret in the repos. the build is still failing due to some lint issue but the keys should work now: #124 |
Add deploy key for DARE repo
holding this off for a bit until we have more clarity when we want to land this feature |
Summary
This PR adds support for encrypting files with encrypting key provided by the client
Closes #110
@carsonfarmer marking you as a reviewer. But maybe you can review more from a high-level protocol/cryptography/correctness perspective?
@dtbuchholz marking you as a reviewer. But maybe more from a CLI/SDK methods DX perspective?
@avichalp marking you as reviewer. Maybe more from code and Rust perspective? You already gave a lot of input in the cryptographic stuff
cc @andrewxhill @sanderpick @rohhan @oed
Pre-requisites for reviewing
Implementation overview
At a high level this is everything implemented in this PR:
add_from_path
andadd_reader
methodsget
methodReview orientation
sdk/src/machine/bucket.rs
, which is the crux of the SDK methods and we how they were altered to support encryptionAdditional notes
dare
repo. Can someone set those deploys keys for me?