-
Notifications
You must be signed in to change notification settings - Fork 276
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Port securesystemslib.hash module #2815
Conversation
369642b
to
7ac9f23
Compare
tuf/api/_payload.py
Outdated
@@ -45,6 +48,31 @@ | |||
T = TypeVar("T", "Root", "Timestamp", "Snapshot", "Targets") | |||
|
|||
|
|||
def _hash(algo: str) -> Any: # noqa: ANN401 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python's hashlib does not make the Hash
type public, hence I'm using Any
and ask the linter for forgiveness.
tests/repository_simulator.py
Outdated
@@ -80,6 +80,8 @@ | |||
|
|||
SPEC_VER = ".".join(SPECIFICATION_VERSION) | |||
|
|||
_DEFAULT_HASH_ALGORITHM = "sha256" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it's okay to make this a local default, and not re-use the one newly added to _payload.py, so that we can keep the latter internal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I'd be fine with just a magic value as well: it's not really even a default, it's just what this repository decided to use
securesystemslib.hash is a small wrapper around hashlib, which serves two main purposes: * provide helper function to hash a file * translate custom hash algorithm name "blake2b-256" to "blake2b" with (digest_size=32). In preparation for the removal of securesystemslib.hash, this patch ports above behavior to tuf and uses the builtin hashlib directly where possible. related secure-systems-lab/securesystemslib#943 Signed-off-by: Lukas Puehringer <[email protected]>
7ac9f23
to
866409f
Compare
tuf/api/_payload.py
Outdated
"""Returns hashed file.""" | ||
f.seek(0) | ||
if sys.version_info >= (3, 11): | ||
digest = hashlib.file_digest(f, lambda: _hash(algo)) # type: ignore[arg-type] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mypy does not allow IO[bytes]
for f here. This may be related to python/typing#659. Ignore seems fair.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks fine to me. Left some comments
tuf/api/_payload.py
Outdated
digest_object = _hash(algorithm) | ||
digest_object.update(data) | ||
else: | ||
digest_object = sslib_hash.digest_fileobject( | ||
data, algorithm | ||
) | ||
except ( | ||
sslib_exceptions.UnsupportedAlgorithmError, | ||
sslib_exceptions.FormatError, | ||
) as e: | ||
digest_object = _file_hash(data, algorithm) | ||
except (ValueError, TypeError) as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the different functionality is a little annoying here (you need to call update on one digest_object but not the other)...
- If the
_hash
and_file_hash
functions directly returned the hexdigest this code would be simpler and the functions would be properly annotated - on the other hand then _hash() would need a data argument (meaning you would have to split the lambda needed by hashlib.file_digest() into another function)
I don't really know what is the simplest... up to you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does 535a189 look better? I don't have very strong feelings about either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No passionate argument here either. I think I like this new version better.
tests/repository_simulator.py
Outdated
@@ -80,6 +80,8 @@ | |||
|
|||
SPEC_VER = ".".join(SPECIFICATION_VERSION) | |||
|
|||
_DEFAULT_HASH_ALGORITHM = "sha256" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I'd be fine with just a magic value as well: it's not really even a default, it's just what this repository decided to use
coveralls.io is down again so I'm not sure... but I'm guessing this is the "blake2b-256" special case -- otherwise we're probably exercising this code. |
Remove the "default" prefix, because it's not a default but rather a fixed value. Signed-off-by: Lukas Puehringer <[email protected]>
Consolidate interface of bytes hash and file hash helpers. Signed-off-by: Lukas Puehringer <[email protected]>
Yep, blake is the culprit. Let me add a test. |
Signed-off-by: Lukas Puehringer <[email protected]>
@jku, want to re-approve after my latest changes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. The only nit I have left is that multiple docstrings still refer to "securesystemslib default hash algorithm".
Default hash sha256 is now defined locally. Signed-off-by: Lukas Puehringer <[email protected]>
Good eyes! I pushed a fix and will merge when ci returns. |
fixes secure-systems-lab#943 * Internal use does not need the additional features (custom blake algorithm name support and file hashing), and was replaced by direct calls to hashlib. * External users were updated to no longer require `securesystemslib.hash` (theupdateframework/python-tuf#2815, in-toto/in-toto#861) Signed-off-by: Lukas Puehringer <[email protected]>
fixes secure-systems-lab#943 * Internally, the features added by the module (i.e. custom blake name support, and file hashing) are not needed. Builtin hashlib is used now directly. * External users port the relevant features into their code base: theupdateframework/python-tuf#2815 in-toto/in-toto#861 Signed-off-by: Lukas Puehringer <[email protected]>
securesystemslib.hash is a small wrapper around hashlib, which serves two main purposes:
In preparation for the removal of securesystemslib.hash, this patch ports above behavior to tuf and uses the builtin hashlib directly where possible.
related secure-systems-lab/securesystemslib#943