Add SNP-style virtual attestations, restoring code update tests #6770
@reqs.description("Test quotes")
@reqs.supports_methods("/node/quotes/self", "/node/quotes")
def test_quote(network, args):
This test is deleted. It's almost exactly the same as `verify_quotes` in `code_update`. The only other thing it does is check that the `/node/quotes/self` calls match entries from the single `/node/quotes` list. That's now added to `verify_quotes`.
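A minimal sketch of that extra check, assuming each quote has already been parsed into a plain dict. The `node_id` and `raw` field names and the `missing_self_quotes` helper are hypothetical, chosen for illustration rather than taken from `verify_quotes`:

```python
def missing_self_quotes(self_quotes, all_quotes):
    """Return the node IDs whose self-reported quote is absent from the full list.

    self_quotes: one dict per node, as if returned by /node/quotes/self
    all_quotes:  list of dicts, as if returned by a single /node/quotes call
    """
    # A self quote "matches" only if an identical entry exists in the list.
    return [q["node_id"] for q in self_quotes if q not in all_quotes]


# Hypothetical data: two nodes listed, one self quote that disagrees.
listed = [
    {"node_id": "n0", "raw": "aa"},
    {"node_id": "n1", "raw": "bb"},
]
stale = {"node_id": "n1", "raw": "cc"}
print(missing_self_quotes([listed[0], stale], listed))
```

An empty result means every self quote was found in the network-wide list.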
enclave_type, enclave_platform, oe_binary_dir, package, library_dir="."
):
def get_measurement(enclave_type, enclave_platform, package, library_dir="."):
lib_path = infra.path.build_lib_path(
package, enclave_type, enclave_platform, library_dir
)

if enclave_platform == "sgx":
res = subprocess.run(
[os.path.join(oe_binary_dir, "oesign"), "dump", "-e", lib_path],
capture_output=True,
check=True,
)
lines = [
line
for line in res.stdout.decode().split(os.linesep)
if line.startswith("mrenclave=")
]
if enclave_platform == "virtual":
hash = sha256(open(lib_path, "rb").read())
return hash.hexdigest()

return lines[0].split("=")[1]
else:
# Virtual and SNP
return hashlib.sha256(lib_path.encode()).hexdigest()
raise ValueError(f"Cannot get measurement on {enclave_platform}")


def get_host_data_and_security_policy(enclave_platform):
DEFAULT_VIRTUAL_SECURITY_POLICY = "Default CCF virtual security policy"
if enclave_platform == "snp":
security_policy = snp.get_container_group_security_policy()
elif enclave_platform == "virtual":
security_policy = DEFAULT_VIRTUAL_SECURITY_POLICY
else:
raise ValueError(f"Cannot get security policy on {enclave_platform}")
host_data = sha256(security_policy.encode()).hexdigest()
return host_data, security_policy
This moves away from any ability to generate/check SGX attestations, and updates terminology to "measurement" rather than "code_id". We get host data and security policies on the client for SNP (because we're actually the same box...), but don't yet get the measurement. I think we could now do that from Python too, and get closer to an SNP code update story, but it's outside the scope of this PR.
# Measurements
test_measurements_tables(network, args)
test_add_node_with_bad_code(network, args)

# Host data/security policy
test_host_data_tables(network, args)
test_add_node_with_bad_host_data(network, args)
test_add_node_with_stubbed_security_policy(network, args)

if snp.IS_SNP:
test_snp_measurements_tables(network, args)
test_add_node_with_no_uvm_endorsements(network, args)
test_host_data_table(network, args)
test_add_node_without_security_policy(network, args)
test_add_node_remove_trusted_security_policy(network, args)
test_start_node_with_mismatched_host_data(network, args)
test_add_node_with_bad_host_data(network, args)
test_add_node_with_bad_code(network, args)
# NB: Assumes the current nodes are still using args.package, so must run before test_proposal_invalidation
test_add_node_with_bad_security_policy(network, args)

# Endorsements
if snp.IS_SNP:
test_endorsements_tables(network, args)
test_add_node_with_no_uvm_endorsements(network, args)

# NB: Assumes the current nodes are still using args.package, so must run before test_update_all_nodes
Lots of renamed tests, but hopefully the correspondence to the old ones is relatively clear. We now have fewer SNP-only tests, because the rest now have a sane Virtual implementation.
test_add_node_without_security_policy(network, args)
test_add_node_remove_trusted_security_policy(network, args)
test_start_node_with_mismatched_host_data(network, args)
test_add_node_with_bad_host_data(network, args)
test_add_node_with_bad_code(network, args)
# NB: Assumes the current nodes are still using args.package, so must run before test_proposal_invalidation
test_add_node_with_bad_security_policy(network, args)
I'm humming and hawing about supporting these, but will likely look at it if the PR lives for a while. We don't currently have a way to dynamically set a different security policy for Virtual, and doing so would require some kind of plumbing (probably an env var or a file it reads? Either way it needs to dive through several layers of our infra). The SNP model has all of this plumbing (with `snp_`-specific names), but ignores it for these tests and just fiddles with some files in the security-context directory. Should we duplicate that for virtual? I vote no.
def add_measurement(self, remote_node, platform, measurement):
if platform == "sgx":
return self.add_new_code(remote_node, measurement)
elif platform == "virtual":
return self.add_virtual_measurement(remote_node, measurement)
elif platform == "snp":
return self.add_snp_measurement(remote_node, measurement)
else:
raise ValueError(f"Unsupported platform {platform}")
These wrapper functions could live in `code_update.py`, but are plausibly useful for other tests. In the platform-agnostic test code, you want to "update a measurement", but under the hood that has to call a specific governance function to write to a platform-specific table. Other options are available, and this is even more copy-paste code, but I think it's "fine".
if new_host_data is not None:
old_host_data, old_security_policy = (
infra.utils.get_host_data_and_security_policy(args.enclave_platform)
)

if old_host_data != new_host_data:
network.consortium.remove_host_data(
primary, args.enclave_platform, old_host_data
)
This is hopefully predicting what will be necessary if/when a version-transition includes a host-data update. Currently it doesn't, on either SNP or Virtual. But if it does, we will (probably?) want to cycle it like we cycle measurements (I hope?).
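As a rough illustration of what "cycling" could look like, here is a sketch that models the trusted host data table as an in-memory set. The `cycle_host_data` helper is hypothetical, and the add-new-then-remove-old ordering is an assumption; the diff above only shows the removal step:

```python
def cycle_host_data(trusted, old_host_data, new_host_data):
    """Return the trusted set after a (hypothetical) host data transition.

    If no new value is supplied, or it equals the old one, nothing is removed.
    """
    updated = set(trusted)
    if new_host_data is not None:
        # Assumption: the new value is trusted before the old one is dropped,
        # so there is never a moment with no trusted host data.
        updated.add(new_host_data)
        if old_host_data != new_host_data:
            updated.discard(old_host_data)
    return updated


print(cycle_host_data({"old"}, "old", "new"))
```

With an unchanged value the table is left alone, which mirrors the `old_host_data != new_host_data` guard in the diff.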
});
virtual_policy["hostData"] = virtual_host_data;

response_body["virtual"] = virtual_policy;
This is one of the first big additions to the new governance API since we dropped the TypeSpec, meaning it's currently undocumented. We could try manually patching the generated OpenAPI, but I think that's rubbish! We could restore auto-generated descriptions for these endpoints (currently all hidden), but we deliberately don't rely on the magic auto-serialisers for this, so it wouldn't really help. Gah.
We previously had a vestigial virtual attestation reusing some of the terminology and fields of SGX attestations. This didn't provide any distinctions between nodes or apply checks during node joining, so wasn't usefully testing code upgrade flows.
This has been replaced with a new scheme based on SNP attestations. A virtual node now has a measurement (the sha256 of the enclave library, calculated by the host at startup) and a host data/security policy value (currently a single default string for security policy, with host data the sha256 of that as it is for SNP). This introduces many duplicated tables, and associated duplicated governance, because we don't want to risk collisions across platforms.
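A minimal Python sketch of how those two values are derived, following the test-side helpers in this PR. The function names `virtual_measurement` and `virtual_host_data` are hypothetical; only the hashing steps and the default policy string come from the diff:

```python
import hashlib

# Default policy string used for virtual nodes in the test infra (from the diff).
DEFAULT_VIRTUAL_SECURITY_POLICY = "Default CCF virtual security policy"


def virtual_measurement(lib_path):
    """sha256 of the enclave library's bytes, as a hex string."""
    with open(lib_path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def virtual_host_data(security_policy=DEFAULT_VIRTUAL_SECURITY_POLICY):
    """sha256 of the security policy text, as a hex string (as for SNP)."""
    return hashlib.sha256(security_policy.encode()).hexdigest()
```

Both values are 64-character hex digests, so a change to either the library contents or the policy text yields a different value to register in the corresponding table.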
The beneficial outcome is that we can now test code update flows (i.e. change the "permitted nodes" of a service at run-time, confirm that old nodes can no longer join) close to how they run on SNP. We can also test some of the effects of fiddling with these tables (e.g. omitting security policies, setting invalid host data) outside of SNP, though there's the caveat that these are all touching separate governance actions and tables.
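The shape of such a code update test can be sketched with a toy model, where the trusted tables are plain sets. This is a simplification for illustration only: `can_join` is a hypothetical helper, not CCF's join logic, and it ignores endorsements entirely:

```python
def can_join(measurement, host_data, trusted_measurements, trusted_host_data):
    """Toy join check: both attested values must be present in their tables."""
    return measurement in trusted_measurements and host_data in trusted_host_data


# Hypothetical values standing in for real digests.
trusted_measurements = {"m_old"}
trusted_host_data = {"h_default"}

# Code update: trust the new measurement, then retire the old one.
trusted_measurements.add("m_new")
trusted_measurements.discard("m_old")

print(can_join("m_new", "h_default", trusted_measurements, trusted_host_data))
print(can_join("m_old", "h_default", trusted_measurements, trusted_host_data))
```

After the update, a node presenting the old measurement is rejected while a node presenting the new one is accepted, which is the property the code update tests assert against a real service.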
There are no endorsements for virtual attestations, to avoid creating/maintaining any fake hardware keys, but this means there are still join paths on SNP that virtual doesn't test. I've tried to avoid too many renames/refactors of existing fields, but the existing PAL is extremely porous and inconsistent, so some of the names/concepts are unclear (i.e. "host_data" is an SNP concept, "security_policy" is what we/ACI put there, but the names aren't consistently split and the digesting/decoding is haphazard).
I'll add some comments describing the changes I remember, when it's not last-thing-on-a-Friday.