| 1 | +* TAP: 16 |
| 2 | +* Title: Snapshot Merkle Trees |
| 3 | +* Version: 0 |
| 4 | +* Last-Modified: 22/01/2021 |
| 5 | +* Author: Marina Moore, Justin Cappos |
| 6 | +* Type: Standardization |
| 7 | +* Status: Draft |
| 8 | +* Content-Type: markdown |
| 9 | +* Created: 14/09/2020 |
| 10 | +* +TUF-Version: |
| 11 | +* +Post-History: |
| 12 | + |
| 13 | +# Abstract |
| 14 | + |
| 15 | +Snapshot metadata for repositories with a high number of targets |
| 16 | +metadata files (through significant use of delegations) can become |
| 17 | +prohibitively large. Due to the need to download the snapshot file on every |
| 18 | +update cycle, very large snapshot metadata files can become a significant |
| 19 | +overhead for TUF clients. |
| 20 | + |
| 21 | +This TAP proposes a method for reducing the size of snapshot metadata a client |
| 22 | +must download without significantly weakening the security properties of TUF. |
| 23 | + |
| 24 | + |
| 25 | +# Motivation |
| 26 | + |
| 27 | +For very large repositories, the snapshot metadata file could get very large. |
| 28 | +This snapshot metadata file must be downloaded on every update cycle, and so |
| 29 | +could significantly impact the metadata overhead. For example, if a repository |
| 30 | +has 50,000,000 targets metadata files, the snapshot metadata will be about |
| 31 | +380,000,000 bytes (https://docs.google.com/spreadsheets/d/18iwWnWvAAZ4In33EWJBgdAWVFE720B_z0eQlB4FpjNc/edit?ts=5ed7d6f4#gid=0). |
| 32 | +For this reason, it is necessary to create a more scalable solution for snapshot |
| 33 | +metadata that does not significantly impact the security properties of TUF. |
| 34 | + |
| 35 | +We designed a new approach to snapshot that improves scalability while |
| 36 | +achieving similar security properties to the existing snapshot metadata. |
| 37 | +Using this new approach, a repository with 50,000,000 targets metadata files |
| 38 | +would only require the user to download about 800 bytes of snapshot metadata (https://docs.google.com/spreadsheets/d/18iwWnWvAAZ4In33EWJBgdAWVFE720B_z0eQlB4FpjNc/edit?ts=5ed7d6f4#gid=924553486). |
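As a rough sanity check on the order of magnitude (a back-of-the-envelope sketch, not a reproduction of the linked spreadsheet): a client downloads roughly one sibling hash per tree level. The balanced binary tree and the 32-byte SHA-256 digest below are assumptions, and encoding overhead such as hex or JSON is ignored.

```python
import math

# Assumptions for this estimate (not taken from the TAP):
#   - the tree is a balanced binary tree
#   - each sibling hash is a raw 32-byte SHA-256 digest
NUM_TARGETS_METADATA_FILES = 50_000_000
HASH_SIZE_BYTES = 32

# One sibling hash is needed per level between the leaf and the root.
tree_depth = math.ceil(math.log2(NUM_TARGETS_METADATA_FILES))
path_size_bytes = tree_depth * HASH_SIZE_BYTES

print(f"levels: {tree_depth}, sibling hashes: {path_size_bytes} bytes")
```

Under these assumptions the path grows logarithmically with the number of targets metadata files, which is where the savings over a linearly growing snapshot file come from.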
| 39 | + |
| 40 | + |
| 41 | +# Rationale |
| 42 | + |
| 43 | +Snapshot metadata provides a consistent view of the repository in order to |
| 44 | +protect against mix-and-match attacks and rollback attacks. In order to provide |
| 45 | +these protections, snapshot metadata is responsible for keeping track of the |
| 46 | +version number of each targets metadata file, ensuring that all targets downloaded are |
| 47 | +from the same snapshot, and ensuring that no targets metadata file decreases its version |
| 48 | +number (except in the case of fast forward attack recovery). Any new solution |
| 49 | +we develop must provide these same protections. |
| 50 | + |
| 51 | +A snapshot Merkle tree manages version information for each targets metadata |
| 52 | +file by including this information in a leaf node for each targets metadata |
| 53 | +file. By using a Merkle tree to store these nodes, |
| 54 | +this proposal can cryptographically verify that different targets are from the |
| 55 | +same snapshot by ensuring that the Merkle tree roots match. Due to the |
| 56 | +properties of secure hash functions, any two leaves of a Merkle tree with the |
| 57 | +same root are from the same tree. |
| 58 | + |
| 59 | +In order to prevent rollback attacks between Merkle trees, this proposal |
| 60 | +introduces third-party auditors. These auditors are responsible for downloading |
| 61 | +all nodes of each Merkle tree to ensure that no version numbers have decreased |
| 62 | +between generated trees. This achieves rollback protection without every client |
| 63 | +having to store the version information for every targets metadata file. |
| 64 | + |
| 65 | +# Specification |
| 66 | + |
| 67 | +This proposal replaces the single snapshot metadata file with a snapshot Merkle |
| 68 | +metadata file for each targets metadata file. The repository generates these |
| 69 | +snapshot Merkle metadata files by building a Merkle tree using all targets |
| 70 | +metadata files and storing the path to each targets metadata file in the |
| 71 | +snapshot Merkle metadata. The root of this Merkle tree is stored in timestamp |
| 72 | +metadata to allow for client verification. The client uses the path stored in |
| 73 | +the snapshot Merkle metadata for a targets metadata file, along |
| 74 | +with the root of the Merkle tree, to ensure that metadata is from the given |
| 75 | +Merkle tree. The details of these files and procedures are described in |
| 76 | +this section. |
| 77 | + |
| 78 | + |
| 79 | + |
| 80 | +## Merkle tree generation |
| 81 | + |
| 82 | +When the repository generates snapshot metadata, instead of putting the version |
| 83 | +information for all targets metadata files into a single file, it instead uses the version |
| 84 | +information to generate a Merkle tree. Each targets metadata file's version information forms |
| 85 | +a leaf of the tree, then these leaves are used to build a Merkle tree. The |
| 86 | +internal nodes of a Merkle tree contain the hash of their child nodes. The exact |
| 87 | +algorithm for generating this Merkle tree (i.e., the order of nodes in the hash, |
| 88 | +how version information is encoded, etc.) is left to the implementer, but this |
| 89 | +algorithm should be documented in a [POUF](https://github.com/theupdateframework/taps/blob/master/tap11.md) |
| 90 | +so that implementations can be |
| 91 | +compatible and correctly verify Merkle tree data. However, all implementations |
| 92 | +should meet the following requirements: |
| 93 | +* Leaf nodes must be unique. A unique identifier of the targets metadata – such as the |
| 94 | +filepath, filename, or the hash of the content – must be included in the leaf data to ensure that no two leaf |
| 95 | +node hashes are the same. |
| 96 | +* The tree must be a Merkle tree. Each internal node must contain a hash that |
| 97 | +includes both child nodes. |
| 98 | + |
| 99 | +Once the Merkle tree is generated, the repository must create a snapshot Merkle |
| 100 | +metadata file for each targets metadata file. This file must contain the leaf contents and |
| 101 | +the path to the root of the Merkle tree. This path must contain the hashes of |
| 102 | +nodes needed to reconstruct the tree during verification, including the leaf's |
| 103 | +sibling (see diagram). |
| 104 | +In addition, the path should contain direction information so that the client |
| 105 | +will know whether each listed node is a left or right sibling when reconstructing the |
| 106 | +tree. |
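The generation step described above can be sketched as follows. This is only one possible algorithm, since the TAP leaves the details to the implementer. The leaf encoding (`name:version`), the `0x00`/`0x01` domain-separation prefixes, the sorted leaf order, the promotion of an unpaired node, and the names `leaf_hash`, `node_hash` and `build_tree` are all assumptions made for illustration.

```python
import hashlib

def leaf_hash(name: str, version: int) -> bytes:
    # Assumed leaf encoding: the file name keeps leaves unique.
    return hashlib.sha256(b"\x00" + f"{name}:{version}".encode()).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # An internal node hashes both of its children, left first.
    return hashlib.sha256(b"\x01" + left + right).digest()

def build_tree(versions: dict[str, int]):
    """Return (root_hash, paths), where paths[name] lists
    (sibling_hash_hex, sibling_side) pairs from the leaf up to the root."""
    if not versions:
        raise ValueError("at least one targets metadata file is required")
    names = sorted(versions)
    # Each entry is (hash of node, names of the leaves below that node).
    level = [(leaf_hash(n, versions[n]), [n]) for n in names]
    paths = {n: [] for n in names}
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            (left, left_names), (right, right_names) = level[i], level[i + 1]
            for n in left_names:
                paths[n].append((right.hex(), "right"))
            for n in right_names:
                paths[n].append((left.hex(), "left"))
            next_level.append((node_hash(left, right), left_names + right_names))
        if len(level) % 2:
            # Assumption: an unpaired node is promoted unchanged.
            next_level.append(level[-1])
        level = next_level
    return level[0][0], paths

root, paths = build_tree({"targets.json": 3, "role-a.json": 1, "role-b.json": 2})
print(root.hex())
print(paths["role-a.json"])
```

Tracking the leaf names under every node keeps the sketch short at the cost of extra memory; a production implementation would more likely derive paths from node indices.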
| 107 | + |
| 108 | +This information will be included in the following metadata format: |
| 109 | +``` |
| 110 | +{ "leaf_contents": {METAFILES}, |
| 111 | + "merkle_path": {INDEX:HASH}, |
| 112 | + "path_directions": {INDEX:DIR} |
| 113 | +} |
| 114 | +``` |
| 115 | + |
| 116 | +Where `METAFILES` is the version information as defined for snapshot metadata, |
| 117 | +`INDEX` provides the ordering of nodes, `HASH` is the hash of the sibling node, |
| 118 | +and `DIR` indicates whether the given node is a left or right sibling. |
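To make the format concrete, the following sketch fills in the three fields for one targets metadata file. The helper name `snapshot_merkle_metadata`, the use of string indices starting at `"0"` for the node nearest the leaf, the `"left"`/`"right"` values for `DIR`, and the placeholder hashes are assumptions for illustration; a real encoding would be fixed by the implementation's POUF.

```python
import json

def snapshot_merkle_metadata(role_file: str, version: int,
                             path: list[tuple[str, str]]) -> dict:
    """Build a snapshot Merkle metadata document.

    `path` holds (sibling_hash_hex, sibling_side) pairs, leaf to root.
    """
    return {
        # METAFILES as in snapshot metadata: file name -> version information.
        "leaf_contents": {role_file: {"version": version}},
        "merkle_path": {str(i): h for i, (h, _) in enumerate(path)},
        "path_directions": {str(i): d for i, (_, d) in enumerate(path)},
    }

# Placeholder hashes, not real digests.
example_path = [("aa" * 32, "right"), ("bb" * 32, "left")]
doc = snapshot_merkle_metadata("role-a.json", 4, example_path)
print(json.dumps(doc, indent=1))
```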
| 119 | + |
| 120 | +In addition, the following optional field will be added to timestamp metadata. |
| 121 | +If this field is included, the client should use snapshot Merkle metadata to |
| 122 | +verify updates instead of the single snapshot metadata file: |
| 123 | + |
| 124 | +``` |
| 125 | +("merkle_root": ROOT_HASH) |
| 126 | +``` |
| 127 | + |
| 128 | +Where `ROOT_HASH` is the hash of the Merkle tree's root node. |
| 129 | + |
| 130 | +Note that snapshot Merkle metadata files do not need to be signed by a snapshot |
| 131 | +key because the path information will be verified based on the Merkle root |
| 132 | +provided in timestamp. Removing these signatures will provide additional space |
| 133 | +savings for clients. |
| 134 | + |
| 135 | +Previous versions of snapshot Merkle metadata files using the current timestamp |
| 136 | +key must remain available to clients and auditors. The repository may store |
| 137 | +snapshot Merkle metadata files using consistent snapshots to facilitate |
| 138 | +access to previous Merkle trees. |
| 139 | + |
| 140 | +## Merkle tree verification |
| 141 | + |
| 142 | +If a client sees the `merkle_root` field in timestamp metadata, they will use |
| 143 | +the snapshot Merkle metadata to check version information. If this field is |
| 144 | +present, the client will download the snapshot Merkle metadata file only for |
| 145 | +the targets metadata the client is attempting to update. The client will verify the |
| 146 | +snapshot Merkle metadata file by reconstructing the Merkle tree and comparing |
| 147 | +the computed root hash to the hash provided in timestamp metadata. If the |
| 148 | +hashes do not match, the snapshot Merkle metadata is invalid. Otherwise, the |
| 149 | +client will use the version information in the verified snapshot Merkle |
| 150 | +metadata to proceed with the update. |
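The client-side check can be sketched as below. It assumes the same hypothetical conventions as a matching generator would use: SHA-256, a `name:version` leaf encoding with `0x00`/`0x01` prefixes, string indices ordered from leaf to root, and `"left"`/`"right"` direction values describing where the sibling sits. None of these are mandated by the TAP, and `verify_snapshot_merkle` is an illustrative name.

```python
import hashlib

def leaf_hash(name: str, version: int) -> bytes:
    return hashlib.sha256(b"\x00" + f"{name}:{version}".encode()).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def verify_snapshot_merkle(meta: dict, trusted_root_hex: str) -> bool:
    """Recompute the root from the leaf and its path, then compare it
    with the `merkle_root` value taken from verified timestamp metadata."""
    [(name, info)] = meta["leaf_contents"].items()
    current = leaf_hash(name, info["version"])
    for index in sorted(meta["merkle_path"], key=int):
        sibling = bytes.fromhex(meta["merkle_path"][index])
        if meta["path_directions"][index] == "left":
            current = node_hash(sibling, current)   # sibling is the left child
        else:
            current = node_hash(current, sibling)   # sibling is the right child
    return current.hex() == trusted_root_hex

# Two-leaf example tree built by hand.
a = leaf_hash("role-a.json", 1)
b = leaf_hash("role-b.json", 2)
root_hex = node_hash(a, b).hex()
meta_a = {
    "leaf_contents": {"role-a.json": {"version": 1}},
    "merkle_path": {"0": b.hex()},
    "path_directions": {"0": "right"},
}
print(verify_snapshot_merkle(meta_a, root_hex))
```

Changing the version in `leaf_contents`, or any hash on the path, changes the recomputed root, so the comparison against the signed `merkle_root` fails.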
| 151 | + |
| 152 | +For additional rollback protection, the client may download previous versions |
| 153 | +of the snapshot Merkle metadata for the given targets metadata file. The client |
| 154 | +should perform this check immediately after verifying the current Merkle tree. After verifying |
| 155 | +these files, the client should compare the version information in the previous |
| 156 | +Merkle trees to the information in the current Merkle tree to ensure that the |
| 157 | +version numbers have never decreased. In order to allow for fast forward attack |
| 158 | +recovery (discussed further in Security Analysis), the client should only |
| 159 | +download previous versions whose root hashes were signed with the same timestamp key. |
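The optional comparison against previous trees might look like the sketch below. It assumes each document in the list has already been verified against the Merkle root of its own timestamp metadata, that all of those roots were signed with the same timestamp key, and that the list is ordered oldest to newest. The function name and the `{"version": N}` layout of `leaf_contents` are illustrative assumptions.

```python
def versions_never_decrease(verified_metadata: list[dict], role_file: str) -> bool:
    """Check one targets metadata file across successive snapshot Merkle
    metadata documents, ordered from oldest to newest."""
    versions = [m["leaf_contents"][role_file]["version"] for m in verified_metadata]
    return all(older <= newer for older, newer in zip(versions, versions[1:]))

history = [
    {"leaf_contents": {"role-a.json": {"version": 1}}},
    {"leaf_contents": {"role-a.json": {"version": 2}}},
    {"leaf_contents": {"role-a.json": {"version": 2}}},
]
print(versions_never_decrease(history, "role-a.json"))
```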
| 160 | + |
| 161 | +## Auditing Merkle trees |
| 162 | + |
| 163 | +In order to ensure the validity of all targets metadata version information in the |
| 164 | +Merkle tree, third-party auditors should validate the entire tree each time it |
| 165 | +is updated. Auditors should download every snapshot Merkle file, verify the |
| 166 | +paths, check the root hash against the hash provided in timestamp metadata, |
| 167 | +and ensure that the version information has not decreased for each leaf. |
| 168 | +Alternatively, the repository may provide auditors with information about the |
| 169 | +contents and ordering of leaf nodes so that the auditors can more efficiently |
| 170 | +verify the entire tree. |
| 171 | + |
| 172 | +An auditor should validate all versions of the Merkle tree signed by the |
| 173 | +current timestamp key. For fast-forward attack recovery, the auditor should |
| 174 | +not check for a rollback attack after the timestamp key |
| 175 | +has been replaced. This means that all new auditors should check the Merkle |
| 176 | +trees signed with the current timestamp keys before attesting to the validity |
| 177 | +of the current Merkle tree. |
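An auditor's rollback check between two consecutive trees can be sketched as follows, assuming the auditor has already verified every path and reduced each tree to a mapping from targets metadata file name to version. The function name, the `timestamp_key_replaced` flag, and the decision not to flag files that disappear from the new tree are assumptions of this sketch rather than requirements of the TAP.

```python
def audit_rollback(previous: dict[str, int], current: dict[str, int],
                   timestamp_key_replaced: bool = False) -> list[str]:
    """Return the names of targets metadata files whose version decreased."""
    if timestamp_key_replaced:
        # Fast forward attack recovery: versions from before the key
        # replacement are not compared against the new tree.
        return []
    return sorted(
        name for name, version in current.items()
        if name in previous and version < previous[name]
    )

old_tree = {"targets.json": 5, "role-a.json": 2}
new_tree = {"targets.json": 6, "role-a.json": 1, "role-b.json": 1}
print(audit_rollback(old_tree, new_tree))
```

An empty result means the auditor found no decrease and can attest to the new Merkle root using whichever mechanism the repository and its clients have agreed on.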
| 178 | + |
| 179 | +## Client interaction with auditors |
| 180 | + |
| 181 | +Clients must ensure that snapshot Merkle trees have been verified by an auditor. |
| 182 | +To do so, implementations may use a few different mechanisms: |
| 183 | + |
| 184 | +* Auditors may provide an additional signature for timestamp metadata that |
| 185 | +indicates that they have verified the contents of the Merkle tree whose root |
| 186 | +is in that timestamp file. Using this signature, clients can check whether a |
| 187 | +particular third party has approved the Merkle tree. To use this mechanism, |
| 188 | +the auditor's key should be included in the root metadata. |
| 189 | + |
| 190 | +* Auditors may host a list of verified Merkle roots for a given repository, |
| 191 | +signed by the auditor's key. Clients may be configured with the auditor's key, |
| 192 | +or get it from the root metadata. |
| 193 | + |
| 194 | +* Clients may use a secure API to verify that a given Merkle root has been |
| 195 | +verified by an auditor. This API should provide compromise resilience similar to |
| 196 | +TUF's root metadata. |
| 197 | + |
| 198 | +## Garbage collection |
| 199 | + |
| 200 | +When a threshold of timestamp keys are revoked and replaced, the repository no |
| 201 | +longer needs to store snapshot Merkle files signed by the previous timestamp |
| 202 | +keys. Replacing the timestamp keys is an opportunity for fast forward attack |
| 203 | +recovery, and so all version information from before the replacement is no |
| 204 | +longer valid. At this point, the repository may garbage collect all snapshot |
| 205 | +Merkle metadata files signed by the replaced timestamp keys. |
| 206 | + |
| 207 | +# Security Analysis |
| 208 | + |
| 209 | +This proposal impacts the snapshot metadata, so this section will discuss the |
| 210 | +attacks that are mitigated by snapshot metadata in TUF. |
| 211 | + |
| 212 | +## Rollback attack |
| 213 | + |
| 214 | +A rollback attack provides a client with an old, previously valid view of |
| 215 | +the repository. Using this attack, an attacker could convince a client to |
| 216 | +install a version from before a security patch was released. |
| 217 | + |
| 218 | +TUF currently protects against rollback attacks by checking the current time |
| 219 | +signed by timestamp and ensuring that no version information provided by |
| 220 | +snapshot has decreased since the last update. With both of these protections, |
| 221 | +a client that has a copy of trusted metadata is secure against a rollback |
| 222 | +attack to any version released |
| 223 | +before the previous update cycle, even if the timestamp and snapshot keys |
| 224 | +are compromised. |
| 225 | + |
| 226 | +Using snapshot Merkle trees, rollback attacks are prevented by both |
| 227 | +client verification and third party auditors. If no keys are compromised, |
| 228 | +the timestamp keys protect against a rollback attack by ensuring a valid |
| 229 | +snapshot Merkle tree. If the timestamp key is compromised, the client |
| 230 | +verification of previous Merkle trees provides rollback protection for the |
| 231 | +individual targets metadata files that are verified. However, if the attacker |
| 232 | +controls the repository and timestamp keys, they may provide malicious previous |
| 233 | +Merkle trees. For full rollback protection, clients rely on third party |
| 234 | +auditors. Third party auditors store the previous version of |
| 235 | +all metadata, and will detect when the version number decreases in a new |
| 236 | +Merkle tree. As long as the client checks for an auditor’s verification, the |
| 237 | +client will not install the rolled-back version of the target. |
| 238 | + |
| 239 | +In summary, without auditors, a client is vulnerable to rollback attacks when an attacker |
| 240 | +controls the timestamp key. With auditors, the client has the same rollback |
| 241 | +protection as the existing TUF specification. |
| 242 | + |
| 243 | +## Fast forward attack |
| 244 | + |
| 245 | +If an attacker is able to compromise the timestamp key, they may arbitrarily |
| 246 | +increase the version number of a target in the snapshot Merkle metadata. If |
| 247 | +they increase it to a sufficiently large number (say the maximum integer value), |
| 248 | +the client will not accept any future version of the target as the version |
| 249 | +number will be below the previous version. |
| 250 | + |
| 251 | +In the current specification, repositories can recover from a fast forward |
| 252 | +attack by replacing a threshold of timestamp keys. If the client sees that |
| 253 | +a threshold of timestamp keys were replaced, it deletes the currently trusted |
| 254 | +version information. |
| 255 | + |
| 256 | +Snapshot Merkle trees also reset snapshot information after a replacement of |
| 257 | +a threshold of timestamp keys in order to recover from fast forward attacks. |
| 258 | +Auditors and clients should not check version information from before a |
| 259 | +timestamp key replacement when verifying the Merkle tree. |
| 260 | + |
| 261 | +Thus, fast forward attack recovery with snapshot Merkle trees is the same |
| 262 | +as in the existing specification, but must be performed by both clients and |
| 263 | +auditors. |
| 264 | + |
| 265 | +## Mix and match attack |
| 266 | + |
| 267 | +In a mix and match attack, an attacker combines images from the current |
| 268 | +snapshot with images from other snapshots, potentially introducing |
| 269 | +vulnerabilities. |
| 270 | + |
| 271 | +Currently, TUF protects against mix and match attacks by providing a snapshot |
| 272 | +metadata file that contains all targets metadata files available on the |
| 273 | +repository. Therefore, a mix and match attack is only possible if an |
| 274 | +attacker is able to compromise the timestamp and snapshot keys to create |
| 275 | +a malicious snapshot metadata file. |
| 276 | + |
| 277 | +A snapshot Merkle tree prevents mix and match attacks by ensuring that all |
| 278 | +targets files installed come from the same snapshot Merkle tree. If all targets |
| 279 | +have version information in the same snapshot Merkle tree, the properties of |
| 280 | +secure hash functions ensure that these versions were part of the same snapshot. |
| 281 | +As in the existing specification, a mix and match attack would be possible |
| 282 | +if an attacker was able to replace the snapshot Merkle tree using compromised |
| 283 | +timestamp keys. |
| 284 | + |
| 285 | +Snapshot Merkle trees provide the same protection against mix and match attacks |
| 286 | +as the existing specification. |
| 287 | + |
| 288 | + |
| 289 | +# Backwards Compatibility |
| 290 | + |
| 291 | +This TAP is not backwards compatible. The following table describes |
| 292 | +compatibility for clients and repositories. |
| 293 | + |
| 294 | +| Parties that support snapshot Merkle trees | Result | |
| 295 | +| ------------------------------------------ | ------ | |
| 296 | +| Client and repository support this TAP | Client and repository are compatible | |
| 297 | +| Client supports this TAP, but repository does not | Client and repository are compatible. The timestamp metadata provided by the repository will never contain the `merkle_root` field, and so the client will not look for snapshot Merkle metadata. | |
| 298 | +| Repository supports this TAP, but client does not | Client and repository are not compatible. If the repository uses snapshot Merkle metadata, the client will not recognise the `merkle_root` field as valid. | |
| 299 | +| Neither client nor repository supports this TAP | Client and repository are compatible | |
| 300 | + |
| 301 | +# Augmented Reference Implementation |
| 302 | + |
| 303 | +https://github.com/theupdateframework/tuf/pull/1113/ |
| 304 | +TODO: auditor implementation |
| 305 | + |
| 306 | +# Copyright |
| 307 | + |
| 308 | +This document has been placed in the public domain. |