
Commit 49dcf1d

Merge pull request #125 from mnm678/snapshot-merkle
Add TAP introducing snapshot Merkle trees
2 parents bce2f69 + b4f7f80 commit 49dcf1d

3 files changed

+309
-0
lines changed


README.md

+1
@@ -17,6 +17,7 @@
 * [TAP 12: Improving keyid flexibility](tap12.md)
 * [TAP 14: Managing TUF Versions](tap14.md)
 * [TAP 15: Succinct hashed bin delegations](tap15.md)
+* [TAP 16: Snapshot Merkle Trees](tap16.md)

 ## Rejected
merkletap-1.jpg

30.8 KB

tap16.md

+308
@@ -0,0 +1,308 @@
* TAP: 16
* Title: Snapshot Merkle Trees
* Version: 0
* Last-Modified: 22/01/2021
* Author: Marina Moore, Justin Cappos
* Type: Standardization
* Status: Draft
* Content-Type: markdown
* Created: 14/09/2020
* +TUF-Version:
* +Post-History:

# Abstract

Snapshot metadata for repositories with a high number of targets
metadata files (through significant use of delegations) can become
prohibitively large. Because the snapshot file must be downloaded on every
update cycle, very large snapshot metadata files can become a significant
overhead for TUF clients.

This TAP proposes a method for reducing the size of snapshot metadata a client
must download without significantly weakening the security properties of TUF.


# Motivation

For very large repositories, the snapshot metadata file could get very large.
This snapshot metadata file must be downloaded on every update cycle, and so
could significantly impact the metadata overhead. For example, if a repository
has 50,000,000 targets metadata files, the snapshot metadata will be about
380,000,000 bytes (https://docs.google.com/spreadsheets/d/18iwWnWvAAZ4In33EWJBgdAWVFE720B_z0eQlB4FpjNc/edit?ts=5ed7d6f4#gid=0).
For this reason, it is necessary to create a more scalable solution for snapshot
metadata that does not significantly impact the security properties of TUF.

We designed a new approach to snapshot that improves scalability while
achieving similar security properties to the existing snapshot metadata.
Using this new approach, a repository with 50,000,000 targets metadata files
would only require the user to download about 800 bytes of snapshot metadata (https://docs.google.com/spreadsheets/d/18iwWnWvAAZ4In33EWJBgdAWVFE720B_z0eQlB4FpjNc/edit?ts=5ed7d6f4#gid=924553486).
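
The order of magnitude of the 800-byte figure can be checked with a rough estimate. The sketch below assumes a balanced binary Merkle tree and 32-byte SHA-256 digests (both are assumptions, since this TAP leaves the hash algorithm and tree layout to the implementer), and it counts only the sibling hashes on the path, not the leaf contents or encoding overhead. The function names are hypothetical.

```python
HASH_BYTES = 32  # assumption: SHA-256 digests


def path_length(num_leaves: int) -> int:
    """Number of sibling hashes between a leaf and the root of a balanced binary tree."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integer arithmetic only.
    return (num_leaves - 1).bit_length()


def path_bytes(num_leaves: int) -> int:
    """Raw size of the sibling hashes a client downloads for one targets metadata file."""
    return path_length(num_leaves) * HASH_BYTES


print(path_length(50_000_000), path_bytes(50_000_000))  # 26 832
```

Under these assumptions, the per-file download grows with the logarithm of the number of targets metadata files rather than linearly.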


# Rationale

Snapshot metadata provides a consistent view of the repository in order to
protect against mix-and-match attacks and rollback attacks. In order to provide
these protections, snapshot metadata is responsible for keeping track of the
version number of each targets metadata file, ensuring that all targets downloaded are
from the same snapshot, and ensuring that no targets metadata file decreases its version
number (except in the case of fast forward attack recovery). Any new solution
we develop must provide these same protections.

A snapshot Merkle tree manages version information for each targets metadata
file by including this information in a leaf node for each targets metadata
file. By using a Merkle tree to store these nodes,
this proposal can cryptographically verify that different targets are from the
same snapshot by ensuring that the Merkle tree roots match. Due to the
properties of secure hash functions, any two leaves of a Merkle tree with the
same root are from the same tree.

In order to prevent rollback attacks between Merkle trees, this proposal
introduces third-party auditors. These auditors are responsible for downloading
all nodes of each Merkle tree to ensure that no version numbers have decreased
between generated trees. This achieves rollback protection without every client
having to store the version information for every targets metadata file.

# Specification

This proposal replaces the single snapshot metadata file with a snapshot Merkle
metadata file for each targets metadata file. The repository generates these
snapshot Merkle metadata files by building a Merkle tree using all targets
metadata files and storing the path to each targets metadata file in the
snapshot Merkle metadata. The root of this Merkle tree is stored in timestamp
metadata to allow for client verification. The client uses the path stored in
the snapshot Merkle metadata for a targets metadata file, along
with the root of the Merkle tree, to ensure that metadata is from the given
Merkle tree. The details of these files and procedures are described in
this section.

![Diagram of snapshot Merkle tree](merkletap-1.jpg)

## Merkle tree generation

When the repository generates snapshot metadata, instead of putting the version
information for all targets metadata files into a single file, it uses the version
information to generate a Merkle tree. Each targets metadata file's version information forms
a leaf of the tree, and these leaves are then used to build the tree. The
internal nodes of a Merkle tree contain the hash of their child nodes. The exact
algorithm for generating this Merkle tree (i.e. the order of nodes in the hash,
how version information is encoded, etc.) is left to the implementer, but this
algorithm should be documented in a [POUF](https://github.com/theupdateframework/taps/blob/master/tap11.md)
so that implementations can be
compatible and correctly verify Merkle tree data. However, all implementations
should meet the following requirements:
* Leaf nodes must be unique. A unique identifier of the targets metadata (such as the
filepath, filename, or the hash of the content) must be included in the leaf data to ensure that no two leaf
node hashes are the same.
* The tree must be a Merkle tree. Each internal node must contain a hash that
includes both child nodes.
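
As an illustration of these requirements, here is a minimal sketch of one possible tree construction. Everything specific in it is an assumption that a real implementation would document in its POUF: SHA-256 as the hash, canonical JSON as the leaf encoding, the `0x00`/`0x01` domain-separation prefixes, leaves ordered by file name, and an unpaired node being carried up to the next level unchanged. The function names are hypothetical.

```python
import hashlib
import json


def leaf_hash(name: str, version: int) -> bytes:
    # The file name is part of the hashed data, so two leaves never collide.
    data = json.dumps({name: {"version": version}}, sort_keys=True).encode()
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Each internal node commits to both of its children, in order.
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(versions: dict[str, int]) -> list[list[bytes]]:
    """Return every level of the tree, from the leaves up to the root."""
    if not versions:
        raise ValueError("at least one targets metadata file is required")
    level = [leaf_hash(name, ver) for name, ver in sorted(versions.items())]
    levels = [level]
    while len(level) > 1:
        parents = [node_hash(level[i], level[i + 1])
                   for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            parents.append(level[-1])  # unpaired node is carried up unchanged
        level = parents
        levels.append(level)
    return levels


def merkle_root(versions: dict[str, int]) -> str:
    """Hex-encoded root hash, as it might be placed in timestamp metadata."""
    return build_levels(versions)[-1][0].hex()
```

Sorting the leaves by name makes the root independent of the order in which the repository enumerates its targets metadata files.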

Once the Merkle tree is generated, the repository must create a snapshot Merkle
metadata file for each targets metadata file. This file must contain the leaf contents and
the path to the root of the Merkle tree. This path must contain the hashes of
nodes needed to reconstruct the tree during verification, including the leaf's
sibling (see diagram).
In addition, the path should contain direction information so that the client
will know whether each listed node is a left or right sibling when reconstructing the
tree.

This information will be included in the following metadata format:
```
{ "leaf_contents": {METAFILES},
  "merkle_path": {INDEX:HASH},
  "path_directions": {INDEX:DIR}
}
```

Where `METAFILES` is the version information as defined for snapshot metadata,
`INDEX` provides the ordering of nodes, `HASH` is the hash of the sibling node,
and `DIR` indicates whether the given node is a left or right sibling.
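
The following sketch shows how a repository might produce this structure for one targets metadata file. It is not the reference implementation: the hashing scheme (SHA-256, JSON leaf encoding, prefix bytes, unpaired nodes carried up unchanged), the use of string indices counted from the leaf upward, and the `"left"`/`"right"` encoding of `DIR` are all assumptions made for illustration.

```python
import hashlib
import json


def leaf_hash(name: str, version: int) -> bytes:
    data = json.dumps({name: {"version": version}}, sort_keys=True).encode()
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def snapshot_merkle_metadata(versions: dict[str, int], name: str) -> tuple[dict, str]:
    """Return (snapshot Merkle metadata for `name`, hex root hash of the tree)."""
    names = sorted(versions)
    index = names.index(name)
    level = [leaf_hash(n, versions[n]) for n in names]
    path: dict[str, str] = {}
    directions: dict[str, str] = {}
    while len(level) > 1:
        sibling = index ^ 1  # neighbour within the same pair
        if sibling < len(level):  # an unpaired last node has no sibling here
            step = str(len(path))
            path[step] = level[sibling].hex()
            directions[step] = "left" if sibling < index else "right"
        parents = [node_hash(level[i], level[i + 1])
                   for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            parents.append(level[-1])
        level = parents
        index //= 2
    metadata = {
        "leaf_contents": {name: {"version": versions[name]}},
        "merkle_path": path,
        "path_directions": directions,
    }
    return metadata, level[0].hex()
```

For a tree with three leaves, the first two leaves get a two-entry path while the third, which is carried up one level, gets a single entry pointing at the hash of the first pair.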

In addition, the following optional field will be added to timestamp metadata.
If this field is included, the client should use snapshot Merkle metadata to
verify updates instead of the single snapshot metadata file:

```
("merkle_root": ROOT_HASH)
```

Where `ROOT_HASH` is the hash of the Merkle tree's root node.
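
A client could branch on this optional field roughly as follows. This is a sketch only: it assumes the field sits in the `signed` portion of timestamp metadata next to the existing fields, which this TAP does not spell out, and the function names are hypothetical.

```python
from typing import Optional


def merkle_root_from_timestamp(timestamp: dict) -> Optional[str]:
    """Return the Merkle root if the repository published one, else None."""
    return timestamp.get("signed", {}).get("merkle_root")


def snapshot_source(timestamp: dict) -> str:
    """Decide which kind of snapshot information the client should fetch."""
    if merkle_root_from_timestamp(timestamp) is not None:
        return "snapshot-merkle"
    return "snapshot"
```

A repository that does not implement this TAP never sets the field, so a client written this way falls back to the existing snapshot workflow.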

Note that snapshot Merkle metadata files do not need to be signed by a snapshot
key because the path information will be verified based on the Merkle root
provided in timestamp. Removing these signatures will provide additional space
savings for clients.

Previous versions of snapshot Merkle metadata files whose Merkle roots were
signed with the current timestamp key must remain available to clients and
auditors. The repository may store
snapshot Merkle metadata files using consistent snapshots to facilitate
access to previous Merkle trees.

## Merkle tree verification

If a client sees the `merkle_root` field in timestamp metadata, it will use
the snapshot Merkle metadata to check version information. In this case,
the client will download the snapshot Merkle metadata file only for
the targets metadata the client is attempting to update. The client will verify the
snapshot Merkle metadata file by reconstructing the Merkle tree and comparing
the computed root hash to the hash provided in timestamp metadata. If the
hashes do not match, the snapshot Merkle metadata is invalid. Otherwise, the
client will use the version information in the verified snapshot Merkle
metadata to proceed with the update.
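
The verification step above can be sketched as follows. The hashing details are assumptions rather than part of this TAP (SHA-256, JSON-encoded leaf contents, `0x00`/`0x01` prefix bytes, `"left"`/`"right"` as the `DIR` encoding, and path indices that order siblings from the leaf upward); a real client would follow the scheme documented in the repository's POUF.

```python
import hashlib
import json


def leaf_hash(leaf_contents: dict) -> bytes:
    data = json.dumps(leaf_contents, sort_keys=True).encode()
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_snapshot_merkle(metadata: dict, trusted_root: str) -> bool:
    """Recompute the root from the leaf and its path, then compare it with
    the root hash taken from already-verified timestamp metadata."""
    node = leaf_hash(metadata["leaf_contents"])
    path = metadata["merkle_path"]
    directions = metadata["path_directions"]
    for index in sorted(path, key=int):
        sibling = bytes.fromhex(path[index])
        if directions[index] == "left":
            node = node_hash(sibling, node)   # sibling is the left child
        else:
            node = node_hash(node, sibling)   # sibling is the right child
    return node.hex() == trusted_root
```

Only when this check succeeds should the client trust the version number in `leaf_contents`.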

For additional rollback protection, the client may download previous versions
of the snapshot Merkle metadata for the given targets metadata file. The client
should perform this check immediately after verifying the current Merkle tree. After verifying
these files, the client should compare the version information in the previous
Merkle trees to the information in the current Merkle tree to ensure that the
version numbers have never decreased. In order to allow for fast forward attack
recovery (discussed further in Security Analysis), the client should only
download previous versions whose root hashes were signed with the same timestamp key.
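
A minimal sketch of this comparison is shown below. It assumes the caller has already verified each historical snapshot Merkle metadata file against a root signed by the current timestamp key, and that `leaf_contents` follows the `{name: {"version": N}}` shape; the function and exception names are hypothetical.

```python
class RollbackError(Exception):
    """Raised when a targets metadata version decreases between Merkle trees."""


def check_versions_never_decrease(name: str, leaf_history: list[dict]) -> None:
    """`leaf_history` holds verified `leaf_contents` values, oldest first,
    ending with the entry from the current Merkle tree."""
    previous = None
    for leaf_contents in leaf_history:
        version = leaf_contents[name]["version"]
        if previous is not None and version < previous:
            raise RollbackError(
                f"{name}: version went from {previous} down to {version}")
        previous = version
```

Equal versions in consecutive trees are accepted, since a targets metadata file need not change every time the tree is regenerated.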

## Auditing Merkle trees

In order to ensure the validity of all targets metadata version information in the
Merkle tree, third-party auditors should validate the entire tree each time it
is updated. Auditors should download every snapshot Merkle file, verify the
paths, check the root hash against the hash provided in timestamp metadata,
and ensure that the version information has not decreased for each leaf.
Alternatively, the repository may provide auditors with information about the
contents and ordering of leaf nodes so that the auditors can more efficiently
verify the entire tree.
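
The second, more efficient option might look like the sketch below, where the auditor receives the full list of leaves, rebuilds the root, and compares versions with the previous tree. The tree construction is an assumed scheme (SHA-256, JSON leaf encoding, prefix bytes, leaves sorted by name, unpaired nodes carried up), not one mandated by this TAP. Files that disappear from the tree are not flagged, because this TAP does not say how auditors should treat them.

```python
import hashlib
import json


def leaf_hash(name: str, version: int) -> bytes:
    data = json.dumps({name: {"version": version}}, sort_keys=True).encode()
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(versions: dict[str, int]) -> str:
    level = [leaf_hash(n, v) for n, v in sorted(versions.items())]
    while len(level) > 1:
        parents = [node_hash(level[i], level[i + 1])
                   for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            parents.append(level[-1])
        level = parents
    return level[0].hex()


def audit(previous: dict[str, int], current: dict[str, int],
          timestamp_root: str) -> list[str]:
    """Return a list of problems; an empty list means the tree passed."""
    problems = []
    if not current or merkle_root(current) != timestamp_root:
        problems.append("root hash does not match timestamp metadata")
    for name, old in sorted(previous.items()):
        new = current.get(name)
        if new is not None and new < old:
            problems.append(f"{name}: version decreased from {old} to {new}")
    return problems
```

An auditor would attest to the Merkle root only when the returned list is empty.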

An auditor should validate all versions of the Merkle tree signed by the
current timestamp key. For fast-forward attack recovery, the auditor should
not check for a rollback attack across a timestamp key replacement: trees
signed before the replacement are not compared with trees signed after it.
This means that all new auditors should check the Merkle
trees signed with the current timestamp keys before attesting to the validity
of the current Merkle tree.

## Client interaction with auditors

Clients must ensure that snapshot Merkle trees have been verified by an auditor.
To do so, implementations may use a few different mechanisms:

* Auditors may provide an additional signature for timestamp metadata that
indicates that they have verified the contents of the Merkle tree whose root
is in that timestamp file. Using this signature, clients can check whether a
particular third party has approved the Merkle tree. To use this mechanism,
the auditor's key should be included in the root metadata.

* Auditors may host a list of verified Merkle roots for a given repository,
signed by the auditor's key. Clients may be configured with the auditor's key,
or get it from the root metadata.

* Clients may use a secure API to verify that a given Merkle root has been
verified by an auditor. This API should provide compromise resilience similar to
TUF's root metadata.

## Garbage collection

When a threshold of timestamp keys is revoked and replaced, the repository no
longer needs to store snapshot Merkle files signed by the previous timestamp
keys. Replacing the timestamp keys is an opportunity for fast forward attack
recovery, and so all version information from before the replacement is no
longer valid. At this point, the repository may garbage collect all snapshot
Merkle metadata files whose roots were signed by the previous timestamp keys.

# Security Analysis

This proposal impacts the snapshot metadata, so this section will discuss the
attacks that are mitigated by snapshot metadata in TUF.

## Rollback attack

A rollback attack provides a client with an old, previously valid view of
the repository. Using this attack, an attacker could convince a client to
install a version from before a security patch was released.

TUF currently protects against rollback attacks by checking the current time
signed by timestamp and ensuring that no version information provided by
snapshot has decreased since the last update. With both of these protections,
a client that has a copy of trusted metadata is secure against a rollback
attack to any version released
before the previous update cycle, even if the timestamp and snapshot keys
are compromised.

Using snapshot Merkle trees, rollback attacks are prevented by both the
client verification and by third party auditors. If no keys are compromised,
the timestamp keys protect against a rollback attack by ensuring a valid
snapshot Merkle tree. If the timestamp key is compromised, the client
verification of previous Merkle trees provides rollback protection for the
individual targets metadata files that are verified. However, if the attacker
controls the repository and timestamp keys, they may provide malicious previous
Merkle trees. For full rollback protection, clients rely on third party
auditors. Third party auditors store the previous version of
all metadata, and will detect when the version number decreases in a new
Merkle tree. As long as the client checks for an auditor's verification, the
client will not install the rolled-back version of the target.

In summary, without auditors, a client is vulnerable to rollback attacks when an attacker
controls the timestamp key. With auditors, the client has the same rollback
protection as the existing TUF specification.

## Fast forward attack

If an attacker is able to compromise the timestamp key, they may arbitrarily
increase the version number of a target in the snapshot Merkle metadata. If
they increase it to a sufficiently large number (say the maximum integer value),
the client will not accept any future version of the target, because every
legitimate version number will be below the previously seen version.

In the current specification, repositories can recover from a fast forward
attack by replacing a threshold of timestamp keys. If the client sees that
a threshold of timestamp keys were replaced, it deletes the currently trusted
version information.

Snapshot Merkle trees also reset snapshot information after a replacement of
a threshold of timestamp keys in order to recover from fast forward attacks.
Auditors and clients should not check version information from before a
timestamp key replacement when verifying the Merkle tree.

Thus, fast forward attack recovery with snapshot Merkle trees is the same
as in the existing specification, but must be performed by both clients and
auditors.
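
The recovery behaviour described in this section can be illustrated with a small sketch. It reads "a threshold of timestamp keys were replaced" as "at least `threshold` of the old key IDs are absent from the new set", which is one possible interpretation rather than a definition given here; `MAX_VERSION` and the class and function names are hypothetical.

```python
MAX_VERSION = 2**63 - 1  # stand-in for "the maximum integer value"


def threshold_replaced(old_keyids: set, new_keyids: set, threshold: int) -> bool:
    """True if at least `threshold` of the old timestamp keys were removed."""
    return len(old_keyids - new_keyids) >= threshold


class VersionTracker:
    """Tracks the highest version seen per targets metadata file."""

    def __init__(self) -> None:
        self.trusted: dict[str, int] = {}

    def accept(self, name: str, version: int) -> bool:
        if version < self.trusted.get(name, 0):
            return False  # would be a rollback relative to trusted state
        self.trusted[name] = version
        return True

    def on_timestamp_keys_changed(self, old_keyids: set, new_keyids: set,
                                  threshold: int) -> None:
        # Fast forward recovery: forget version information that was
        # vouched for by the replaced timestamp keys.
        if threshold_replaced(old_keyids, new_keyids, threshold):
            self.trusted.clear()
```

After an attacker publishes `MAX_VERSION` for a file, `accept` rejects every legitimate version until the timestamp keys are replaced and the tracker is reset. An auditor would apply the same reset to its stored trees.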

## Mix and match attack

In a mix and match attack, an attacker combines images from the current
snapshot with images from other snapshots, potentially introducing
vulnerabilities.

Currently, TUF protects against mix and match attacks by providing a snapshot
metadata file that contains all targets metadata files available on the
repository. Therefore, a mix and match attack is only possible if an
attacker is able to compromise the timestamp and snapshot keys to create
a malicious snapshot metadata file.

A snapshot Merkle tree prevents mix and match attacks by ensuring that all
targets files installed come from the same snapshot Merkle tree. If all targets
have version information in the same snapshot Merkle tree, the properties of
secure hash functions ensure that these versions were part of the same snapshot.
As in the existing specification, a mix and match attack would be possible
if an attacker were able to replace the snapshot Merkle tree using compromised
timestamp keys.

Snapshot Merkle trees provide the same protection against mix and match attacks
as the existing specification.


# Backwards Compatibility

This TAP is not backwards compatible. The following table describes
compatibility for clients and repositories.

| Parties that support snapshot Merkle trees | Result |
| ------------------------------------------ | ------ |
| Client and repository support this TAP | Client and repository are compatible |
| Client supports this TAP, but repository does not | Client and repository are compatible. The timestamp metadata provided by the repository will never contain the `merkle_root` field, and so the client will not look for snapshot Merkle metadata. |
| Repository supports this TAP, but client does not | Client and repository are not compatible. If the repository uses snapshot Merkle metadata, the client will not recognise the `merkle_root` field as valid. |
| Neither client nor repository supports this TAP | Client and repository are compatible |

# Augmented Reference Implementation

https://github.com/theupdateframework/tuf/pull/1113/

TODO: auditor implementation

# Copyright

This document has been placed in the public domain.
