Commit 6b08237

adityasaky, renatav, and Ericson2314 authored
Add TAP supporting content addressable systems (#156)
Signed-off-by: Aditya Sirish <[email protected]>
Co-authored-by: Renata Vaderna <[email protected]>
Co-authored-by: John Ericson <[email protected]>
1 parent c09b344 commit 6b08237

3 files changed, +613 -0 lines changed


POUFs/TAF-POUF/pouf2.md

+140
@@ -0,0 +1,140 @@
* POUF:
* Title: The Archive Framework
* Version: 1
* Last-Modified:
* Author: Renata Vaderna
* Status: Draft
* TUF Version Implemented:
* Implementation Version(s) Covered:
* Content-Type: text/markdown
* Created:

# Abstract

This POUF describes the protocol, operations, usage, and formats for an
implementation of TUF designed to distribute Git repositories. This instance is
known as The Archive Framework, or TAF, and leverages TAP-19, which adds support
in TUF for content addressable artifacts and their native hashing routines.

# Protocol

This POUF currently uses a subset of the JSON object format, with floating-point
numbers omitted. When calculating the digest of an object, we use the
["canonical JSON" subdialect](http://wiki.laptop.org/go/Canonical_JSON) as
implemented in securesystemslib. As TAF uses the TUF reference implementation
for metadata generation, it implicitly depends on the protocols of the
reference implementation's POUF.
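
As a rough illustration of the digest calculation described above, the
following sketch encodes an object as canonical JSON and hashes the result. It
is a minimal approximation written for this document, not the securesystemslib
implementation. The helper names `to_canonical_json` and `canonical_digest`
and the choice of SHA-256 are assumptions.

```python
import hashlib


def to_canonical_json(obj):
    """Encode obj following canonical JSON rules (hypothetical helper)."""
    # Strings: only the double quote and the backslash are escaped.
    if isinstance(obj, str):
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    # Booleans are checked before integers because bool subclasses int.
    if obj is True:
        return "true"
    if obj is False:
        return "false"
    if obj is None:
        return "null"
    if isinstance(obj, int):
        return str(obj)
    if isinstance(obj, (list, tuple)):
        return "[" + ",".join(to_canonical_json(item) for item in obj) + "]"
    if isinstance(obj, dict):
        if not all(isinstance(key, str) for key in obj):
            raise TypeError("canonical JSON object keys must be strings")
        # Keys are emitted in sorted order with no insignificant whitespace.
        members = (
            to_canonical_json(key) + ":" + to_canonical_json(obj[key])
            for key in sorted(obj)
        )
        return "{" + ",".join(members) + "}"
    # Floating-point numbers (and any other type) are rejected.
    raise TypeError(f"unsupported type: {type(obj).__name__}")


def canonical_digest(obj):
    """Return a SHA-256 hex digest of the canonical encoding (assumed algorithm)."""
    return hashlib.sha256(to_canonical_json(obj).encode("utf-8")).hexdigest()
```

Because keys are sorted and whitespace is fixed, two objects with the same
content produce the same digest regardless of the order in which their keys
were inserted.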

Metadata and target files are stored in a Git repository referred to as an
authentication repository. An authentication repository contains the
information needed to securely clone and update other Git repositories
(referred to as target repositories), including their URLs and additional
custom data that can be used by TAF's implementers. This information is
specified in two special target files, `repositories.json` and `mirrors.json`.
They are regular TUF target files (their lengths and hashes are stored in
`targets.json`, and they are signed by the top-level targets role), but they
are of special importance to TAF.

TAF focuses on protecting Git repositories from unauthorized pushes. It is
designed to record valid commits for target repositories in the authentication
repository. If an attacker manages to compromise a user who has write access to
protected repositories and creates new commits, TAF will detect this by
comparing these new commits to a list of valid commits specified in the
authentication repository. In order to register these new commits, the attacker
has to modify TUF metadata files as well. Unless they also gain access to the
`targets` or `root` keys, they cannot do this without the attempt being
detected by TAF. TAF's users are encouraged to store signing keys for the
`targets` and `root` roles offline on hardware tokens to ensure their safety.
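
The comparison described above can be sketched as follows. This is an
illustrative outline only: the function name `find_unauthorized_commits` and
the representation of commits as plain lists of identifiers are assumptions,
not TAF's actual data model.

```python
def find_unauthorized_commits(authenticated_commits, repository_commits):
    """Return commits found in a target repository that are not listed as valid.

    authenticated_commits: commit identifiers recorded in the authentication
        repository (assumed to be already verified through TUF metadata).
    repository_commits: commit identifiers actually present on the branch.
    """
    allowed = set(authenticated_commits)
    # Preserve the repository's order so the first offending commit is easy to spot.
    return [commit for commit in repository_commits if commit not in allowed]


# Placeholder identifiers; real values would be full Git commit IDs.
recorded = ["c1", "c2", "c3"]
present = ["c1", "c2", "c3", "c4"]
print(find_unauthorized_commits(recorded, present))
```

A non-empty result means the target repository contains commits that were never
registered through signed metadata, which is the situation the paragraph above
describes.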

In essence, while TAF ensures the validity of commits in target repositories,
it makes no claims about the integrity of their contents. TAF relies on Git's
default artifact integrity protection. Git, however, still primarily relies on
SHA-1, even though this hash function has been shown to be vulnerable to
collision attacks. If an attacker manages to gain access to the original target
repositories, they could potentially exploit the SHA-1 weakness. On the other
hand, authentication repositories store lists of valid URLs of target
repositories, ensuring that users are protected from _mirrors_ of legitimate
repositories presenting a colliding artifact in place of the original artifact.
These URLs are defined manually and can only be modified by someone who holds a
`targets` key. If a target repository is not owned by the same person or
organization that is setting up an authentication repository, it is a hard
requirement to contact the owner directly, so that they can confirm the
authenticity of the repository in question.

In order to take advantage of TAF's validations, a client can download an
authentication repository and all of the referenced repositories by running
TAF's updater and specifying the authentication repository's URL. Repositories
are cloned and updated using Git. The updater runs validation before
permanently storing anything on the client's machine:
* An authentication repository is cloned as a bare repository inside the
user's temp directory (so no worktree is checked out).
* This repository is then validated. Metadata and target files are read using
`git show`.
* If validation is successful, the repository is cloned, or new changes are
pulled, once again using Git.
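
The first two steps above can be sketched with plain Git commands driven from
Python. This is a hedged outline rather than TAF's updater: the helper names
and the temporary directory prefix are assumptions, and the validation logic
itself is left out.

```python
import subprocess
import tempfile


def bare_clone_command(url, destination):
    """Build the command for a bare clone (no worktree is checked out)."""
    return ["git", "clone", "--bare", url, destination]


def show_file_command(git_dir, revision, path):
    """Build the command that prints a file as it exists at a given revision."""
    return ["git", "--git-dir", git_dir, "show", f"{revision}:{path}"]


def clone_for_validation(url):
    """Clone the authentication repository into a fresh temporary directory."""
    destination = tempfile.mkdtemp(prefix="auth-repo-")  # hypothetical prefix
    subprocess.check_call(bare_clone_command(url, destination))
    return destination


def read_file_at_revision(git_dir, revision, path):
    """Return the raw bytes of a file at a revision, without a worktree."""
    return subprocess.check_output(show_file_command(git_dir, revision, path))
```

A caller would run `clone_for_validation` with the authentication repository's
URL, read each metadata file with `read_file_at_revision`, and only clone or
pull into the final location once every revision has passed validation.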

# Operations

WIP

# Usage

To use the system, it is necessary to set up an authentication repository:
initialize a TUF repository (generate TUF metadata files and sign them using
offline keys, loaded either from the filesystem or from YubiKeys) and commit
the changes. TAF provides a command line interface that can be used to create
and update authentication repositories (add new target files and signing keys,
extend expiration dates of metadata files, generate keystore files, and set up
YubiKeys). For example, a new authentication repository can be created using
the `taf repo create repo_path` command. Detailed explanations and instructions
are available in the
[official documentation](https://github.com/openlawlibrary/taf/blob/master/docs/quickstart.md).

TAF's main purpose is to provide archival authentication, which means that it
is not only the current state of the repositories that is validated; the
correctness of all past versions needs to be checked as well. A state is valid
if the authentication repository at a certain revision (commit) is a valid TUF
repository and the target repositories are valid according to the data defined
in the authentication repository. So, if `targets.json` is updated,
`snapshot.json` and `timestamp.json` need to be updated in the same commit or
the repository will not be valid. This ensures that a client cannot check out
an invalid version of the repository. Moreover, validation of the
authentication repository also ensures that versions of metadata files in older
revisions are lower than in newer revisions. Once the metadata and target
files are validated, they are used to check the correctness of the referenced
Git repositories, that is, whether the actual commits in those repositories
correspond to the commits listed in the authentication repository.
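
The two revision-level rules above (metadata versions must not go backwards,
and a change to `targets.json` must be accompanied by new `snapshot.json` and
`timestamp.json` in the same commit) can be sketched as below. The input shape,
a list of per-revision role-to-version mappings ordered oldest first, and the
function name are assumptions made for illustration. The sketch treats an
unchanged version between revisions as acceptable.

```python
def validate_revision_history(revisions):
    """Check version ordering across authentication repository revisions.

    revisions: list of dicts such as {"targets": 1, "snapshot": 1,
        "timestamp": 1}, ordered from the oldest commit to the newest.
    Returns a list of human-readable problems; an empty list means no
    problem was found by these two checks.
    """
    problems = []
    for index in range(1, len(revisions)):
        older, newer = revisions[index - 1], revisions[index]
        # Rule 1: a role's version must never decrease in a later revision.
        for role, old_version in older.items():
            if newer.get(role, old_version) < old_version:
                problems.append(f"revision {index}: {role} version decreased")
        # Rule 2: if targets changed, snapshot and timestamp must change too.
        if newer.get("targets") != older.get("targets"):
            for role in ("snapshot", "timestamp"):
                if newer.get(role) == older.get(role):
                    problems.append(
                        f"revision {index}: targets changed but {role} did not"
                    )
    return problems
```

For example, a history in which `targets` moves from version 1 to 2 while
`snapshot` and `timestamp` stay at 1 yields two problems, one for each stale
role.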

To sum up, TAF extends the reference implementation by storing the metadata and
target files in a Git repository and making sure that all changes are committed
after every valid update.

# Formats

The metadata generated to support Git repositories is largely the same as
that described in the TUF specification and POUF-1. The key difference is in
the enumeration of Targets.

Firstly, while TUF identifies individual targets by their location relative to
the mirror's base URL, this POUF uses a URI to identify the specific Git
namespace in a target repository. This URI is also used to locate the
repository itself. The URI must use `git` as the scheme, clearly indicating
that the entry pertains to a Git object.

In a repository's metadata, each Targets entry is expected to map to a
specific Git branch or tag. For each entry, whether a branch or a tag, a hash
value must be recorded that clearly identifies the expected commit at the
corresponding Git reference. As per TAP-19, which adds support for content
addressable systems and their native hashing routines, the hash is not
calculated afresh for a particular Git reference. Instead, the identifier of
the commit at the tip of the branch, or of the commit that the tag points to,
must be used.
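
To make the entry rules above concrete, the sketch below checks a single
Targets entry. The entry layout, the `git-commit` hash key, and the example
URI are hypothetical and chosen only for illustration; this POUF and TAP-19
remain the authority on the real format. The one detail taken as given is that
a full Git commit identifier is 40 hexadecimal characters for SHA-1
repositories and 64 for SHA-256 repositories.

```python
import re
from urllib.parse import urlsplit

# Full Git object IDs: 40 hex characters (SHA-1) or 64 (SHA-256).
COMMIT_ID = re.compile(r"^(?:[0-9a-f]{40}|[0-9a-f]{64})$")


def check_targets_entry(uri, entry, hash_key="git-commit"):
    """Return a list of problems with one Targets entry (hypothetical layout)."""
    problems = []
    # The URI must use the git scheme.
    if urlsplit(uri).scheme != "git":
        problems.append("URI scheme must be 'git'")
    # The recorded hash is the commit identifier itself, not a recomputed digest.
    commit_id = entry.get("hashes", {}).get(hash_key, "")
    if not COMMIT_ID.match(commit_id):
        problems.append("hash must be a full Git commit identifier")
    return problems


def matches_reference(entry, actual_commit_id, hash_key="git-commit"):
    """Compare the recorded commit with the commit a branch or tag resolves to."""
    return entry.get("hashes", {}).get(hash_key) == actual_commit_id
```

A client would resolve the branch or tag named by the URI in the target
repository and pass the resulting commit identifier to `matches_reference`.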

Apart from the hashes and custom field, a Targets entry is also expected to
record the length of the artifact. This length is vital in avoiding endless
data attacks. However, there is no clear mapping of a length to a particular
commit object. Therefore, for Git-specific implementations, the length field
may be omitted. Even so, implementations are free to choose sane limits for how
much data is fetched when pulling from a Git repository.

# Security Audit

None yet.

README.md

+1
@@ -21,6 +21,7 @@
* [TAP 16: Snapshot Merkle Trees](tap16.md)
* [TAP 17: Remove Signature Wrapper from the TUF Specification](tap17.md)
* [TAP 18: Ephemeral identity verification using sigstore's Fulcio for TUF developer key management](tap18.md)
+* [TAP 19: Content Addressable Systems and TUF](tap19.md)

## Rejected