Bug 5266: Do not parse the same TLS record fragments twice #240

Open: wants to merge 20 commits into base: master
Changes from 7 commits
Commits
350f3a6
Bug 5266: Do not parse the same TLS record fragments twice
rousskov Apr 18, 2023
75f04a1
Bug 5266: Do not parse the same TLS record fragments twice
eduard-bagdasaryan Dec 4, 2023
6ef59b6
Adjusted the solution after testing and removed stale code
eduard-bagdasaryan Dec 6, 2023
f3650cf
Undid an out-of-scope change
eduard-bagdasaryan Dec 6, 2023
095601c
Undone an out-of-scope change
eduard-bagdasaryan Dec 7, 2023
328b704
Simplified, removing try/catch from HandshakeParser::parseMessages()
eduard-bagdasaryan Dec 18, 2023
6932606
Moved tkMessages.commit() into individual parsers
eduard-bagdasaryan Dec 19, 2023
2180c95
Do not skip 'empty' message fragments
eduard-bagdasaryan Dec 21, 2023
32c24e5
Moved tkMessages.commit() back into parseMessages()
eduard-bagdasaryan Dec 21, 2023
c5f70c1
Reworked cycles over tkRecords and tkMessages
eduard-bagdasaryan Dec 26, 2023
9a96bc7
Polished method names and a description.
eduard-bagdasaryan Dec 26, 2023
eba1b8e
Polished with 'auto'.
eduard-bagdasaryan Dec 26, 2023
5d4b9b5
Removed a typo
eduard-bagdasaryan Dec 26, 2023
e304647
A couple of fixes
eduard-bagdasaryan Dec 28, 2023
3a88001
Added a debugs() and simplified parseModernRecord()
eduard-bagdasaryan Dec 29, 2023
71802a6
Call tkRecords.commit() calls from parsers to parseHello()
eduard-bagdasaryan Dec 29, 2023
dbf2481
Refactored HandshakeParser::parseMessages()
eduard-bagdasaryan Jan 4, 2024
6b12750
Simplified, removing ChangeCipherSpec class
eduard-bagdasaryan Jan 5, 2024
47eb627
Do not call tkMessages.atEnd() before rollback()
eduard-bagdasaryan Jan 5, 2024
4b525e4
Do not silently ignore empty messages in parseNonEmptyMessages()
eduard-bagdasaryan Jan 6, 2024
8 changes: 8 additions & 0 deletions src/parser/BinaryTokenizer.h
@@ -60,6 +60,9 @@ class BinaryTokenizer
     /// this method avoids append overheads during incremental parsing
     void reinput(const SBuf &data, const bool expectMore) { data_ = data; expectMore_ = expectMore; }

+    /// adds more data bytes to parse
+    void append(const SBuf &data) { data_.append(data); }
+
     /// make progress: future parsing failures will not rollback beyond this point
     void commit();

@@ -110,6 +113,11 @@ class BinaryTokenizer
     /// debugging helper for parsed multi-field structures
     void got(uint64_t size, const char *description) const;

+    /// whether more data bytes may arrive in the future
+    bool expectMore() const { return expectMore_; }


Suggested change
-    bool expectMore() const { return expectMore_; }
+    auto expectingMore() const { return expectMore_; }

Author


Done.

+    /// allow or prohibit arriving more data bytes in the future
+    void expectMore(bool val) { expectMore_ = val; }
+
     const BinaryTokenizerContext *context; ///< debugging: thing being parsed

 protected:
25 changes: 17 additions & 8 deletions src/security/Handshake.cc
@@ -274,24 +274,30 @@ Security::HandshakeParser::parseModernRecord()
     Must(record.fragment.length() || record.type == ContentType::ctApplicationData);

     if (currentContentType != record.type) {
+        tkMessages.expectMore(false);
         parseMessages();
         Must(tkMessages.atEnd()); // no currentContentType leftovers
-        fragments = record.fragment;
         currentContentType = record.type;
-    } else {
-        fragments.append(record.fragment);
     }

-    if (tkRecords.atEnd() && !done)
-        parseMessages();
+    const auto haveUnparsedRecordBytes = !tkRecords.atEnd();
+    const auto expectMoreRecordLayerBytes = tkRecords.expectMore();
+    // TODO: consider adding BinaryTokenizer::exhausted() instead
+    const auto expectMoreMessageLayerBytes = haveUnparsedRecordBytes || expectMoreRecordLayerBytes;
+
+    tkMessages.expectMore(expectMoreMessageLayerBytes);
+    tkMessages.append(record.fragment);
+
+    parseMessages();
 }

 /// parses one or more "higher-level protocol" frames of currentContentType
 void
 Security::HandshakeParser::parseMessages()
 {
-    tkMessages.reset(fragments, false);
-    for (; !tkMessages.atEnd(); tkMessages.commit()) {
+    tkMessages.rollback();
+
+    while (!tkMessages.atEnd() && !done) {
Author


We probably need to ensure somehow that tkMessages.commit() was called after each iteration.


Option 1: "Ensure that tkMessages.commit() was called" by parseMessageFoo()

I doubt that checking whether tkMessages.commit() was called is the best way forward because even if we, say, count the number of commits per iteration, and assert that there was at least one, we still would not protect ourselves from buggy code that commits the current message and then extracts/saves something from the next message (without committing).

Option 2: Move parsing to parseMessage()

The underlying/core problem here is that, with the current design, the loop code does not and, IMHO, should not know what happens inside each parseFoo() function it calls. There is an implicit assumption that each of those functions commits after parsing and before accumulation, but we cannot really "assert" that. To make this code safer, we could remove tkMessages manipulation from individual parseFooMessage() functions:

while (...) {
    switch (currentContentType) {
    case FooMessage:
        interpretFooMessage(parseMessage<FooMessage>());
        continue;
    case BarMessage:
        interpretBarMessage(parseMessage<BarMessage>());
        continue;
    }
    skipMessage(...);
}

where parseMessage() is a templated method that parses a single message, commits tkMessages state, and returns the parsed message object.

This would require creation of a ChangeCipherSpecMessage class to parse ctChangeCipherSpec messages.

However, I am not sure the above restructuring is the best way forward. It feels like it would make the code slightly more complex while only offering a small safety improvement -- an interpretFooMessage() function can still violate our expectations, of course; it would be just easier to spot those violations. This option also raises performance questions related to copying of parsed messages, although most of those copies may be elided by a C++17 compiler (I have not checked).
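
For illustration only, the parseMessage() helper described in Option 2 could look roughly like the sketch below. ToyTokenizer and ToyAlert are hypothetical stand-ins invented for this example; they are not Squid's BinaryTokenizer or Alert classes, and the real interfaces may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for Parser::BinaryTokenizer (an assumption, not Squid code).
class ToyTokenizer {
public:
    explicit ToyTokenizer(std::string data): data_(std::move(data)) {}

    // extracts one byte or throws if no bytes are left
    uint8_t uint8() {
        if (pos_ >= data_.size())
            throw std::runtime_error("insufficient input");
        return static_cast<uint8_t>(data_[pos_++]);
    }

    void commit() { committed_ = pos_; }   // later failures cannot undo past here
    void rollback() { pos_ = committed_; } // return to the last committed position
    std::size_t committed() const { return committed_; }

private:
    std::string data_;
    std::size_t pos_ = 0;
    std::size_t committed_ = 0;
};

// Hypothetical two-byte message, loosely modeled on a TLS alert.
struct ToyAlert {
    explicit ToyAlert(ToyTokenizer &tk): level(tk.uint8()), description(tk.uint8()) {}
    uint8_t level;       // initialized first (declaration order)
    uint8_t description; // initialized second
};

// Parses exactly one message, commits, and returns the parsed object.
// Callers interpret the result without touching the tokenizer.
template <class Message>
Message parseMessage(ToyTokenizer &tk) {
    Message message(tk); // may throw; the committed position stays untouched
    tk.commit();
    return message;      // a candidate for copy elision, which is not guaranteed here
}
```

With this shape, a caller would write something like interpretAlert(parseMessage<ToyAlert>(tk)), and a failed parse leaves the committed position where it was.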

Option 3: Move tkMessages.commit() calls into the loop

We know that every parseFooMessage() function must parse a single message [1] (or throw). If it parses less or more than that, it is buggy. If it accumulates something before finishing parsing, it is buggy. This loop cannot do anything to detect those bugs. However, this loop can assume correct parseFooMessage() implementation and commit() after calling parseFooMessage() or skipMessage()! If parseFooMessage() is buggy, committing at the loop level will not make things worse AFAICT or, at the very least, those bugs will not be the loop's fault.

When a successful parsing attempt is expected/required to commit, then performing that commit at the loop level is better because it reduces the probability that a parseFooMessage() function forgets to commit (in some cases, especially when parseFooMessage() delegates parsing to another method or a chain of methods). Committing at the loop level clarifies the code structure/intent, especially when loop-level commit() calls are paired with loop-level rollback() calls.

If you agree that Option 3 is the best way forward, please implement Option 3 by moving commit() calls from parseFooMessage() functions into the loop. Please use the for (tkMessages.rollback();...; tkMessages.commit()) loop header structure.
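
A minimal sketch of the loop header requested in Option 3, assuming a toy tokenizer: ToyTokenizer, InsufficientInput as defined here, parseToyMessage(), and this parseMessages() signature are all hypothetical and only mimic the commit()/rollback() contract discussed above. The point is that the message parser never commits; the loop does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception type; Squid's own InsufficientInput may differ.
struct InsufficientInput: std::runtime_error {
    InsufficientInput(): std::runtime_error("need more bytes") {}
};

// Hypothetical stand-in for Parser::BinaryTokenizer (an assumption, not Squid code).
class ToyTokenizer {
public:
    void append(const std::string &more) { data_ += more; }

    uint8_t uint8() {
        if (pos_ >= data_.size())
            throw InsufficientInput();
        return static_cast<uint8_t>(data_[pos_++]);
    }

    void commit() { committed_ = pos_; }
    void rollback() { pos_ = committed_; }
    bool atEnd() const { return pos_ >= data_.size(); }

private:
    std::string data_;
    std::size_t pos_ = 0;
    std::size_t committed_ = 0;
};

// Toy message parser: a message is two bytes; the "parsed message" is their sum.
// It neither commits nor rolls back.
int parseToyMessage(ToyTokenizer &tk) {
    const int first = tk.uint8();
    const int second = tk.uint8();
    return first + second;
}

// Parses every complete buffered message exactly once.
// Returns false if it stopped at a partial message.
bool parseMessages(ToyTokenizer &tk, std::vector<int> &parsed) {
    try {
        for (tk.rollback(); !tk.atEnd(); tk.commit())
            parsed.push_back(parseToyMessage(tk));
        return true;
    } catch (const InsufficientInput &) {
        return false; // the next call starts with rollback() and retries
    }
}
```

Feeding three bytes and then one more byte should yield two parsed messages, with the first message never parsed a second time.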

Option 4: Individual message parsers should commit()

This option does not really "ensure that tkMessages.commit() was called after each iteration". It is discussed here because it contradicts Option 3.

This option is already implemented in this PR because I suggested doing so. My suggestion was based on the following logic:

Alex: I suspect we should move this tkMessages.commit() call into individual parsers because (otherwise) their post-message-parsing exceptions may result in re-parsing of the same message.

I think I was wrong because I missed the fact that relevant InsufficientInput exceptions cannot happen here. There are two kinds of exceptions we need to consider in this context:

  • InsufficientInput: If parseMessageFoo() throws InsufficientInput while parsing MessageFoo, then parseMessageFoo() does not call commit(), so this case is irrelevant. If parseMessageFoo() successfully extracts MessageFoo and calls commit(), then any later attempt to parse a MessageFoo component cannot result in an InsufficientInput exception because the corresponding component tokenizer must be in a false expectingMore() state: we have parsed the entire MessageFoo, and no more message bytes can be coming!

  • All other exceptions: If parseMessageFoo() throws any other exception, before or after commit(), the whole handshake parser should stop. There will be no retries.
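
The two exception kinds above can be sketched as follows, assuming a tokenizer that throws InsufficientInput only while it still expects more bytes. ToyTokenizer, Outcome, and parseTwoBytes() are hypothetical names for this example; the behavior is an assumption made to illustrate the argument, not a copy of Squid's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical marker exception: "not an error yet, retry when more bytes arrive".
struct InsufficientInput {};

// Hypothetical stand-in for Parser::BinaryTokenizer (an assumption, not Squid code).
class ToyTokenizer {
public:
    ToyTokenizer(std::string data, bool expectMore):
        data_(std::move(data)), expectMore_(expectMore) {}

    uint8_t uint8() {
        if (pos_ >= data_.size()) {
            if (expectMore_)
                throw InsufficientInput();               // more bytes may still arrive
            throw std::runtime_error("truncated input"); // nothing else is coming
        }
        return static_cast<uint8_t>(data_[pos_++]);
    }

private:
    std::string data_;
    std::size_t pos_ = 0;
    bool expectMore_;
};

enum class Outcome { parsed, retryLater, fatal };

// Tries to extract a two-byte component and reports how the attempt ended.
Outcome parseTwoBytes(const std::string &bytes, bool expectMore) {
    ToyTokenizer tk(bytes, expectMore);
    try {
        tk.uint8();
        tk.uint8();
        return Outcome::parsed;
    } catch (const InsufficientInput &) {
        return Outcome::retryLater; // the caller rolls back and waits for more input
    } catch (const std::exception &) {
        return Outcome::fatal;      // the whole parser stops; no retries
    }
}
```

Under these assumptions, a component tokenizer built from a fully extracted message is created with expectMore set to false, so a short component can only end in the fatal branch.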

Footnotes

  1. A single message or an equivalent: We effectively treat fragments of unknown messages, fragments of application layer messages, and fragments of ChangeCipherSpec messages as messages because TLS does not give us the expected message size, and we do not want to accumulate those fragments (and add code to prevent excessive accumulation).

Author


agree that Option 3 is the best way forward

Yes, I think it is the best among other options that you outlined. Done (32c24e5).

         switch (currentContentType) {
         case ContentType::ctChangeCipherSpec:
             parseChangeCipherCpecMessage();
@@ -335,6 +341,7 @@ Security::HandshakeParser::parseAlertMessage()
 {
     Must(currentContentType == ContentType::ctAlert);
     const Alert alert(tkMessages);
+    tkMessages.commit();
     debugs(83, (alert.fatal() ? 2:3),
            "level " << static_cast<int>(alert.level) <<
            " description " << static_cast<int>(alert.description));
@@ -349,6 +356,7 @@ Security::HandshakeParser::parseHandshakeMessage()
     Must(currentContentType == ContentType::ctHandshake);

     const Handshake message(tkMessages);
+    tkMessages.commit();

     switch (message.msg_type) {
     case HandshakeType::hskClientHello:
@@ -631,10 +639,11 @@ Security::HandshakeParser::parseSupportedVersionsExtension(const SBuf &extension
 void
 Security::HandshakeParser::skipMessage(const char *description)
 {
-    // tkMessages/fragments can only contain messages of the same ContentType.
+    // tkMessages can only contain messages of the same ContentType.
     // To skip a message, we can and should skip everything we have [left]. If
     // we have partial messages, debugging will mislead about their boundaries.


I think that "debugging will mislead" comment is stale because all skipMessage() callers add foo [fragment] to the message foo description. We are skipping message(s) and/or message fragments here.

Suggested change
-    // we have partial messages, debugging will mislead about their boundaries.
+    // we buffered a partial message, we will need to read/skip multiple times.

Author


Done.

     tkMessages.skip(tkMessages.leftovers().length(), description);
+    tkMessages.commit();
 }

 bool
3 changes: 0 additions & 3 deletions src/security/Handshake.h
@@ -115,9 +115,6 @@ class HandshakeParser

     const char *done; ///< not nil if we got what we were looking for

-    /// concatenated TLSPlaintext.fragments of TLSPlaintext.type
-    SBuf fragments;
-
     /// TLS record layer (parsing uninterpreted data)
     Parser::BinaryTokenizer tkRecords;