Bug 5266: Do not parse the same TLS record fragments twice #240

Open
wants to merge 20 commits into master
Conversation


@eduard-bagdasaryan eduard-bagdasaryan commented Dec 4, 2023

bio.cc(315) readAndParse: ... check failed: state < atHelloReceived
exception location: Handshake.cc(367) parseHandshakeMessage

This fix addresses this bug and also another one, discovered while
working on Bug 5266:

  • SslBump may reject valid TLS handshake input when fragments of two
    different record layer types (i.e. different TLSPlaintext::type
    values) are separated by TCP I/O boundaries. Squid attempts to parse
    the first set of fragments (e.g., containing ServerHello) twice, once
    when the first set is received and then again when the first fragment
    of the second set is received. The second attempt triggers an
    exception depicted above. This bug was reported as Bug 5266.

  • SslBump rejects some valid TLS handshake input when a single
    TLS message (e.g., ServerHello) is split across two different TLS
    records and those two records are separated by a TCP I/O boundary.
    Since commit e287364, Security::HandshakeParser assumes that
    parseMessages() is called "only after accumulating all same-type TLS
    records" but the code actually calls parseMessages() for every TLS
    handshake record that ends at a TCP I/O boundary (making
    tkRecords.atEnd() true).

This change also limits accumulation of (specially crafted) TLS
fragments.
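
For orientation, here is a condensed, hedged sketch of the record-to-message handoff that the review below converges on (simplified; record parsing details are elided, and the expectingMore()/append() calls reflect BinaryTokenizer changes discussed in later threads):

void
Security::HandshakeParser::parseModernRecord()
{
    const TLSPlaintext record(tkRecords); // parse one record from tkRecords
    if (currentContentType != record.type) {
        Must(tkMessages.atEnd()); // no leftovers of the previous content type
        currentContentType = record.type;
    }
    // more message bytes may come while unparsed or expected record bytes remain
    tkMessages.expectMore(!tkRecords.atEnd() || tkRecords.expectingMore());
    tkMessages.append(record.fragment);
    parseMessages(); // parses accumulated same-type fragments exactly once
}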

rousskov and others added 5 commits April 17, 2023 23:28
    bio.cc(315) readAndParse: ... check failed: state < atHelloReceived
    exception location: Handshake.cc(367) parseHandshakeMessage

This fix addresses the first of the two bugs described below.
Unfortunately, the second bug makes this fix wrong. We will need to find
a different fix or fix both bugs instead.

* SslBump may reject valid TLS handshake input when fragments of two
  different record layer types (i.e. different TLSPlaintext::type
  values) are separated by TCP I/O boundaries. Squid attempts to parse
  the first set of fragments (e.g., containing ServerHello) twice, once
  when the first set is received and then again when the first fragment
  of the second set is received. The second attempt triggers an
  exception depicted above. This bug was reported as Bug 5266. Resetting
  fragments avoids reparsing them.

* SslBump probably rejects some valid TLS handshake input when a single
  TLS message (e.g., ServerHello) is split across two different TLS
  records and those two records are separated by a TCP I/O boundary.
  Since commit e287364, Security::HandshakeParser assumes that
  parseMessages() is called "only after accumulating all same-type TLS
  records" but the code actually calls parseMessages() for every TLS
  handshake record that ends at a TCP I/O boundary (making
  tkRecords.atEnd() true). This bug has not been explicitly reported but
  was discovered while working on Bug 5266.
Reworked the initial solution introduced at 350f3a6, addressing
both problems outlined there.
parseMessages();
Must(tkMessages.atEnd()); // no currentContentType leftovers
fragments = record.fragment;
tkMessages.reset(record.fragment, true);

Why do we assume that more fragments are coming in this context? I would expect us to rely on tkRecords state instead: No more message bytes are coming if and only if tkRecords is empty and tkRecords.expectMore() returns false.

Author

This code relies on a higher-level Security::HandshakeParser::parseHello(), which sets tkRecords to always 'expect more' until 'done'. However, you are right that we should calculate this state instead.

currentContentType = record.type;
} else {
fragments.append(record.fragment);
tkMessages.reset(tkMessages.leftovers().append(record.fragment), true);

I have a similar concern about unconditional true here. AFAICT, both parts of this if/else statement share the same "expecting more message bytes" condition (for record.type messages).

Author

Done as above.

parseApplicationDataMessage();
continue;
try {
for (; !tkMessages.atEnd(); tkMessages.expectMore(false), tkMessages.commit()) {

Why do we force tkMessages.expectMore() to false after each loop iteration here?

Author

That is probably a leftover from the old code (setting false unconditionally). I removed it.

Comment on lines 313 to 317
catch (const Parser::BinaryTokenizer::InsufficientInput &) {
tkMessages.rollback();
Assure(tkMessages.expectMore());
debugs(83, 5, "need more data");
return;

I assume that this "wait for more data" try/catch is not applicable to the first parseMessages() caller. Please either delete it completely (if possible) or move it to the second parseMessages() caller (otherwise).

Author

This try/catch was needed because BinaryTokenizer lacked an append() method, so we needed to rollback() before the makeshift reset(leftovers()+new_data). So I added append() and removed the try/catch.
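
A minimal sketch of that append(), assuming BinaryTokenizer keeps its raw input in the SBuf data_ member mentioned near the end of this discussion:

/// makes newly received bytes available for parsing, preserving unparsed leftovers
void
Parser::BinaryTokenizer::append(const SBuf &newData)
{
    data_.append(newData); // no rollback() needed; parsing resumes where it stopped
}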

skipMessage("unknown ContentType msg [fragment]");
}
catch (const Parser::BinaryTokenizer::InsufficientInput &) {
tkMessages.rollback();

Please move this rollback() call to the start of the messages parsing (i.e. the beginning of this method). Otherwise, we are creating an impression that it is important to catch all InsufficientInput exceptions to roll back the tokenizer -- something that is very easy to forget/miss because a lot of code may throw... In reality, we only need to roll back at the beginning of a parsing sequence, which usually has a much more well-defined/specific code location. This sketch illustrates a typical sequence for parsing multiple things:

tkMessages.rollback();
while (!done)
    ... parse one message ...

Please note that the above sketch also implies that the termination condition of this loop should not be atEnd()! There are two reasons to change that condition:

  1. One of the benefits of having an InsufficientInput exception is that it is safe to parse anything without checking atEnd() -- the parser will do the right thing when there are not enough bytes. While it is not wrong to stop parsing atEnd(), doing so creates a false impression that we should check atEnd() before parsing anything or "things may break". That false assumption creates terrible code like this:
if (!tk.atEnd())
    a = tk.parseA();

if (!tk.atEnd())
    b = tk.parseB();
else
    ...
  2. If we stop atEnd(), then parseMessages() (or, in the current PR code, the loop) has four outcomes instead of three:
  • parsing error (exception)
  • insufficient input (exception)
  • success (no exception; got everything we wanted)
  • insufficient input (no exception)

Removing the last bullet simplifies code logic a lot!

Author

I followed your sketch and removed the try/catch entirely.

also calculate the 'expectMoreRecordBytes' state instead of assuming
that it is always 'true'.
@eduard-bagdasaryan left a comment

Done (328b704).

currentContentType = record.type;
} else {
tkMessages.reset(tkMessages.leftovers().append(record.fragment), true);
tkMessages.expectMore(expectMoreRecordBytes);
tkMessages.append(record.fragment);

I am not sure, but it looks like this else part can now be reused by the primary if case as well because that primary part asserts tkMessages.atEnd().

if (currentContentType != record.type) {
    ...
    Must(tkMessages.atEnd()); // no currentContentType leftovers
    currentContentType = record.type;
    // fall through to process new currentContentType bytes
}

tkMessages.expectMore(expectMoreMessageLayerBytes);
tkMessages.append(record.fragment);
parseMessages();

If the above works, you may be able to simplify code further (e.g., compute expectMoreMessageLayerBytes later, after the if statement, just before its use and possibly eliminate that particular variable completely).

P.S. parseMessages() should check done itself, so that the callers do not need to remember to check that precondition (and so that parseMessages() stops when it is time to stop).

Author

Done.

that primary part asserts tkMessages.atEnd()

I have one concern: atEnd() does not check that everything has been committed (so that a subsequent rollback() would not mess things up). Though we should have called commit() by this time in message parsers - some new code may miss it.

@@ -273,14 +273,16 @@ Security::HandshakeParser::parseModernRecord()
// RFC 5246: MUST NOT send zero-length [non-application] fragments
Must(record.fragment.length() || record.type == ContentType::ctApplicationData);

const auto expectMoreRecordBytes = !(tkRecords.atEnd() && !tkRecords.expectMore());

Just to clarify this logic:

Suggested change
const auto expectMoreRecordBytes = !(tkRecords.atEnd() && !tkRecords.expectMore());
const auto haveUnparsedRecordBytes = !tkRecords.atEnd();
const auto expectMoreRecordLayerBytes = tkRecords.expectMore();
const auto expectMoreMessageLayerBytes = haveUnparsedRecordBytes || expectMoreRecordLayerBytes;

Alternatively, we can add a BinaryTokenizer::exhausted() method:

Suggested change
const auto expectMoreRecordBytes = !(tkRecords.atEnd() && !tkRecords.expectMore());
const auto expectMoreMessageLayerBytes = !tkRecords.exhausted();
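
A minimal sketch of that method, using the definition that surfaces later in this discussion (atEnd() with no further input expected):

/// whether all received bytes were parsed and no more bytes are expected
bool exhausted() const { return atEnd() && !expectMore(); }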

Author

I applied the first suggestion and added the second one as a TODO (to add exhausted() if code becomes duplicated).

skipMessage("unknown ContentType msg [fragment]");
tkMessages.rollback();

for (; !tkMessages.atEnd(); tkMessages.commit()) {
@rousskov Dec 18, 2023

I suspect we should move this tkMessages.commit() call into individual parsers because (otherwise) their post-message-parsing exceptions may result in re-parsing of the same message. Note how Security::HandshakeParser::parseVersion2Record() and Security::HandshakeParser::parseModernRecord() use this approach for parsing TLS records already.

Also, we should stop parsing if we are done, right?

Suggested change
for (; !tkMessages.atEnd(); tkMessages.commit()) {
while (!tkMessages.atEnd() && !done) {

The above condition still looks wrong because we are supposed to throw InsufficientInput instead of remembering to check tkMessages.atEnd(), but we cannot easily remove that part of the condition because there may be more record input available to the caller and because some of our individual parsers may assume that they are never called again if they successfully parse their record type. We may need to come back to this (possibly to introduce "we must parse all or nothing" checks to some of those five message parsers), but let's make the above small step forward now.

Author

we should move this tkMessages.commit() call into individual parsers

Done.

we should stop parsing if we are done

It looks so but this done (described as not nil if we got what we were looking for) does not have a clear scope, i.e., whether it marks the end of records, messages or both (after these changes it became 'both').

let's make the above small step

Done.

@rousskov Dec 19, 2023

Alex: we should stop parsing if we are done

It looks so but this done (described as not nil if we got what we were looking for) does not have a clear scope, i.e., whether it marks the end of records, messages or both (after these changes it became 'both').

I agree that done is not related to record or message boundaries, but why "but"? In other words, why does that lack of relationship represent a problem?

FWIW, I do not expect this PR changes to alter done meaning -- we may be "done" at any arbitrary input point.

Author

Imagine a scenario when we 'got what we were looking for' but there are still bytes in tkMessages which we should then discard. In this case, there would be a difference between 'stop analyzing tkMessages immediately' and proceeding with the rest of tkMessages in parseMessages() (at least for debugging purposes, if these bytes are going to be thrown away anyway).

Imagine a scenario when we 'got what we were looking for' but there are still bytes in tkMessages which we should discard then.

I can easily imagine this scenario because our handshake parser is designed to parse until it gets the necessary information, not to parse the whole handshake. I would expect "leftovers" in tkRecords and/or tkMessages in many such cases. These leftovers do not represent a problem AFAICT.

If the handshake parser was designed to parse the entire handshake, we would not have the "done" field at all -- we would either successfully parse the whole thing, detect an error, or keep parsing.

Since Squid is not a part of that handshake in many cases, Squid will not be able to parse the encrypted parts.

In this case, there would be a difference between the 'stop analyzing tkMessages immediately' and proceeding with the rest of tkMessages in parseMessages()

Yes, setting "done" is meant to stop parsing tkMessages and tkRecords. I do not see a problem with that. IMO, defining "done" as "successfully parsed everything we wanted to parse" or "got what we were looking for" results in fairly "clear scope".

so that post-parsing exceptions would not cause
reparsing of the already parsed messages.
@eduard-bagdasaryan left a comment

Done (6932606).

for (; !tkMessages.atEnd(); tkMessages.commit()) {
tkMessages.rollback();

while (!tkMessages.atEnd() && !done) {
Author

We probably need to ensure somehow that tkMessages.commit() was called after each iteration.

Option 1: "Ensure that tkMessages.commit() was called" by parseMessageFoo()

I doubt that checking whether tkMessages.commit() was called is the best way forward because even if we, say, count the number of commits per iteration, and assert that there was at least one, we still would not protect ourselves from buggy code that commits the current message and then extracts/saves something from the next message (without committing).

Option 2: Move parsing to parseMessage()

The underlying/core problem here is that, with the current design, the loop code does not and, IMHO, should not know what happens inside each parseFoo() function it calls. There is an implicit assumption that each of those functions commits after parsing and before accumulation, but we cannot really "assert" that. To make this code safer, we could remove tkMessages manipulation from individual parseFooMessage() functions:

while (...) {
switch (currentContentType) {
case FooMessage:
    interpretFooMessage(parseMessage<FooMessage>());
    continue;
case BarMessage:
    interpretBarMessage(parseMessage<BarMessage>());
    continue;
}
skipMessage(...);

where parseMessage() is a templated method that parses a single message, commits tkMessages state, and returns the parsed message object.

This would require creation of a ChangeCipherSpecMessage class to parse ctChangeCipherSpec messages.

However, I am not sure the above restructuring is the best way forward. It feels like it would make the code slightly more complex while only offering a small safety improvement -- an interpretFooMessage() function can still violate our expectations, of course; it would be just easier to spot those violations. This option also raises performance questions related to copying of parsed messages, although most of those copies may be elided by a C++17 compiler (I have not checked).

Option 3: Move tkMessages.commit() calls into the loop

We know that every parseFooMessage() function must parse a single message1 (or throw). If it parses less or more than that, it is buggy. If it accumulates something before finishing parsing, it is buggy. This loop cannot do anything to detect those bugs. However, this loop can assume correct parseFooMessage() implementation and commit() after calling parseFooMessage() or skipMessage()! If parseFooMessage() is buggy, committing at the loop level will not make things worse AFAICT or, at the very least, those bugs will not be the loop's fault.

When a successful parsing attempt is expected/required to commit, then performing that commit at the loop level is better because it reduces the probability that a parseFooMessage() function forgets to commit (in some cases, especially when parseFooMessage() delegates parsing to another method or a chain of methods). Committing at the loop level clarifies the code structure/intent, especially when loop-level commit() calls are paired with loop-level rollback() calls.

If you agree that Option 3 is the best way forward, please implement Option 3 by moving commit() calls from parseFooMessage() functions into the loop. Please use the for (tkMessages.rollback();...; tkMessages.commit()) loop header structure.
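
For illustration, the requested loop header structure would look roughly like this (a sketch; the per-type parser dispatch is elided):

for (tkMessages.rollback(); !done && !tkMessages.atEnd(); tkMessages.commit()) {
    // dispatch on currentContentType to one parseFooMessage() or skipMessage() call
}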

Option 4: Individual message parsers should commit()

This option does not really "ensure that tkMessages.commit() was called after each iteration". It is discussed here because it contradicts Option 3.

This option is already implemented in this PR because I suggested to do so. My suggestion was based on the following logic:

Alex: I suspect we should move this tkMessages.commit() call into individual parsers because (otherwise) their post-message-parsing exceptions may result in re-parsing of the same message.

I think I was wrong because I missed the fact that relevant InsufficientInput exceptions cannot happen here. There are two kinds of exceptions we need to consider in this context:

  • InsufficientInput: If parseMessageFoo() throws InsufficientInput while parsing MessageFoo, then parseMessageFoo() does not call commit(), so this case is irrelevant. If parseMessageFoo() successfully extracts MessageFoo and calls commit(), then any attempt to parse any FooMessage component cannot result in InsufficientInput exception because the corresponding component tokenizer must have false expectingMore() state -- we have parsed the entire MessageFoo, and no more message bytes can be coming!

  • All other exceptions: If parseMessageFoo() throws any other exception, before or after commit(), the whole handshake parser should stop. There will be no retries.

Footnotes

  1. A single message or an equivalent: We effectively treat fragments of unknown messages, fragments of application layer messages, and fragments of ChangeCipherSpec messages as messages because TLS does not give us the expected message size, and we do not want to accumulate those fragments (and add code to prevent excessive accumulation).

Author

agree that Option 3 is the best way forward

Yes, I think it is the best among the options you outlined. Done (32c24e5).

@@ -631,10 +639,11 @@ Security::HandshakeParser::parseSupportedVersionsExtension(const SBuf &extension
void
Security::HandshakeParser::skipMessage(const char *description)
{
// tkMessages/fragments can only contain messages of the same ContentType.
// tkMessages can only contain messages of the same ContentType.
// To skip a message, we can and should skip everything we have [left]. If
// we have partial messages, debugging will mislead about their boundaries.

I think that "debugging will mislead" comment is stale because all skipMessage() callers add foo [fragment] to the message foo description. We are skipping message(s) and/or message fragments here.

Suggested change
// we have partial messages, debugging will mislead about their boundaries.
// we buffered a partial message, we will need to read/skip multiple times.

Author

Done.

@@ -110,6 +113,11 @@ class BinaryTokenizer
/// debugging helper for parsed multi-field structures
void got(uint64_t size, const char *description) const;

/// whether more data bytes may arrive in the future
bool expectMore() const { return expectMore_; }

Suggested change
bool expectMore() const { return expectMore_; }
auto expectingMore() const { return expectMore_; }

Author

Done.

undoing 6932606. That attempt was wrong, e.g., because
we cannot reliably ensure commit() in each parser and
some exceptions can cause reparsing
(in fact, exceptions other than InsufficientInput will
stop parsing for good and therefore are not a problem).
* Do not check for atEnd() in the tkMessages loop - this check
  looks like a design flaw since we have InsufficientInput() for such
  cases.

* Do not parse leftovers in parseModernRecord() when currentContentType
  changes. The unparsed leftovers mean that the previous record had
  an incomplete message. When a new record (with a new content type)
  comes, that incomplete message remains incomplete, since messages
  cannot span multiple records with different content types.  Before
  this change, this scenario would hit the Must(expectMore_) in
  BinaryTokenizer::want(). Now it will trigger
  Must(tkMessages.atEnd()) with the same effect to the caller code.

* Changed HandshakeParser::parseHello() so that it could properly handle
  InsufficientInput() in both cases:
  1. tkMessages - attempt to parse the next record.
  2. tkRecords - return from the method (and the caller attempts to read
     the next record then).
@eduard-bagdasaryan left a comment

Done (up to 9a96bc7).

/// whether more data bytes may arrive in the future
auto expectingMore() const { return expectMore_; }
/// allow or prohibit arriving more data bytes in the future
void expectingMore(bool val) { expectMore_ = val; }

Sorry, in my earlier change request, I did not try to imply that this method should be renamed as well: The testing method should be spelled expectingMore() -- we are inspecting what the object is currently doing (the corresponding private data member should have been named the same, but we are not going to rename it in this PR). The setting method should be spelled expectMore() -- we are telling the object what to do in the future.

Suggested change
void expectingMore(bool val) { expectMore_ = val; }
void expectMore(const bool em) { expectMore_ = em; }

Author

Ok.

}
catch (const Parser::BinaryTokenizer::InsufficientInput &) {
debugs(83, 5, "need more data");
return false;

This "return false" for InsufficientInput case was correct, and the caller relies on it AFAICT: We have to keep it or move InsufficientInput processing to the caller. The added atEnd() check above is another red flag. Overall, I am not sure that this method refactoring improved things, but perhaps I am missing some essential changes in this diff noise.

Author

Unfortunately, these changes failed one of my per-byte tests - the tkRecords.atEnd() check does not work when tkRecords has some unparsed bytes insufficient for the next record. We need to know somehow who threw InsufficientInput() in the catch() below - and, depending on that, either return false (insufficient tkRecords) or continue to the next record (insufficient tkMessages). I tried to add a synced() method (whether all parsed bytes were committed) to BinaryTokenizer and calculate this state as (!synced() || atEnd()) for tkMessages. But it looks a bit complicated/unreliable - we should not have to recalculate the state when we already have the exception at hand. Unfortunately, InsufficientInput() does not have any attribute about its initiator.

either return false (insufficient tkRecords) or continue to the next record (insufficient tkMessages). ... Unfortunately, InsufficientInput() does not have any attribute about its initiator.

To treat InsufficientInput emitted by the message layer differently, catch InsufficientInput emitted by that layer at that layer (i.e. when we parse messages). InsufficientInput caught while making parseRecord() calls will be specific to the records layer. AFAICT, InsufficientInput emitted while parsing messages should be "caught, reported, and ignored" -- the control will then bubble up to the next parseRecord() loop iteration.

Alternatively, we could parse all records we have read before parsing the messages in the accumulated fragments. If message layer always sees all available same-type records, then InsufficientInput thrown by any layer will have the same meaning -- need to read more bytes (record bytes and message bytes). However, it is not clear to me whether such refactoring will make the code better or worse overall. Please try to fix the current (i.e. "parse one record fragment at a time") code before we attempt such a significant refactoring.
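
A sketch of the first approach (catching message-layer InsufficientInput where messages are parsed), assuming parseHello()'s record loop has roughly this shape:

while (!done) {
    // a record-layer InsufficientInput thrown here bubbles up,
    // so the caller reads more bytes and calls us again
    parseRecord();
    try {
        parseMessages();
    } catch (const Parser::BinaryTokenizer::InsufficientInput &) {
        // message-layer exception: caught, reported, and ignored;
        // the next record may complete the current message
        debugs(83, 5, "need more message bytes");
    }
}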

Author

InsufficientInput emitted by that layer at that layer

Ok. I hesitated to do this because recently we rejected the tkMessages try/catch for a reason.

* Throw a meaningful exception in the case of a truncated TLS record
  (instead of a Must())

* Process InsufficientInput() from tkMessages in its own try/catch
@eduard-bagdasaryan left a comment

Done (e304647).

Comment on lines 645 to 646
if (tkMessages.atEnd())
throw Parser::InsufficientInput();
A TLS record may have a zero-length fragment (e.g., RFC 8446 says that "Zero-length fragments of Application Data MAY be sent"). Thus, a valid TLS message may have zero length1, and our code must handle zero-length TLS fragments and empty messages (without getting stuck, etc.).

This "at least one byte" check can be reasonably interpreted as effectively using the opposite invariant -- that every message has some bytes. PR's current parseMessages() loop also effectively assumes that successfully parsing any message allows us to make progress which is not technically true for empty messages.

Moreover, if tkMessages is no longer expectingMore() and is atEnd(), then it may be wrong to throw InsufficientInput here because our input was sufficient to parse existing tkMessages -- more tkMessages bytes (with the same record type) are not expected, and adding those bytes would lead to a parsing failure. Yet, the parseMessages() loop should stop.

To find the right solution, we can try to (re)define what parseMessages() means. Once we have that definition, and assuming we are not going to add more special exceptions, we will throw InsufficientInput when tkMessages do not have enough bytes to satisfy the defined parseMessages() goal.

After rejecting many candidates for various reasons, I can suggest only two reasonable parseMessages() definitions:

  • parseMessages() incrementally parses all sequential currentContentType TLS fragments.
  • "Optimization": Same as the above, but the function may stop earlier if done becomes truthy2.

Official code may be seen as trying to implement the first bullet (which ignores done state while we are parsing tkMessages), but official code has several bugs (that this PR is trying to fix). For example, official code does not throw InsufficientInput when there are not enough bytes to parse ChangeCipherSpec. Official code also does not throw InsufficientInput when there are no bytes to parse the second (not yet received) sequential application data fragment after parsing the first such fragment. In summary, official code does not implement the first bullet design.

The current PR parseMessages() quits parsing tkMessages when done becomes truthy. However,
current PR code does not stop parsing after successfully parsing all sequential currentContentType TLS fragments! It keeps going (if done is still nil). This is probably a PR bug -- I failed to find a reasonable definition for parseMessages() that would fit the current PR code.

AFAICT, to implement the above parseMessages() definition(s), we have to do the following:

  1. Remove this check.
  2. Replace the skipMessage() call in parseChangeCipherSpecMessage() with proper ChangeCipherSpec parsing code so that we do not skip multiple ChangeCipherSpecs in that method (as if it is one spec) and simplify skipMessage() semantics by focusing skipMessage() on two other use cases/callers.
  3. Refactor parseMessages() to make two different kinds of message parsing explicit:
/// Incrementally parses all sequential currentContentType TLS fragments.
/// Successfully stops parsing earlier if `done` becomes set.
void
Security::HandshakeParser::parseMessages()
{
    switch (currentContentType) {
    case ContentType::ctChangeCipherSpec:
        return parseNonEmptyMessages(parseChangeCipherSpecMessage);
    case ContentType::ctAlert:
        return parseNonEmptyMessages(parseAlertMessage);
    case ContentType::ctHandshake:
        return parseNonEmptyMessages(parseHandshakeMessage);
    case ContentType::ctApplicationData:
        return skipPossiblyEmptyMessages("app data [fragment]");
    }
    return skipPossiblyEmptyMessages("unknown ContentType msg [fragment]");
}

/// Incrementally parses all sequential currentContentType messages using the given TLS message parser.
/// Each message is assumed to be serialized using at least one byte.
/// At least one message is expected per sequence.
/// Successfully stops parsing earlier if `done` becomes set.
void
Security::HandshakeParser::parseNonEmptyMessages(... messageParser)
{
    // XXX: We should not silently ignore empty message sequences
    for (tkMessages.rollback(); !done && !tkMessages.exhausted(); tkMessages.commit())
        messageParser();
}

The existing rollback/commit loop will go inside the new parseNonEmptyMessages() method. AFAICT, to match the new definition, that loop should run until tkMessages are exhausted (i.e. atEnd() and not expectingMore()) unless done becomes truthy earlier (the second bullet "optimization").

In the above sketch, skipPossiblyEmptyMessages() is a skipMessage() replacement that does not need a loop. It should throw InsufficientInput unless tkMessages are exhausted. I am not sure whether it needs rollback() and/or commit().

The above XXX in sketched parseNonEmptyMessages() will need to be addressed as well, which will probably require more refactoring, but it may be easier to do later, as a separate commit/step.
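
A hedged sketch of skipPossiblyEmptyMessages(), leaving the rollback()/commit() question above open and using the proposed exhausted() test:

/// skips all accumulated bytes (possibly zero) of the current content type;
/// throws InsufficientInput if more same-type fragment bytes may still come
void
Security::HandshakeParser::skipPossiblyEmptyMessages(const char *description)
{
    tkMessages.skip(tkMessages.leftovers().length(), description);
    tkMessages.commit(); // placement debatable, per the note above
    if (!tkMessages.exhausted())
        throw Parser::InsufficientInput();
}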

Footnotes

  1. The same conclusion is also true for invalid TLS messages (i.e., messages that violate some TLS MUSTs by being empty) AFAICT. If only invalid messages could have zero length, then we could, perhaps, get away with throwing InsufficientInput exceptions when seeing such messages. Since valid messages may have zero length, we cannot claim insufficient input when parsing them.

  2. This is not just a performance optimization: By stopping earlier, we prevent triggering Must() violations and similar problems in rare cases where TLS input is malformed or unsupported (but sufficient for Squid to see whatever it wanted to see, making done truthy).

Author

I tried to implement this approach (up to 47eb627).

The above XXX

I added this XXX for now but I am not sure how 'empty message sequences' are possible here.
HandshakeParser::parseModernRecord() guarantees that tkMessages will get at least one byte here (due to the protection 'MUST NOT send zero-length [non-application] fragments') - i.e., exhausted() should be false. Also, 'done' should have been checked earlier in parseHello().

I added this XXX for now but I am not sure how 'empty message sequences' are possible here. HandshakeParser::parseModernRecord() guarantees that tkMessages will get at least one byte here (due to the protection 'MUST NOT send zero-length [non-application] fragments')

Do you mean that we are "protected" by a "MUST NOT" text in an RFC? Surely an attacker can violate that MUST -- we cannot rely on that kind of protection or guarantees in a parser... AFAICT, the sketched code will just ignore/skip such invalid messages (i.e. there are no infinite loops or similar dangers here). This is not the end of the world, but to the parseNonEmptyMessages() caller it would look like a non-empty message was successfully parsed, and that feels wrong, especially when some future refactoring might unknowingly create an infinite loop when handling such invalid input (because the method name suggests that this code should not worry about empty messages).

Also, 'done' should have been checked earlier in parseHello().

Yes, but that messageParser() call inside the loop can make done truthy, so we have to check on every iteration AFAICT.

Author

Do you mean that we are "protected" by a "MUST NOT" text in an RFC?

No, I meant the protection in HandshakeParser::parseModernRecord():

Must(record.fragment.length() || record.type == ContentType::ctApplicationData);

With this protection, only ctApplicationData can be empty, which are handled by skipPossiblyEmptyMessages().

No, I meant [Must(record.fragment.length()...)] in HandshakeParser::parseModernRecord

Ah, I see. I agree that the above Must() protects parseNonEmptyMessages() from empty input.

However, it is difficult to assert that protection using the current code structure. I think we should use that protection to avoid the XXX without introducing assertions. We are expecting at least one byte, so we should parse input before checking whether it is exhausted:

tkMessages.rollback();
do {
    parseMessage();
    tkMessages.commit();
} while (!done && !tkMessages.exhausted());

Author

Ok, adjusted at 4b525e4.

currentContentType = record.type;
} else {
fragments.append(record.fragment);
if (currentContentType != record.type && !tkMessages.atEnd()) {
Checking tkMessages.atEnd() before calling tkMessages.rollback() feels wrong because rollback() may change the value of atEnd().

Author

Fixed (47eb627).

@eduard-bagdasaryan

This change also limits accumulation of (specially crafted) TLS fragments.

To clarify: Before this change, such fragments would accumulate in an SBuf buffer, and parseMessages() is not called at all while accumulating. With this change, the fragments are still accumulated in tkMessages (by calling append()), but parseMessages() is called after getting a fragment(s). Do we really need to 'limit' such accumulation in tkMessages (e.g., by introducing a maximum number of accumulated tkMessages bytes)?

Instead, I added a tokenizer method to check whether
there are 'uncommitted' bytes.
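
That tokenizer method presumably follows the synced() idea floated in an earlier thread; a sketch, with a hypothetical name for the committed-offset member:

/// whether all parsed bytes have been committed (committed_ is illustrative)
bool synced() const { return parsed_ == committed_; }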

rousskov commented Jan 6, 2024

This change also limits accumulation of (specially crafted) TLS fragments.

To clarify: Before this change, such fragments would accumulate in an SBuf buffer, and parseMessages() is not called at all while accumulating. With this change, the fragments are still accumulated in tkMessages (by calling append()), but parseMessages() is called after getting a fragment(s).

Yes, I am aware. We are not saying that we stopped accumulation, only that we are now (properly) limiting it. Limited accumulation is fine, of course. Unlimited accumulation is an attack target.

Do we really need to 'limit' such accumulation in tkMessages

It is a good question. When I was writing the PR description text above, I had assumed that this PR introduced a "natural" limit -- we should not accumulate more than, roughly speaking, one (maximum-size) parsed TLS message plus one (maximum-size) TLS fragment. I still hope that assumption is correct, but I now wonder what the maximum size of a (parsed by Squid) TLS message is.

Please research that: Imagine that the message sender is trying to maximize every message field, including maximizing the number of elements in various lists (e.g. extensions) and individual field/element sizes. How big will that maximum-size message be? If the maximum TLS message size is more than, say, 10 megabytes, then we should artificially limit this accumulation as well...


eduard-bagdasaryan commented Jan 6, 2024

research that: Imagine that the message sender is trying to maximize every message field,

I performed a simple experiment, sending a lot of ctApplicationData fragments between ctHandshake and ctChangeCipherSpec. As I had expected, tkMessages.parsed() grew up to that value (~10M in my test). Note that this scenario has nothing to do with any TLS message limitations in Squid because ctApplicationData fragments are not parsed (they are ignored), only increasing the tkMessages buffer.

we should artificially limit this accumulation as well...

It looks like we should do this (hardcoding an arbitrary 10M limit and aborting the handshake?).
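
A hypothetical sketch of such a guard, checked after appending each fragment; the cap value and names are illustrative only:

static const uint64_t SupposedTlsAccumulationLimit = 10*1024*1024; // hypothetical 10M cap
// aborts the handshake (via the thrown exception) on excessive accumulation
Must(tkMessages.parsed() + tkMessages.leftovers().length() <= SupposedTlsAccumulationLimit);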

Though currently the caller Musts() that there is a non-empty record
fragment before calling this method, it should check this invariant
itself. We could add an assertion there, but it is better to delegate
this check to specific parsers (expecting a non-empty input).

rousskov commented Jan 6, 2024

research that: Imagine that the message sender is trying to maximize every message field,

I performed a simple experiment, sending a lot of ctApplicationData fragments between ctHandshake and ctChangeCipherSpec. As I had expected, tkMessages.parsed() grew up to that value (~10M in my test).

Ha, this is a different attack than the one I was thinking about! Both attacks are a concern:

  1. Accumulation of unparsed Tokenizer bytes (e.g., a read-append-rollback-parse-commit loop that keeps looping because a tk.area(100GB) call inside the loop keeps throwing InsufficientInput to request more and more bytes to finish parsing a single 100GB field while no bytes are being consume()d from Tokenizer). This is the attack I was thinking about when asking you to research its potential/impact on TLS handshake parsing code. How big can a single parsed (i.e. not skipped) TLS message get?
  2. Accumulation of parsed Tokenizer bytes (e.g., a read-append-rollback-parse-commit loop that keeps looping because tk.area(10Bytes) call inside the loop keeps consuming parsed field bytes but then throwing InsufficientInput to request more and more bytes to finish parsing an infinite sequence of 10-byte fields). This is the attack you have tested.

Both attacks grow the same BinaryTokenizer::data_ field. The difference is in whether BinaryTokenizer::parsed_ grows (it does not in the first attack).

It looks like we should do this (hardcoding the arbitrary 10M limitation and aborting the handshake?)

I am not sure yet because I am not sure whether the first attack is viable in the TLS handshake parsing context (that is why I asked you to research it). For the second attack, the best solution may be to adjust BinaryTokenizer to automatically forget committed data. To reduce memmove() overheads inside SBuf, BinaryTokenizer can forget when:

  • commit() happens after we parse all bytes accumulated so far -- there are no unparsed leftovers to memmove(), and, hence, BinaryTokenizer::data_ can be reset cheaply.

  • commit() happens after we accumulate at least, say, 1MB of parsed bytes -- memmove() may be necessary in this case, but we have parsed enough to amortize its cost.

The above BinaryTokenizer adjustment probably deserves a dedicated minimal PR, but let's research whether TLS handshake parsing code should also protect itself from the first attack before finalizing the scope of that minimal PR.
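
A sketch of that adjustment, using the data_ and parsed_ members named above and eliding commit()'s existing bookkeeping:

void
Parser::BinaryTokenizer::commit()
{
    // ... existing commit() bookkeeping elided ...
    if (parsed_ == data_.length()) {
        data_.clear(); // no unparsed leftovers; cheap reset, no memmove()
        parsed_ = 0;
    } else if (parsed_ >= 1024*1024) {
        data_.consume(parsed_); // memmove() the leftovers; cost amortized per ~1MB
        parsed_ = 0;
    }
}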

@eduard-bagdasaryan

How big can a single parsed (i.e. not skipped) TLS message get?

Each TLS message (one of the HandshakeType values, such as hskClientHello, hskServerHello, etc.) has a length field occupying 3 bytes, i.e., the message can theoretically be up to ~16M (2^24 - 1 bytes). However, these messages are limited by the sum of their field sizes; for example, hskServerHello consists of an extensions field (up to 64K) plus several other fields, 65607 bytes maximum. The ClientHello message has a similar limitation. I think the 16M maximum can be reached with hskCertificate messages (e.g., a certificate with a huge number of extensions). Squid does not analyze these messages, but it parses them anyway into Handshake::msg_body: msg_body = tk.pstring24(".msg_body");.

@eduard-bagdasaryan

performed a simple experiment, sending a lot of ctApplicationData fragments

In this experiment, I sent a ~10M block of ~100-byte ctApplicationData messages. The test took quite a long time - even more than the default connect_timeout allowed (1 min), so I had to increase it. Logs showed that Squid eventually slows down as it processes the next portion of data, e.g.:

2024/01/09 01:40:08.244| 24,7| BinaryTokenizer.cc(59) got: TLSPlaintext occupying 96 bytes @4326624 in 0x55806a534990;
2024/01/09 01:40:08.244| 24,7| SBuf.cc(160) rawSpace: reserving 91 for SBuf5698
2024/01/09 01:40:08.244| 24,8| SBuf.cc(880) cow: SBuf5698 new size:4101370
2024/01/09 01:40:08.244| 24,7| MemBlob.cc(130) syncSize: 4101279 was: 4101279
2024/01/09 01:40:08.244| 24,8| SBuf.cc(847) reAlloc: SBuf5698 new size: 4101370
2024/01/09 01:40:08.244| 24,9| MemBlob.cc(56) MemBlob: constructed, this=0x55806a51dd70 id=blob46001 reserveSize=4101370
2024/01/09 01:40:08.245| 24,8| MemBlob.cc(101) memAlloc: blob46001 memAlloc: requested=4101370, received=4101370
2024/01/09 01:40:08.247| 24,9| MemBlob.cc(82) ~MemBlob: destructed, this=0x55806a544c30 id=blob46000 capacity=4101279 size=4101279
2024/01/09 01:40:08.247| 24,7| SBuf.cc(859) reAlloc: SBuf5698 new store capacity: 4101370
2024/01/09 01:40:08.247| 24,8| SBuf.cc(38) SBuf: SBuf140992 created from id SBuf5698
2024/01/09 01:40:08.247| 24,7| BinaryTokenizer.cc(92) skipped: app data [fragment] occupying 91 bytes @4101279 in 0x55806a5349c8;

So this 96-byte processing step (when the buffer size grew to ~4M) takes ~3ms, while a similar block at the beginning runs in less than 1ms.


rousskov commented Jan 9, 2024

2024/01/09 01:40:08.244| 24,9| MemBlob.cc(56) MemBlob: constructed, this=0x55806a51dd70 id=blob46001 reserveSize=4101370
2024/01/09 01:40:08.245| 24,8| MemBlob.cc(101) memAlloc: blob46001 memAlloc: requested=4101370, received=4101370
2024/01/09 01:40:08.247| 24,9| MemBlob.cc(82) ~MemBlob: destructed, this=0x55806a544c30 id=blob46000 capacity=4101279 size=4101279
2024/01/09 01:40:08.247| 24,7| SBuf.cc(859) reAlloc: SBuf5698 new store capacity: 4101370

Interesting. It looks like we are spending most of those 3ms to receive new memory from malloc() or to copy the contents of our old buffer into the new memory blob. Please try to isolate the primary consumer(s). Adding some temporary debugging and either temporarily increasing debugs() timestamp precision or temporarily using Stopwatch to measure delays with sub-millisecond precision may help with that.

@eduard-bagdasaryan

Here is the debugging with increased precision:

2024/01/09 16:15:56.419865| 24,7| BinaryTokenizer.cc(59) got: TLSPlaintext occupying 96 bytes @4063488 in 0x559f241ac990;
2024/01/09 16:15:56.419884| 24,7| SBuf.cc(160) rawSpace: reserving 91 for SBuf5698
2024/01/09 16:15:56.419901| 24,8| SBuf.cc(880) cow: SBuf5698 new size:3851939
2024/01/09 16:15:56.419918| 24,7| MemBlob.cc(134) syncSize: 3851848 was: 3851848
2024/01/09 16:15:56.419936| 24,8| SBuf.cc(847) reAlloc: SBuf5698 new size: 3851939
2024/01/09 16:15:56.419971| 24,9| MemBlob.cc(56) MemBlob: constructed, this=0x559f241bcc30 id=blob43256 reserveSize=3851939
2024/01/09 16:15:56.420122| 24,8| MemBlob.cc(103) memAlloc: blob43256 memAlloc: requested=3851939, received=3851939
2024/01/09 16:15:56.420150| 24,7| MemBlob.cc(123) append: before memmove
2024/01/09 16:15:56.423803| 24,7| MemBlob.cc(125) append: after memmove
2024/01/09 16:15:56.423875| 24,9| MemBlob.cc(74) ~MemBlob: destructor started
2024/01/09 16:15:56.423905| 24,9| MemBlob.cc(84) ~MemBlob: destructed, this=0x559f24195d70 id=blob43255 capacity=3851848 size=3851848
2024/01/09 16:15:56.423923| 24,7| SBuf.cc(859) reAlloc: SBuf5698 new store capacity: 3851939

I also added a couple of debugging lines. So it is memmove() in MemBlob::append() that consumes these 3ms.
