-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Making BHTTPSerializer compliant to known-length messages #26
base: main
Are you sure you want to change the base?
Conversation
Motivation: As defined in [RFC9292](https://www.rfc-editor.org/rfc/rfc9292), you can choose to use either the known-length format or an indeterminate-length format. The purpose of this PR is to give the possibility for users to use the known-length format. Modifications: - Finite State Machine (FSM): Added an FSM to the serializer to define allowed state transitions. This prevents invalid sequences, such as having a Body HTTP Part followed by a Header HTTP Part. - BHTTPSerializer: Introduced a flag within BHTTPSerializer to determine whether the known-length or indeterminate-length format will be used. - Serialization of Known-Length Sections: Added methods to BHTTPSerializer specifically for serializing known-length sections. - Unit-tests: I updated unit tests to cover scenarios where we have out-of-order HTTP parts but also to verify the known-length implementation. Design Choices: - Buffer Reference: The output buffer is kept as a reference to optimize performance. Copying the entire buffer at the end of the serialization would be inefficient. - Inlining serializeContentChunk: I inlined the serializeContentChunk function to minimize overhead from function prologues and epilogues, as it is expected to be used frequently. - Buffers for Known-Length Format: For known-length formats, two buffers were introduced: chunkBuffer and fieldSectionBuffer. These buffers are used when the request/response consists of multiple parts. I did some researchs to find what could be the optimal initial capacities for those 2 buffers. I found out that I could set them to 500 and 700, respectively, based on data from the following sources: - SPDY Whitepaper for header field sizes: https://www.chromium.org/spdy/spdy-whitepaper - Arxiv paper on HTTP body size for body content size: https://arxiv.org/pdf/1405.2330 (If needed, I can provide more details on how these values were derived.) However, as underlined in the ByteBuffer implementation: https://github.com/apple/swift-nio/blob/main/Sources/NIOCore/ByteBuffer-core.swift, the initial capacity is set to: let newCapacity = capacity == 0 ? 0 : capacity.nextPowerOf2ClampedToMax() . Therefore, I decided to not use those values as it would very likely to be way too much. Given this, I decided not to use the predefined values, as they would likely be too large. Instead, I opted to rely on the initial size encountered when initializing the two buffers. - FSM Transition Definitions: The state transition definitions in the FSM are declared as static, with the intention that these definitions will be placed in the data section of the compiled binary. This should save memory but as I only used Swift for a couple of days now, I prefer to take distance. I tried to verify this by reviewing the assembly output here: https://godbolt.org/z/s88Kaxqn8 but I wasn't able to confirm with certainty due to the noise in the output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great start, thanks for this! I've left a few notes in the diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! I think we're doing well here, I've left a few notes in the diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, so far this is looking really nice. I've left a few stylistic notes in the diff, but it really isn't much.
Thank @Lukasa ! I truly appreciate it. If you have any other feedback, feel free to share, I’m really looking for ways to get better ! |
Motivation:
As defined in RFC9292, you can choose to use either the known-length format or an indeterminate-length format. The purpose of this PR is to give the possibility for users to use the known-length format.
Modifications:
Design Choices:
(If needed, I can provide more details on how these values were derived.)
However, as underlined in the ByteBuffer implementation: https://github.com/apple/swift-nio/blob/main/Sources/NIOCore/ByteBuffer-core.swift, the initial capacity is set to:
let newCapacity = capacity == 0 ? 0 : capacity.nextPowerOf2ClampedToMax()
. Therefore, I decided to not use those values as it would very likely to be way too much. Given this, I opted to rely on the first size encountered when initializing the two buffers.