readv + MRG_RXBUF #4813

Draft
wants to merge 5 commits into
base: main

Conversation

ShadowCurse
Contributor

Changes

Switch the virtio-net device to use readv for reading from the tap device, and implement the VIRTIO_NET_F_MRG_RXBUF feature.

Reason

Improve performance of the network device.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • If a specific issue led to this PR, this PR closes the issue.
  • The description of changes is clear and encompassing.
  • Any required documentation changes (code and docs) are included in this
    PR.
  • API changes follow the Runbook for Firecracker API changes.
  • User-facing changes are mentioned in CHANGELOG.md.
  • All added/changed functionality is tested.
  • New TODOs link to an issue.
  • Commits meet
    contribution quality standards.

  • This functionality cannot be added in rust-vmm.

@ShadowCurse ShadowCurse self-assigned this Sep 19, 2024
@ShadowCurse ShadowCurse force-pushed the readv_mrg_rx_buf branch 3 times, most recently from 5662f31 to bd61dd5 September 20, 2024 10:23
ShadowCurse and others added 5 commits September 20, 2024 12:25
Add new `IovRingBuffer` ring buffer type that is tailored for holding
`struct iovec` objects that point to guest memory for IO. The
`struct iovec` objects represent the memory that the guest passed to
us as `Descriptors` in a VirtIO queue for performing some I/O operation.

We plan to use this type to describe the guest memory we have available
for doing network RX. This should help us optimize the reception of
data from the TAP device using `readv`, thus avoiding a memory copy.

Co-authored-by: Babis Chalios <[email protected]>
Signed-off-by: Egor Lazarchuk <[email protected]>
Add a generic fixed-size ring buffer type. The main
difference from a standard `VecDeque` is the use of
`u32` for indexes. This shrinks the size of the ring buffer
from 32 bytes for `VecDeque` to 24 bytes for `RingBuffer`.

Signed-off-by: Egor Lazarchuk <[email protected]>
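
The ring buffer described in the commit message above could look roughly like the following. This is a minimal sketch, not the code from this PR: the field names, the `Default + Clone` bound, and the method set are assumptions made for illustration. It only shows how a boxed slice plus two `u32` fields gives a 24-byte struct on 64-bit targets.

```rust
/// Illustrative fixed-capacity FIFO ring buffer indexed with `u32`.
/// Layout on 64-bit targets: 16-byte fat pointer + 2 x 4-byte indexes = 24 bytes.
pub struct RingBuffer<T> {
    items: Box<[T]>, // backing storage, allocated once
    start: u32,      // index of the oldest element
    len: u32,        // number of stored elements
}

impl<T: Default + Clone> RingBuffer<T> {
    /// Allocate storage for `capacity` elements up front.
    pub fn new(capacity: u32) -> Self {
        Self {
            items: vec![T::default(); capacity as usize].into_boxed_slice(),
            start: 0,
            len: 0,
        }
    }

    pub fn len(&self) -> u32 {
        self.len
    }

    pub fn is_empty(&self) -> bool {
        self.len == 0
    }

    pub fn is_full(&self) -> bool {
        self.len as usize == self.items.len()
    }

    /// Append an element; hand it back to the caller if the buffer is full.
    pub fn push_back(&mut self, item: T) -> Result<(), T> {
        if self.is_full() {
            return Err(item);
        }
        // Widen to u64 so the addition cannot overflow before the modulo.
        let idx = (self.start as u64 + self.len as u64) % self.items.len() as u64;
        self.items[idx as usize] = item;
        self.len += 1;
        Ok(())
    }

    /// Remove and return the oldest element, if any.
    pub fn pop_front(&mut self) -> Option<T> {
        if self.is_empty() {
            return None;
        }
        let item = std::mem::take(&mut self.items[self.start as usize]);
        self.start = ((self.start as u64 + 1) % self.items.len() as u64) as u32;
        self.len -= 1;
        Some(item)
    }
}
```

Because the capacity is fixed at construction, `push_back` never reallocates, which is the property a per-queue buffer on the RX path would want.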
Split the internals of `add_used` into 2 new functions:
`write_used_element` and `advance_used_ring`. These will be used in the
next commits to optimize the RX path of the net device.

Signed-off-by: Egor Lazarchuk <[email protected]>
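
To illustrate why the split is useful, here is a simplified model of a used ring. It is a sketch under assumptions: `UsedRing`, its fields, and the exact signatures of `write_used_element` and `advance_used_ring` are hypothetical, and a plain `Vec` stands in for the ring that really lives in guest memory. The point is that several elements can be written first and then published to the guest with a single index update.

```rust
use std::sync::atomic::{fence, Ordering};

/// One entry of the used ring: descriptor chain head and bytes written.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct UsedElem {
    pub id: u32,
    pub len: u32,
}

/// Simplified stand-in for a virtio used ring (hypothetical type).
pub struct UsedRing {
    ring: Vec<UsedElem>, // stand-in for the ring in guest memory
    idx: u16,            // stand-in for the guest-visible `used.idx`
}

impl UsedRing {
    /// `size` must be a non-zero power of two, as virtio queue sizes are.
    pub fn new(size: u16) -> Self {
        assert!(size.is_power_of_two());
        Self { ring: vec![UsedElem::default(); size as usize], idx: 0 }
    }

    /// Write an element `offset` slots past the current index.
    /// The guest does not see it until the index is advanced.
    pub fn write_used_element(&mut self, offset: u16, desc_index: u16, len: u32) {
        let slot = self.idx.wrapping_add(offset) as usize % self.ring.len();
        self.ring[slot] = UsedElem { id: u32::from(desc_index), len };
    }

    /// Publish `n` previously written elements with one index update.
    pub fn advance_used_ring(&mut self, n: u16) {
        // Element writes must be ordered before the index update.
        fence(Ordering::Release);
        self.idx = self.idx.wrapping_add(n);
    }

    /// The original one-shot operation, expressed with the two halves.
    pub fn add_used(&mut self, desc_index: u16, len: u32) {
        self.write_used_element(0, desc_index, len);
        self.advance_used_ring(1);
    }

    pub fn idx(&self) -> u16 {
        self.idx
    }

    pub fn elem(&self, slot: usize) -> UsedElem {
        self.ring[slot]
    }
}
```

A caller that spreads one packet over two descriptor chains would call `write_used_element(0, ..)` and `write_used_element(1, ..)`, then `advance_used_ring(2)`.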
Right now, we are performing two copies for writing a frame from the TAP
device into guest memory:
- read the frame into an internal array held by the Net device
- copy that array into the buffers of a DescriptorChain.

In order to avoid this double copy we can use the readv system call to
read directly from the TAP device into the buffers described by
DescriptorChain.

The main challenge with this is that DescriptorChain objects describe
memory that is at least 65562 bytes long when guest TSO4, TSO6 or UFO
are enabled, or 1526 bytes otherwise. Parsing the chain has an overhead
which we pay even if the frame we are receiving is much smaller than
these sizes.

PR firecracker-microvm#4748 reduced
the overheads involved with parsing DescriptorChain objects. To further
avoid this overhead, move the parsing of DescriptorChain objects out of
the hot path of process_rx() where we are actually receiving a frame
into process_rx_queue_event() where we get the notification that the
guest added new buffers for network RX.

Co-authored-by: Babis Chalios <[email protected]>
Signed-off-by: Egor Lazarchuk <[email protected]>
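
The idea in the commit message above (prepare the buffer list when the guest adds buffers, then do a single scatter read per frame) can be sketched with the standard library's `Read::read_vectored`, which maps to `readv` for files on Unix. This is an illustration only: `prepare_iovecs` and `receive_frame` are hypothetical helpers, plain `Vec<u8>` buffers stand in for guest memory, and the PR itself may call `readv` differently.

```rust
use std::io::{self, IoSliceMut, Read};

/// Build the iovec list once, outside the hot path. In the device this
/// would happen when the guest signals that new RX buffers are available.
pub fn prepare_iovecs<'a>(bufs: &'a mut [Vec<u8>]) -> Vec<IoSliceMut<'a>> {
    bufs.iter_mut()
        .map(|b| IoSliceMut::new(b.as_mut_slice()))
        .collect()
}

/// Hot path: one vectored read scatters the frame directly into the
/// prepared buffers, with no intermediate array and no second copy.
pub fn receive_frame<R: Read>(
    tap: &mut R,
    iovecs: &mut [IoSliceMut<'_>],
) -> io::Result<usize> {
    tap.read_vectored(iovecs)
}
```

For example, reading a 10-byte source into three 4-byte buffers fills the first two buffers completely and the first 2 bytes of the third, and `receive_frame` returns the total number of bytes read.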
The virtio-net device now implements the VIRTIO_NET_F_MRG_RXBUF feature,
which allows it to write a single packet into multiple descriptor chains.
The number of descriptor chains (also known as heads) is written into
the `virtio_net_hdr_v1` structure, which is located at the very beginning
of the packet.

Signed-off-by: Egor Lazarchuk <[email protected]>
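
As a rough sketch of the bookkeeping this implies: the device has to work out how many heads a frame spans and record that count in the header. The code below assumes the 12-byte `virtio_net_hdr_v1` layout from the VirtIO specification, where `num_buffers` is a little-endian `u16` at byte offset 10. The helper names are hypothetical and are not taken from this PR.

```rust
/// Size of `virtio_net_hdr_v1` in bytes.
pub const VNET_HDR_LEN: usize = 12;
/// Byte offset of the little-endian `num_buffers` field in that header.
const NUM_BUFFERS_OFFSET: usize = 10;

/// Return how many descriptor chains (heads) are needed to hold a frame of
/// `frame_len` bytes (header included), given the capacity of each available
/// chain in order. Returns `None` if the chains cannot hold the frame.
pub fn heads_needed(frame_len: usize, chain_capacities: &[usize]) -> Option<u16> {
    let mut remaining = frame_len;
    for (i, cap) in chain_capacities.iter().enumerate() {
        remaining = remaining.saturating_sub(*cap);
        if remaining == 0 {
            // The count must fit in the 16-bit header field.
            return if i < u16::MAX as usize { Some((i + 1) as u16) } else { None };
        }
    }
    None
}

/// Store the head count in the header at the start of `frame`.
/// Returns `false` if `frame` is too short to contain the header.
pub fn set_num_buffers(frame: &mut [u8], heads: u16) -> bool {
    if frame.len() < VNET_HDR_LEN {
        return false;
    }
    frame[NUM_BUFFERS_OFFSET..NUM_BUFFERS_OFFSET + 2]
        .copy_from_slice(&heads.to_le_bytes());
    true
}
```

With chains of 1526 bytes each, a 3000-byte frame needs two heads, so the device would write `2` into `num_buffers` before publishing both chains to the guest.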