Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid reallocation for String body extraction #2857

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

yhx-12243
Copy link

The conversion from Bytes to String does not need an extra allocation if we use Bytes: Into<Vec<u8>> and String::from_utf8.

@jplatte
Copy link
Member

jplatte commented Jul 31, 2024

Looks like this requires us to raise our minimum version of bytes, can you bump it to 1.6.1 in axum-core/Cargo.toml?

let string = std::str::from_utf8(&bytes)
.map_err(InvalidUtf8::from_err)?
.to_owned();
let string = String::from_utf8(bytes.into()).map_err(InvalidUtf8::from_err)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't the bytes.into() still reallocate the data? Unless there is some special code in the vtable of whatever representation of Bytes that underlies this that checks the reference count and avoids a copy if it is 1?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also briefly looked at this, and couldn't really figure it out. My feeling overall is that while it may not actually be more performant in some or all cases, clearly the bytes authors specifically designed that impl to use the best possible implementation per specific Bytes representation so it couldn't really hurt, and the new code in this PR is shorter than the previous too.

Of course, a benchmark that shows that the new implementation is faster would be best.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I remember when I looked at bytes, you have two ways Bytes may be represented. Either with an owned Vec or if there are at least two instances referencing the same allocation (even if disjunct), it gets promoted to an Arc-backed storage.

But I think the bytes will already be cloned when the request body is collected so I think it should prevent one clone from happening.

@yhx-12243
Copy link
Author

yhx-12243 commented Jul 31, 2024

Looks like this requires us to raise our minimum version of bytes, can you bump it to 1.6.1 in axum-core/Cargo.toml?

Sorry I see, it requires a minimum version of 1.2 of bytes, it may indeed have some impact on those who used <1.2.

1.2 is the minimum version that `Vec<u8>: From<Bytes>`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants