-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid reallocation for String body extraction #2857
base: main
Are you sure you want to change the base?
Conversation
Looks like this requires us to raise our minimum version of |
let string = std::str::from_utf8(&bytes) | ||
.map_err(InvalidUtf8::from_err)? | ||
.to_owned(); | ||
let string = String::from_utf8(bytes.into()).map_err(InvalidUtf8::from_err)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't the bytes.into()
still reallocate the data? Unless there is some special code in the vtable of whatever representation of Bytes
that underlies this that checks the reference count and avoids a copy if it is 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that what this does? https://docs.rs/bytes/1.6.1/src/bytes/bytes.rs.html#1108-1128
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also briefly looked at this, and couldn't really figure it out. My feeling overall is that while it may not actually be more performant in some or all cases, clearly the bytes
authors specifically designed that impl to use the best possible implementation per specific Bytes
representation so it couldn't really hurt, and the new code in this PR is shorter than the previous too.
Of course, a benchmark that shows that the new implementation is faster would be best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I remember when I looked at bytes
, you have two ways Bytes
may be represented. Either with an owned Vec
or if there are at least two instances referencing the same allocation (even if disjunct), it gets promoted to an Arc
-backed storage.
But I think the bytes will already be cloned when the request body is collected so I think it should prevent one clone from happening.
Sorry I see, it requires a minimum version of 1.2 of |
1.2 is the minimum version that `Vec<u8>: From<Bytes>`.
The conversion from
Bytes
toString
does not need an extra allocation if we useBytes: Into<Vec<u8>>
andString::from_utf8
.