perf: improve to_file_path() #1018

Merged: 6 commits, Feb 8, 2025
14 changes: 14 additions & 0 deletions url/benches/parse_url.rs
@@ -82,6 +82,19 @@ fn punycode_rtl(bench: &mut Bencher) {
     bench.iter(|| black_box(url).parse::<Url>().unwrap());
 }

+fn url_to_file_path(bench: &mut Bencher) {
+    let url = if cfg!(windows) {
+        "file:///C:/dir/next_dir/sub_sub_dir/testing/testing.json"
+    } else {
+        "file:///data/dir/next_dir/sub_sub_dir/testing/testing.json"
+    };
+    let url = url.parse::<Url>().unwrap();
+
+    bench.iter(|| {
+        black_box(url.to_file_path().unwrap());
+    });
+}
+
 benchmark_group!(
     benches,
     short,
@@ -95,5 +108,6 @@ benchmark_group!(
     punycode_ltr,
     unicode_rtl,
     punycode_rtl,
+    url_to_file_path
 );
 benchmark_main!(benches);
61 changes: 48 additions & 13 deletions url/src/lib.rs
@@ -2720,7 +2720,26 @@
                 _ => return Err(()),
             };

-            return file_url_segments_to_pathbuf(host, segments);
+            let str_len = self.as_str().len();
+            let estimated_capacity = if cfg!(target_os = "redox") {
+                let scheme_len = self.scheme().len();
+                let file_scheme_len = "file".len();
+                // remove only // because it still has file:
+                if scheme_len < file_scheme_len {
+                    let scheme_diff = file_scheme_len - scheme_len;

Review comment (Collaborator):
This still wraps around when scheme is longer than 4.

Reply from @dsherret (Contributor Author), Feb 7, 2025:
There's a check above for this:

    if scheme_len < file_scheme_len { // if 5 < 4 {

(or maybe I'm misreading)

+                    (str_len + scheme_diff).saturating_sub(2)
+                } else {
+                    let scheme_diff = scheme_len - file_scheme_len;
+                    str_len.saturating_sub(scheme_diff + 2)
+                }
+            } else if cfg!(windows) {
+                // remove scheme: - has possible \\ for hostname
+                str_len.saturating_sub(self.scheme().len() + 1)
+            } else {
+                // remove scheme://
+                str_len.saturating_sub(self.scheme().len() + 3)
+            };
+            return file_url_segments_to_pathbuf(estimated_capacity, host, segments);
         }
         Err(())
     }
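
The capacity estimate in this hunk can be read as a pure function of two lengths. The following is a minimal standalone sketch of that arithmetic, not the crate's code: the `Target` enum and the `estimate_capacity` function are hypothetical names introduced here to replace the inline `cfg!` checks, and the inputs are plain lengths rather than a `Url`.

```rust
/// Hypothetical stand-in for the `cfg!` checks used in the diff.
#[derive(Clone, Copy)]
enum Target {
    Redox,
    Windows,
    Other,
}

/// Estimate the byte capacity of the resulting path from the serialized URL
/// length (`str_len`) and the length of its scheme (`scheme_len`).
fn estimate_capacity(target: Target, str_len: usize, scheme_len: usize) -> usize {
    let file_scheme_len = "file".len();
    match target {
        Target::Redox => {
            // The output keeps a "file:" prefix, so only "//" is dropped and
            // the length is adjusted by the difference from "file".
            if scheme_len < file_scheme_len {
                let scheme_diff = file_scheme_len - scheme_len;
                (str_len + scheme_diff).saturating_sub(2)
            } else {
                // Taken whenever scheme_len >= 4, so this subtraction cannot wrap.
                let scheme_diff = scheme_len - file_scheme_len;
                str_len.saturating_sub(scheme_diff + 2)
            }
        }
        // Drop "scheme:"; the two slashes may become a leading `\\` before a host.
        Target::Windows => str_len.saturating_sub(scheme_len + 1),
        // Drop "scheme://".
        Target::Other => str_len.saturating_sub(scheme_len + 3),
    }
}
```

For `file:///data/dir/testing.json` (29 bytes, scheme length 4), `Target::Other` gives 22, which is exactly the length of `/data/dir/testing.json`. Both subtractions on `scheme_len` sit behind the comparison, which is the point made in the reply to the review comment.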
@@ -3030,6 +3049,7 @@
     any(unix, target_os = "redox", target_os = "wasi", target_os = "hermit")
 ))]
 fn file_url_segments_to_pathbuf(
+    estimated_capacity: usize,
     host: Option<&str>,
     segments: str::Split<'_, char>,
 ) -> Result<PathBuf, ()> {
@@ -3047,11 +3067,11 @@
         return Err(());
     }

-    let mut bytes = if cfg!(target_os = "redox") {
-        b"file:".to_vec()
-    } else {
-        Vec::new()
-    };
+    let mut bytes = Vec::new();
+    bytes.try_reserve(estimated_capacity).map_err(|_| ())?;
+    if cfg!(target_os = "redox") {
+        bytes.extend(b"file:");
+    }

     for segment in segments {
         bytes.push(b'/');
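
The added lines reserve the estimated capacity once and then append into the same buffer. A minimal sketch of that pattern using only the standard library follows. `build_path_bytes` and its parameters are hypothetical names for illustration, and percent-decoding is left out, so the segments are assumed to be decoded already.

```rust
use std::collections::TryReserveError;

/// Join already-decoded segments into `/a/b/c` bytes after reserving the
/// estimated capacity up front. Hypothetical helper, for illustration only.
fn build_path_bytes(
    estimated_capacity: usize,
    prefix: &[u8],
    segments: &[&str],
) -> Result<Vec<u8>, TryReserveError> {
    let mut bytes = Vec::new();
    // `try_reserve` reports an allocation failure as an error, where
    // `Vec::with_capacity` would panic or abort.
    bytes.try_reserve(estimated_capacity)?;
    bytes.extend_from_slice(prefix);
    for segment in segments {
        bytes.push(b'/');
        bytes.extend_from_slice(segment.as_bytes());
    }
    Ok(bytes)
}
```

If the estimate is at least as large as the final length, the loop never reallocates. If it is too small, the `Vec` still grows as usual, so a low estimate costs time but does not affect the result.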
@@ -3083,22 +3103,27 @@

 #[cfg(all(feature = "std", windows))]
 fn file_url_segments_to_pathbuf(
+    estimated_capacity: usize,
     host: Option<&str>,
     segments: str::Split<char>,
 ) -> Result<PathBuf, ()> {
-    file_url_segments_to_pathbuf_windows(host, segments)
+    file_url_segments_to_pathbuf_windows(estimated_capacity, host, segments)
 }

 // Build this unconditionally to alleviate https://github.com/servo/rust-url/issues/102
 #[cfg(feature = "std")]
 #[cfg_attr(not(windows), allow(dead_code))]
 fn file_url_segments_to_pathbuf_windows(

Comment from @dsherret (Contributor Author), Jan 27, 2025:
Before (1000 iterations within each bench iteration):

    test url_to_file_path ... bench: 437,555 ns/iter (+/- 11,519)

After:

    test url_to_file_path ... bench: 119,461 ns/iter (+/- 5,927)
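
To put those two figures side by side, here is a small sketch of the arithmetic. It assumes the note means that each benchmark iteration made 1000 calls to `to_file_path()`. The helper names are hypothetical.

```rust
/// Per-call time in nanoseconds, given the reported ns per benchmark
/// iteration and the number of calls made inside each iteration.
fn ns_per_call(ns_per_iter: f64, calls_per_iter: f64) -> f64 {
    ns_per_iter / calls_per_iter
}

/// Ratio of the old time to the new time (values above 1.0 mean faster).
fn speedup(before_ns: f64, after_ns: f64) -> f64 {
    before_ns / after_ns
}
```

Under that assumption the quoted numbers work out to roughly 438 ns per call before and 119 ns after, a speedup of about 3.7x.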

+    estimated_capacity: usize,
     host: Option<&str>,
     mut segments: str::Split<'_, char>,
 ) -> Result<PathBuf, ()> {
-    use percent_encoding::percent_decode;
-    let mut string = if let Some(host) = host {
-        r"\\".to_owned() + host
+    use percent_encoding::percent_decode_str;
+    let mut string = String::new();
+    string.try_reserve(estimated_capacity).map_err(|_| ())?;
+    if let Some(host) = host {
+        string.push_str(r"\\");
+        string.push_str(host);
     } else {
         let first = segments.next().ok_or(())?;

@@ -3108,7 +3133,7 @@
                     return Err(());
                 }

-                first.to_owned()
+                string.push_str(first);
             }

             4 => {
@@ -3120,7 +3145,8 @@
                     return Err(());
                 }

-                first[0..1].to_owned() + ":"
+                string.push_str(&first[0..1]);
+                string.push(':');
             }

             _ => return Err(()),
@@ -3131,11 +3157,20 @@
         string.push('\\');

         // Currently non-unicode windows paths cannot be represented
-        match String::from_utf8(percent_decode(segment.as_bytes()).collect()) {
+        match percent_decode_str(segment).decode_utf8() {
             Ok(s) => string.push_str(&s),
             Err(..) => return Err(()),
         }
     }
+    // ensure our estimated capacity was good
+    if cfg!(test) {
+        debug_assert!(
+            string.len() <= estimated_capacity,
+            "len: {}, capacity: {}",
+            string.len(),
+            estimated_capacity
+        );
+    }
     let path = PathBuf::from(string);
     debug_assert!(
         path.is_absolute(),
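
The Windows branch replaces several `to_owned()` and `+` concatenations with pushes into one pre-reserved `String`, then checks the estimate afterwards. A minimal sketch of that shape follows. `build_windows_path` is a hypothetical helper, not the crate's function: it assumes segments are already percent-decoded, skips drive-letter validation, and uses `drive` only when `host` is `None`.

```rust
use std::collections::TryReserveError;

/// Assemble a Windows-style path string such as `C:\dir\file.txt` by pushing
/// into a single pre-reserved `String`. Hypothetical helper for illustration.
fn build_windows_path(
    estimated_capacity: usize,
    host: Option<&str>,
    drive: &str,
    segments: &[&str],
) -> Result<String, TryReserveError> {
    let mut string = String::new();
    string.try_reserve(estimated_capacity)?;
    match host {
        // UNC form: \\host\share\...
        Some(host) => {
            string.push_str(r"\\");
            string.push_str(host);
        }
        // Drive form: C:\...
        None => string.push_str(drive),
    }
    for segment in segments {
        string.push('\\');
        string.push_str(segment);
    }
    // Mirrors the diff's sanity check that the estimate was large enough.
    debug_assert!(
        string.len() <= estimated_capacity,
        "len: {}, capacity: {}",
        string.len(),
        estimated_capacity
    );
    Ok(string)
}
```

For `file:///C:/dir/testing.json` the Windows estimate is the URL length minus `"file:".len()`, which is 22, and the assembled `C:\dir\testing.json` is 19 bytes, so the assertion holds with a little slack.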