Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

⚡ Skip replacing NAs during mapping when appropriate #6344

Merged
merged 1 commit into from
Mar 25, 2025

Conversation

teunbrand
Copy link
Collaborator

@teunbrand teunbrand commented Feb 24, 2025

This PR replaces #5032.

Briefly, it skips the NA replacment step if there are no NAs present.
Moreover vec_assign() is faster (and I presume more type stable) than ifelse().
I think it is more of an 'empirical' determination than relying on some palette attribute as is proposed in the linked PR.

Benchmarking this PR, note 4 continuous scales (x, y, size, colour):

devtools::load_all("~/packages/ggplot2/")
#> ℹ Loading ggplot2

p <- ggplot(diamonds, aes(carat, price, colour = depth, size = table)) +
  geom_point()

bench::mark(ggplot_build(p), min_iterations = 10)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 ggplot_build(p)   31.4ms   34.1ms      15.3    50.4MB     24.5

Created on 2025-02-24 with reprex v2.1.1

The same code on the main branch gives me:

#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 ggplot_build(p)   39.9ms   41.1ms      15.4    66.5MB     32.4

Created on 2025-02-24 with reprex v2.1.1

@teunbrand
Copy link
Collaborator Author

Second benchmark, this time with NAs present:

devtools::load_all("~/packages/ggplot2/")
#> ℹ Loading ggplot2

df <- diamonds
set.seed(42)
df$carat[sample(nrow(df), 5)] <- NA
df$price[sample(nrow(df), 50)] <- NA
df$depth[sample(nrow(df), 500)] <- NA
df$table[sample(nrow(df), 5000)] <- NA

p <- ggplot(df, aes(carat, price, colour = depth, size = table)) +
  geom_point()

bench::mark(ggplot_build(p), min_iterations = 10)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 ggplot_build(p)   32.4ms     36ms      14.7    53.9MB     28.0

Created on 2025-02-24 with reprex v2.1.1

Same for main:

#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 ggplot_build(p)   40.6ms   43.4ms      13.4    68.7MB     28.2

Copy link
Member

@thomasp85 thomasp85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@teunbrand teunbrand merged commit bbc4d53 into tidyverse:main Mar 25, 2025
13 checks passed
@teunbrand teunbrand deleted the skip_na_replacement branch March 25, 2025 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants