Stabilize perf results #5064
base: main
Conversation
/bench_x64
/bench_x64
Change factor shows patch effect on x64 if merged compared to current head for main. Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / (Main_CT)
Averages (x64):
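(For reference, a minimal sketch of the arithmetic the bot reports, in Rust; `main_ct` and `patched_ct` are hypothetical clocktick totals, not numbers from this run. Scaling the same factor by 100 gives the percentage form discussed later in the thread.)

```rust
/// Change Factor = (Patched_CT - Main_CT) / Main_CT, as described in the
/// bot's output above.
fn change_factor(main_ct: f64, patched_ct: f64) -> f64 {
    (patched_ct - main_ct) / main_ct
}

fn main() {
    // Hypothetical clocktick totals for main and for the patched build.
    let (main_ct, patched_ct) = (1_000_000.0, 970_000.0);
    let factor = change_factor(main_ct, patched_ct);
    // A factor of -0.03 is equivalently a -3% change.
    println!("change factor: {:.4} ({:+.1}%)", factor, factor * 100.0);
}
```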
/bench_x64
/bench_x64
/bench_x64
Two ideas for bounding stability:
Change factor shows patch effect on x64 if merged compared to current head for main. Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / (Main_CT)
Averages (x64):
In my experience, even with dedicated hardware I've always had a lot of noise in time-based measurements, so for the long-term regression testing this is intended for, would it be possible to measure instructions retired instead of wall time (which I think clock cycles are more-or-less equivalent to)? That's what rust-lang/rust uses by default, and instructions are typically quite stable (although still not 100%). Also, as a minor thing, would it be possible to print the changes as %-based changes instead of factor-based changes?
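(As a minimal illustration of counting instructions retired from Rust, assuming the third-party `perf-event` crate (~v0.4), whose default counter is instructions retired; this is a sketch, not how the bench bot is actually wired up.)

```rust
use perf_event::Builder;

fn main() -> std::io::Result<()> {
    // The perf-event crate's default event is instructions retired.
    let mut counter = Builder::new().build()?;

    counter.enable()?;
    // ... the workload being measured goes here ...
    let v: Vec<u64> = (0..1_000u64).map(|i| i * i).collect();
    counter.disable()?;

    println!("{} instructions retired (len = {})", counter.read()?, v.len());
    Ok(())
}
```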
Change factor shows patch effect on x64 if merged compared to current head for main. Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / (Main_CT)
Averages (x64):
Change factor shows patch effect on x64 if merged compared to current head for main. Results are based on clocktick (CT) event cycles. Change Factor = (Patched_CT - Main_CT) / (Main_CT)
Averages (x64):
Seems fine to exclude instantiation. We have decent instantiation benchmarks in criterion anyways.
This is more something for the
Yeah, that's a good point actually; I agree. My main concern is making sure we have trustworthy results, and actually using the confidence-interval computation is the best way of doing that.
(And following on from that a bit more: I guess what I really want is to build up trust in the tool from first principles -- that's what I was trying to get at with the null-diff control. So perhaps this is a way we can validate the confidence-interval reporting once we get it integrated: if we submit an empty PR and benchmark it, we should see "no statistical difference" everywhere, or else we have a stats or configuration/setup bug.)
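(A rough sketch of that kind of check, under a normal-approximation assumption rather than whatever statistics the tooling actually implements: compare two samples and ask whether the 99% confidence interval for the difference of means excludes zero. For a null-diff control, both samples come from the same build, so the check should report no significant difference most of the time.)

```rust
/// Returns true if the two-sided 99% normal-approximation confidence interval
/// for the difference of means (b - a) excludes zero, i.e. a "statistically
/// significant" difference at roughly the 1% level.
fn significant_at_1pct(a: &[f64], b: &[f64]) -> bool {
    let mean = |s: &[f64]| s.iter().sum::<f64>() / s.len() as f64;
    let var = |s: &[f64], m: f64| {
        s.iter().map(|x| (x - m).powi(2)).sum::<f64>() / (s.len() as f64 - 1.0)
    };
    let (ma, mb) = (mean(a), mean(b));
    let se = (var(a, ma) / a.len() as f64 + var(b, mb) / b.len() as f64).sqrt();
    let z = 2.576; // two-sided 99% critical value under the normal approximation
    let diff = mb - ma;
    let (lo, hi) = (diff - z * se, diff + z * se);
    !(lo <= 0.0 && 0.0 <= hi)
}

fn main() {
    // Null-diff control: both samples drawn from the same configuration, so we
    // expect no significant difference to be reported most of the time.
    let main_run = [100.0, 101.0, 99.5, 100.4, 100.1];
    let empty_pr_run = [100.2, 99.8, 100.6, 100.0, 99.9];
    println!("significant? {}", significant_at_1pct(&main_run, &empty_pr_run));
}
```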
(Note that the probability of a false positive is 1% (due to our default significance level), but that is per test, and we run 3 tests per Wasm input, so with only ~33 Wasm inputs we already expect about one false positive per benchmark run (33 × 3 × 0.01 ≈ 1). One of the many reasons to choose our Wasm inputs carefully.)
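(The arithmetic behind that estimate, spelled out with the numbers from the comment:)

```rust
fn main() {
    let alpha = 0.01;           // per-test false-positive rate (default significance level)
    let tests_per_input = 3.0;  // tests run per Wasm input
    let wasm_inputs = 33.0;     // roughly 1 / (alpha * tests_per_input)
    let expected_false_positives = alpha * tests_per_input * wasm_inputs;
    // 0.01 * 3 * 33 ≈ 1 expected false positive per benchmark run.
    println!("expected false positives per run: {:.2}", expected_false_positives);
}
```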
No description provided.