[Performance] Remove list against list check during set #954

Open
wants to merge 3 commits into base: gh/vmoens/9/base

vmoens (Contributor) commented Aug 9, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: 9fb608e476b16a323ebbe8198e2ca94463007f58
Pull Request resolved: #954
facebook-github-bot added the "CLA Signed" label Aug 9, 2024

github-actions bot commented Aug 9, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}14$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change)
test_plain_set_nested 40.1150μs 19.7383μs 50.6629 KOps/s 49.4665 KOps/s $\color{#35bf28}+2.42\%$
test_plain_set_stack_nested 46.4980μs 20.2945μs 49.2744 KOps/s 49.3889 KOps/s $\color{#d91a1a}-0.23\%$
test_plain_set_nested_inplace 53.3890μs 21.4868μs 46.5402 KOps/s 46.1589 KOps/s $\color{#35bf28}+0.83\%$
test_plain_set_stack_nested_inplace 53.9020μs 21.1621μs 47.2543 KOps/s 46.5852 KOps/s $\color{#35bf28}+1.44\%$
test_items 21.8310μs 4.0399μs 247.5284 KOps/s 230.5434 KOps/s $\textbf{\color{#35bf28}+7.37\%}$
test_items_nested 0.7802ms 0.3595ms 2.7814 KOps/s 2.7969 KOps/s $\color{#d91a1a}-0.55\%$
test_items_nested_locked 0.5382ms 0.3567ms 2.8036 KOps/s 2.7812 KOps/s $\color{#35bf28}+0.81\%$
test_items_nested_leaf 0.1307ms 68.4107μs 14.6176 KOps/s 14.3081 KOps/s $\color{#35bf28}+2.16\%$
test_items_stack_nested 0.6691ms 0.3617ms 2.7646 KOps/s 2.7581 KOps/s $\color{#35bf28}+0.24\%$
test_items_stack_nested_leaf 0.1701ms 68.4680μs 14.6054 KOps/s 13.8080 KOps/s $\textbf{\color{#35bf28}+5.77\%}$
test_items_stack_nested_locked 0.5432ms 0.3598ms 2.7791 KOps/s 2.7542 KOps/s $\color{#35bf28}+0.90\%$
test_keys 22.4120μs 3.6950μs 270.6362 KOps/s 281.3704 KOps/s $\color{#d91a1a}-3.81\%$
test_keys_nested 0.2149ms 0.1008ms 9.9186 KOps/s 9.9594 KOps/s $\color{#d91a1a}-0.41\%$
test_keys_nested_locked 1.9710ms 0.1049ms 9.5312 KOps/s 9.3676 KOps/s $\color{#35bf28}+1.75\%$
test_keys_nested_leaf 0.1724ms 82.7978μs 12.0776 KOps/s 11.9943 KOps/s $\color{#35bf28}+0.69\%$
test_keys_stack_nested 0.1669ms 0.1003ms 9.9696 KOps/s 9.9573 KOps/s $\color{#35bf28}+0.12\%$
test_keys_stack_nested_leaf 0.1462ms 82.4668μs 12.1261 KOps/s 12.1244 KOps/s $\color{#35bf28}+0.01\%$
test_keys_stack_nested_locked 0.2150ms 0.1065ms 9.3907 KOps/s 9.5643 KOps/s $\color{#d91a1a}-1.82\%$
test_values 12.9216μs 1.0982μs 910.5636 KOps/s 900.0454 KOps/s $\color{#35bf28}+1.17\%$
test_values_nested 0.1196ms 72.3659μs 13.8187 KOps/s 13.8428 KOps/s $\color{#d91a1a}-0.17\%$
test_values_nested_locked 0.1328ms 72.9082μs 13.7159 KOps/s 13.8559 KOps/s $\color{#d91a1a}-1.01\%$
test_values_nested_leaf 0.1362ms 61.4192μs 16.2816 KOps/s 16.0987 KOps/s $\color{#35bf28}+1.14\%$
test_values_stack_nested 0.1284ms 73.5232μs 13.6012 KOps/s 13.9013 KOps/s $\color{#d91a1a}-2.16\%$
test_values_stack_nested_leaf 0.1131ms 61.9023μs 16.1545 KOps/s 16.9245 KOps/s $\color{#d91a1a}-4.55\%$
test_values_stack_nested_locked 0.1308ms 73.2136μs 13.6587 KOps/s 13.9002 KOps/s $\color{#d91a1a}-1.74\%$
test_membership 14.2470μs 0.8653μs 1.1557 MOps/s 1.1593 MOps/s $\color{#d91a1a}-0.31\%$
test_membership_nested 18.2350μs 2.7370μs 365.3684 KOps/s 354.4081 KOps/s $\color{#35bf28}+3.09\%$
test_membership_nested_leaf 27.9620μs 2.7241μs 367.0980 KOps/s 361.5868 KOps/s $\color{#35bf28}+1.52\%$
test_membership_stacked_nested 26.2900μs 2.7221μs 367.3620 KOps/s 360.2826 KOps/s $\color{#35bf28}+1.96\%$
test_membership_stacked_nested_leaf 26.2790μs 2.7283μs 366.5230 KOps/s 356.9203 KOps/s $\color{#35bf28}+2.69\%$
test_membership_nested_last 24.9770μs 3.9752μs 251.5590 KOps/s 248.8915 KOps/s $\color{#35bf28}+1.07\%$
test_membership_nested_leaf_last 27.3620μs 3.9141μs 255.4896 KOps/s 250.7186 KOps/s $\color{#35bf28}+1.90\%$
test_membership_stacked_nested_last 26.1890μs 3.9247μs 254.7972 KOps/s 218.3794 KOps/s $\textbf{\color{#35bf28}+16.68\%}$
test_membership_stacked_nested_leaf_last 24.3260μs 3.8868μs 257.2782 KOps/s 215.4700 KOps/s $\textbf{\color{#35bf28}+19.40\%}$
test_nested_getleaf 34.2640μs 10.6373μs 94.0088 KOps/s 93.3134 KOps/s $\color{#35bf28}+0.75\%$
test_nested_get 31.6200μs 10.1316μs 98.7009 KOps/s 98.0295 KOps/s $\color{#35bf28}+0.68\%$
test_stacked_getleaf 33.2630μs 10.6685μs 93.7343 KOps/s 93.5372 KOps/s $\color{#35bf28}+0.21\%$
test_stacked_get 30.6270μs 10.1489μs 98.5333 KOps/s 96.6882 KOps/s $\color{#35bf28}+1.91\%$
test_nested_getitemleaf 54.2020μs 10.9607μs 91.2352 KOps/s 89.4542 KOps/s $\color{#35bf28}+1.99\%$
test_nested_getitem 30.9880μs 10.1212μs 98.8022 KOps/s 97.1118 KOps/s $\color{#35bf28}+1.74\%$
test_stacked_getitemleaf 33.9840μs 10.8612μs 92.0706 KOps/s 90.9896 KOps/s $\color{#35bf28}+1.19\%$
test_stacked_getitem 30.9080μs 10.0860μs 99.1474 KOps/s 97.1304 KOps/s $\color{#35bf28}+2.08\%$
test_lock_nested 83.4084ms 0.5750ms 1.7392 KOps/s 2.0848 KOps/s $\textbf{\color{#d91a1a}-16.58\%}$
test_lock_stack_nested 0.6984ms 0.4535ms 2.2050 KOps/s 2.2707 KOps/s $\color{#d91a1a}-2.89\%$
test_unlock_nested 85.3088ms 0.4933ms 2.0273 KOps/s 2.4705 KOps/s $\textbf{\color{#d91a1a}-17.94\%}$
test_unlock_stack_nested 0.8051ms 0.3708ms 2.6971 KOps/s 2.7645 KOps/s $\color{#d91a1a}-2.44\%$
test_flatten_speed 0.1849ms 89.9147μs 11.1217 KOps/s 11.3814 KOps/s $\color{#d91a1a}-2.28\%$
test_unflatten_speed 1.0302ms 0.4563ms 2.1913 KOps/s 2.1855 KOps/s $\color{#35bf28}+0.27\%$
test_common_ops 4.5597ms 1.0761ms 929.2457 Ops/s 899.3570 Ops/s $\color{#35bf28}+3.32\%$
test_creation 38.5730μs 2.1173μs 472.2923 KOps/s 479.8236 KOps/s $\color{#d91a1a}-1.57\%$
test_creation_empty 45.6060μs 16.4156μs 60.9176 KOps/s 57.4417 KOps/s $\textbf{\color{#35bf28}+6.05\%}$
test_creation_nested_1 54.2920μs 19.7473μs 50.6399 KOps/s 50.0200 KOps/s $\color{#35bf28}+1.24\%$
test_creation_nested_2 78.8280μs 24.0381μs 41.6007 KOps/s 40.6807 KOps/s $\color{#35bf28}+2.26\%$
test_clone 65.4330μs 16.7967μs 59.5357 KOps/s 59.7133 KOps/s $\color{#d91a1a}-0.30\%$
test_getitem[int] 1.1702ms 16.5771μs 60.3242 KOps/s 61.7555 KOps/s $\color{#d91a1a}-2.32\%$
test_getitem[slice_int] 0.1585ms 30.9906μs 32.2679 KOps/s 33.1283 KOps/s $\color{#d91a1a}-2.60\%$
test_getitem[range] 0.2089ms 56.8147μs 17.6011 KOps/s 17.5401 KOps/s $\color{#35bf28}+0.35\%$
test_getitem[tuple] 0.1416ms 25.7417μs 38.8475 KOps/s 40.9821 KOps/s $\textbf{\color{#d91a1a}-5.21\%}$
test_getitem[list] 0.1787ms 52.3391μs 19.1062 KOps/s 18.9491 KOps/s $\color{#35bf28}+0.83\%$
test_setitem_dim[int] 55.6340μs 32.2247μs 31.0321 KOps/s 31.1241 KOps/s $\color{#d91a1a}-0.30\%$
test_setitem_dim[slice_int] 0.1441ms 62.6223μs 15.9687 KOps/s 16.7338 KOps/s $\color{#d91a1a}-4.57\%$
test_setitem_dim[range] 0.1359ms 84.1587μs 11.8823 KOps/s 12.1824 KOps/s $\color{#d91a1a}-2.46\%$
test_setitem_dim[tuple] 95.7400μs 49.1811μs 20.3330 KOps/s 20.6509 KOps/s $\color{#d91a1a}-1.54\%$
test_setitem 93.9060μs 28.2929μs 35.3446 KOps/s 33.7670 KOps/s $\color{#35bf28}+4.67\%$
test_set 84.5690μs 27.5319μs 36.3215 KOps/s 35.2861 KOps/s $\color{#35bf28}+2.93\%$
test_set_shared 1.2832ms 0.2112ms 4.7353 KOps/s 4.7759 KOps/s $\color{#d91a1a}-0.85\%$
test_update 0.1311ms 34.2261μs 29.2175 KOps/s 28.7204 KOps/s $\color{#35bf28}+1.73\%$
test_update_nested 0.1187ms 43.8452μs 22.8075 KOps/s 22.4017 KOps/s $\color{#35bf28}+1.81\%$
test_update__nested 86.2620μs 33.8968μs 29.5013 KOps/s 29.2102 KOps/s $\color{#35bf28}+1.00\%$
test_set_nested 0.1273ms 30.3037μs 32.9992 KOps/s 31.9697 KOps/s $\color{#35bf28}+3.22\%$
test_set_nested_new 0.1330ms 35.2767μs 28.3473 KOps/s 27.5469 KOps/s $\color{#35bf28}+2.91\%$
test_select 0.1516ms 52.4972μs 19.0486 KOps/s 18.9478 KOps/s $\color{#35bf28}+0.53\%$
test_select_nested 0.1149ms 59.2944μs 16.8650 KOps/s 16.6737 KOps/s $\color{#35bf28}+1.15\%$
test_exclude_nested 0.1650ms 75.0296μs 13.3281 KOps/s 13.2260 KOps/s $\color{#35bf28}+0.77\%$
test_empty[True] 0.4762ms 0.3167ms 3.1578 KOps/s 3.1281 KOps/s $\color{#35bf28}+0.95\%$
test_empty[False] 9.8535μs 1.1703μs 854.4639 KOps/s 816.8465 KOps/s $\color{#35bf28}+4.61\%$
test_unbind_speed 0.6077ms 0.3053ms 3.2750 KOps/s 3.4478 KOps/s $\textbf{\color{#d91a1a}-5.01\%}$
test_unbind_speed_stack0 0.4739ms 0.2974ms 3.3628 KOps/s 3.5477 KOps/s $\textbf{\color{#d91a1a}-5.21\%}$
test_unbind_speed_stack1 87.2242ms 0.8038ms 1.2442 KOps/s 1.5362 KOps/s $\textbf{\color{#d91a1a}-19.01\%}$
test_split 87.1550ms 2.1521ms 464.6529 Ops/s 476.1876 Ops/s $\color{#d91a1a}-2.42\%$
test_chunk 3.0461ms 1.9780ms 505.5665 Ops/s 466.7409 Ops/s $\textbf{\color{#35bf28}+8.32\%}$
test_creation[device0] 0.3829ms 0.1185ms 8.4362 KOps/s 8.7649 KOps/s $\color{#d91a1a}-3.75\%$
test_creation_from_tensor 0.2904ms 0.1166ms 8.5762 KOps/s 8.6368 KOps/s $\color{#d91a1a}-0.70\%$
test_add_one[memmap_tensor0] 0.1734ms 7.3864μs 135.3832 KOps/s 143.0983 KOps/s $\textbf{\color{#d91a1a}-5.39\%}$
test_contiguous[memmap_tensor0] 24.0450μs 1.8929μs 528.3024 KOps/s 532.4148 KOps/s $\color{#d91a1a}-0.77\%$
test_stack[memmap_tensor0] 49.0420μs 5.8472μs 171.0223 KOps/s 184.8916 KOps/s $\textbf{\color{#d91a1a}-7.50\%}$
test_memmaptd_index 1.0773ms 0.3933ms 2.5425 KOps/s 2.6050 KOps/s $\color{#d91a1a}-2.40\%$
test_memmaptd_index_astensor 0.9287ms 0.4709ms 2.1235 KOps/s 2.1617 KOps/s $\color{#d91a1a}-1.76\%$
test_memmaptd_index_op 91.4908ms 1.0751ms 930.1229 Ops/s 1.0144 KOps/s $\textbf{\color{#d91a1a}-8.31\%}$
test_serialize_model 0.1230s 0.1158s 8.6391 Ops/s 8.2147 Ops/s $\textbf{\color{#35bf28}+5.17\%}$
test_serialize_model_pickle 0.4587s 0.3899s 2.5645 Ops/s 2.5436 Ops/s $\color{#35bf28}+0.82\%$
test_serialize_weights 0.1228s 0.1153s 8.6764 Ops/s 8.6851 Ops/s $\color{#d91a1a}-0.10\%$
test_serialize_weights_returnearly 0.2398s 0.1687s 5.9262 Ops/s 6.2730 Ops/s $\textbf{\color{#d91a1a}-5.53\%}$
test_serialize_weights_pickle 0.9513s 0.6515s 1.5350 Ops/s 2.5138 Ops/s $\textbf{\color{#d91a1a}-38.94\%}$
test_serialize_weights_filesystem 0.1450s 0.1385s 7.2192 Ops/s 7.1314 Ops/s $\color{#35bf28}+1.23\%$
test_serialize_model_filesystem 0.2341s 0.1516s 6.5942 Ops/s 6.6695 Ops/s $\color{#d91a1a}-1.13\%$
test_reshape_pytree 0.1105ms 38.9120μs 25.6990 KOps/s 25.8018 KOps/s $\color{#d91a1a}-0.40\%$
test_reshape_td 83.5260μs 44.0732μs 22.6895 KOps/s 22.0447 KOps/s $\color{#35bf28}+2.93\%$
test_view_pytree 84.6490μs 37.9520μs 26.3490 KOps/s 26.1419 KOps/s $\color{#35bf28}+0.79\%$
test_view_td 0.1132ms 49.8092μs 20.0766 KOps/s 20.1108 KOps/s $\color{#d91a1a}-0.17\%$
test_unbind_pytree 94.0170μs 35.6310μs 28.0654 KOps/s 27.8812 KOps/s $\color{#35bf28}+0.66\%$
test_unbind_td 0.3375ms 44.9787μs 22.2327 KOps/s 23.1235 KOps/s $\color{#d91a1a}-3.85\%$
test_split_pytree 83.8180μs 37.6583μs 26.5545 KOps/s 27.0081 KOps/s $\color{#d91a1a}-1.68\%$
test_split_td 0.4676ms 56.4765μs 17.7065 KOps/s 17.7346 KOps/s $\color{#d91a1a}-0.16\%$
test_add_pytree 0.1006ms 44.3564μs 22.5447 KOps/s 22.5632 KOps/s $\color{#d91a1a}-0.08\%$
test_add_td 0.1484ms 80.3130μs 12.4513 KOps/s 12.4171 KOps/s $\color{#35bf28}+0.27\%$
test_compile_add_one_nested[tensordict-compile] 0.1342ms 59.1002μs 16.9204 KOps/s 17.5723 KOps/s $\color{#d91a1a}-3.71\%$
test_compile_add_one_nested[tensordict-eager] 0.2949ms 0.1746ms 5.7279 KOps/s 5.6538 KOps/s $\color{#35bf28}+1.31\%$
test_compile_add_one_nested[pytree-compile] 0.1317ms 57.3965μs 17.4227 KOps/s 17.6222 KOps/s $\color{#d91a1a}-1.13\%$
test_compile_add_one_nested[pytree-eager] 0.2580ms 0.1394ms 7.1732 KOps/s 7.2673 KOps/s $\color{#d91a1a}-1.30\%$
test_compile_copy_nested[tensordict-compile] 63.5690μs 21.6542μs 46.1803 KOps/s 46.7066 KOps/s $\color{#d91a1a}-1.13\%$
test_compile_copy_nested[tensordict-eager] 0.1566ms 66.3453μs 15.0727 KOps/s 14.8424 KOps/s $\color{#35bf28}+1.55\%$
test_compile_copy_nested[pytree-compile] 0.1250ms 75.3266μs 13.2755 KOps/s 13.3163 KOps/s $\color{#d91a1a}-0.31\%$
test_compile_copy_nested[pytree-eager] 0.1357ms 67.2917μs 14.8607 KOps/s 14.5351 KOps/s $\color{#35bf28}+2.24\%$
test_compile_add_one_flat[tensordict-compile] 0.2379ms 0.1739ms 5.7504 KOps/s 5.7786 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_add_one_flat[tensordict-eager] 0.4189ms 0.1902ms 5.2583 KOps/s 5.4181 KOps/s $\color{#d91a1a}-2.95\%$
test_compile_add_one_flat[tensorclass-compile] 0.1054ms 47.7872μs 20.9261 KOps/s 21.6364 KOps/s $\color{#d91a1a}-3.28\%$
test_compile_add_one_flat[tensorclass-eager] 0.1509ms 68.2788μs 14.6458 KOps/s 13.9306 KOps/s $\textbf{\color{#35bf28}+5.13\%}$
test_compile_add_one_flat[pytree-compile] 0.3360ms 0.1731ms 5.7783 KOps/s 5.7399 KOps/s $\color{#35bf28}+0.67\%$
test_compile_add_one_flat[pytree-eager] 0.5858ms 0.2874ms 3.4790 KOps/s 3.4738 KOps/s $\color{#35bf28}+0.15\%$
test_compile_add_self_flat[tensordict-eager] 0.4320ms 0.2070ms 4.8300 KOps/s 4.9238 KOps/s $\color{#d91a1a}-1.90\%$
test_compile_add_self_flat[tensordict-compile] 0.2941ms 0.1747ms 5.7245 KOps/s 5.7158 KOps/s $\color{#35bf28}+0.15\%$
test_compile_add_self_flat[tensorclass-eager] 0.1418ms 62.5742μs 15.9810 KOps/s 16.3850 KOps/s $\color{#d91a1a}-2.47\%$
test_compile_add_self_flat[tensorclass-compile] 0.1373ms 49.4281μs 20.2314 KOps/s 20.7620 KOps/s $\color{#d91a1a}-2.56\%$
test_compile_add_self_flat[pytree-eager] 0.3521ms 0.2355ms 4.2469 KOps/s 4.3147 KOps/s $\color{#d91a1a}-1.57\%$
test_compile_add_self_flat[pytree-compile] 0.3324ms 0.1759ms 5.6859 KOps/s 5.7417 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_copy_flat[tensordict-compile] 0.2015ms 0.1022ms 9.7831 KOps/s 9.7824 KOps/s $+0.01\%$
test_compile_copy_flat[tensordict-eager] 0.1220ms 57.2601μs 17.4642 KOps/s 17.5926 KOps/s $\color{#d91a1a}-0.73\%$
test_compile_copy_flat[pytree-compile] 0.1667ms 78.3834μs 12.7578 KOps/s 12.9804 KOps/s $\color{#d91a1a}-1.71\%$
test_compile_copy_flat[pytree-eager] 0.1491ms 70.4844μs 14.1875 KOps/s 14.5183 KOps/s $\color{#d91a1a}-2.28\%$
test_compile_assign_and_add[tensordict-compile] 0.3119ms 0.1947ms 5.1353 KOps/s 5.1127 KOps/s $\color{#35bf28}+0.44\%$
test_compile_assign_and_add[tensordict-eager] 2.9035ms 1.6716ms 598.2164 Ops/s 620.7128 Ops/s $\color{#d91a1a}-3.62\%$
test_compile_assign_and_add[pytree-compile] 0.2674ms 0.1933ms 5.1729 KOps/s 5.1863 KOps/s $\color{#d91a1a}-0.26\%$
test_compile_assign_and_add[pytree-eager] 1.8682ms 1.1010ms 908.2541 Ops/s 920.9341 Ops/s $\color{#d91a1a}-1.38\%$
test_compile_assign_and_add_stack[compile] 0.5242ms 0.4190ms 2.3864 KOps/s 2.3692 KOps/s $\color{#35bf28}+0.73\%$
test_compile_assign_and_add_stack[eager] 5.8409ms 3.8134ms 262.2363 Ops/s 263.7058 Ops/s $\color{#d91a1a}-0.56\%$
test_compile_indexing[tensor-tensordict-compile] 0.1017ms 36.2895μs 27.5562 KOps/s 29.2975 KOps/s $\textbf{\color{#d91a1a}-5.94\%}$
test_compile_indexing[tensor-tensordict-eager] 0.9444ms 47.6049μs 21.0062 KOps/s 21.2079 KOps/s $\color{#d91a1a}-0.95\%$
test_compile_indexing[tensor-tensorclass-compile] 81.3620μs 30.8010μs 32.4665 KOps/s 34.6148 KOps/s $\textbf{\color{#d91a1a}-6.21\%}$
test_compile_indexing[tensor-tensorclass-eager] 95.2790μs 28.3546μs 35.2676 KOps/s 35.4436 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_indexing[tensor-pytree-compile] 79.7800μs 29.5814μs 33.8050 KOps/s 34.4073 KOps/s $\color{#d91a1a}-1.75\%$
test_compile_indexing[tensor-pytree-eager] 78.8280μs 28.0214μs 35.6870 KOps/s 36.3650 KOps/s $\color{#d91a1a}-1.86\%$
test_compile_indexing[slice-tensordict-compile] 0.1653ms 74.4563μs 13.4307 KOps/s 13.7149 KOps/s $\color{#d91a1a}-2.07\%$
test_compile_indexing[slice-tensordict-eager] 0.5135ms 27.9799μs 35.7399 KOps/s 37.2837 KOps/s $\color{#d91a1a}-4.14\%$
test_compile_indexing[slice-tensorclass-compile] 0.1428ms 68.3523μs 14.6301 KOps/s 14.9338 KOps/s $\color{#d91a1a}-2.03\%$
test_compile_indexing[slice-tensorclass-eager] 64.6410μs 23.2476μs 43.0151 KOps/s 43.8955 KOps/s $\color{#d91a1a}-2.01\%$
test_compile_indexing[slice-pytree-compile] 0.1475ms 67.7079μs 14.7693 KOps/s 14.8495 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_indexing[slice-pytree-eager] 84.5680μs 23.2714μs 42.9712 KOps/s 44.5776 KOps/s $\color{#d91a1a}-3.60\%$
test_compile_indexing[int-tensordict-compile] 0.1472ms 74.1242μs 13.4909 KOps/s 13.8829 KOps/s $\color{#d91a1a}-2.82\%$
test_compile_indexing[int-tensordict-eager] 0.9792ms 27.4392μs 36.4443 KOps/s 38.1500 KOps/s $\color{#d91a1a}-4.47\%$
test_compile_indexing[int-tensorclass-compile] 0.1440ms 68.2227μs 14.6579 KOps/s 14.9970 KOps/s $\color{#d91a1a}-2.26\%$
test_compile_indexing[int-tensorclass-eager] 77.0350μs 22.7952μs 43.8688 KOps/s 45.0027 KOps/s $\color{#d91a1a}-2.52\%$
test_compile_indexing[int-pytree-compile] 0.1301ms 67.8075μs 14.7476 KOps/s 14.9349 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_indexing[int-pytree-eager] 62.5480μs 23.2819μs 42.9518 KOps/s 44.8385 KOps/s $\color{#d91a1a}-4.21\%$
test_mod_add[eager] 75.7930μs 24.0139μs 41.6425 KOps/s 42.7124 KOps/s $\color{#d91a1a}-2.50\%$
test_mod_add[compile] 0.1008ms 38.7651μs 25.7964 KOps/s 26.3698 KOps/s $\color{#d91a1a}-2.17\%$
test_mod_add[compile-overhead] 91.0810μs 38.8575μs 25.7351 KOps/s 26.0766 KOps/s $\color{#d91a1a}-1.31\%$
test_mod_wrap[eager] 0.4429ms 0.2079ms 4.8104 KOps/s 4.9358 KOps/s $\color{#d91a1a}-2.54\%$
test_mod_wrap[compile] 0.4581ms 0.2344ms 4.2655 KOps/s 4.3054 KOps/s $\color{#d91a1a}-0.93\%$
test_mod_wrap[compile-overhead] 0.3418ms 0.2309ms 4.3308 KOps/s 4.3035 KOps/s $\color{#35bf28}+0.63\%$
test_mod_wrap_and_backward[eager] 12.2713ms 10.6950ms 93.5019 Ops/s 92.5880 Ops/s $\color{#35bf28}+0.99\%$
test_mod_wrap_and_backward[compile] 11.6035ms 10.7727ms 92.8272 Ops/s 93.6174 Ops/s $\color{#d91a1a}-0.84\%$
test_mod_wrap_and_backward[compile-overhead] 12.6803ms 10.7957ms 92.6295 Ops/s 87.2737 Ops/s $\textbf{\color{#35bf28}+6.14\%}$
test_seq_add[eager] 0.1665ms 88.6449μs 11.2810 KOps/s 11.6040 KOps/s $\color{#d91a1a}-2.78\%$
test_seq_add[compile] 0.1417ms 65.6393μs 15.2348 KOps/s 15.8737 KOps/s $\color{#d91a1a}-4.03\%$
test_seq_add[compile-overhead] 0.1378ms 63.7166μs 15.6945 KOps/s 15.7554 KOps/s $\color{#d91a1a}-0.39\%$
test_seq_wrap[eager] 0.6874ms 0.3788ms 2.6397 KOps/s 2.6980 KOps/s $\color{#d91a1a}-2.16\%$
test_seq_wrap[compile] 1.3432ms 0.2714ms 3.6850 KOps/s 3.6872 KOps/s $\color{#d91a1a}-0.06\%$
test_seq_wrap[compile-overhead] 1.2683ms 0.2679ms 3.7331 KOps/s 3.6760 KOps/s $\color{#35bf28}+1.55\%$
test_func_call_runtime[False-eager] 1.1731ms 0.5180ms 1.9303 KOps/s 1.9782 KOps/s $\color{#d91a1a}-2.42\%$
test_func_call_runtime[False-compile] 0.6940ms 0.5009ms 1.9965 KOps/s 2.0122 KOps/s $\color{#d91a1a}-0.78\%$
test_func_call_runtime[False-compile-overhead] 0.9481ms 0.5018ms 1.9929 KOps/s 2.0027 KOps/s $\color{#d91a1a}-0.49\%$
test_func_call_runtime[True-eager] 1.0545ms 0.7425ms 1.3469 KOps/s 1.3789 KOps/s $\color{#d91a1a}-2.32\%$
test_func_call_runtime[True-compile] 0.7843ms 0.5149ms 1.9423 KOps/s 1.9605 KOps/s $\color{#d91a1a}-0.93\%$
test_func_call_runtime[True-compile-overhead] 0.9108ms 0.5140ms 1.9455 KOps/s 1.9474 KOps/s $\color{#d91a1a}-0.10\%$
test_func_call_cm_runtime[False-eager] 0.6257ms 0.5055ms 1.9784 KOps/s 2.0186 KOps/s $\color{#d91a1a}-1.99\%$
test_func_call_cm_runtime[False-compile] 0.6120ms 0.5022ms 1.9914 KOps/s 1.9979 KOps/s $\color{#d91a1a}-0.33\%$
test_func_call_cm_runtime[False-compile-overhead] 0.9074ms 0.5003ms 1.9988 KOps/s 1.9817 KOps/s $\color{#35bf28}+0.86\%$
test_func_call_cm_runtime[True-eager] 1.1836ms 0.8527ms 1.1728 KOps/s 1.1882 KOps/s $\color{#d91a1a}-1.30\%$
test_func_call_cm_runtime[True-compile] 1.2212ms 0.7398ms 1.3516 KOps/s 1.3636 KOps/s $\color{#d91a1a}-0.88\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8507ms 0.7379ms 1.3551 KOps/s 1.3691 KOps/s $\color{#d91a1a}-1.02\%$
test_vmap_func_call_cm_runtime[eager] 2.4606ms 1.8739ms 533.6467 Ops/s 540.5956 Ops/s $\color{#d91a1a}-1.29\%$
test_vmap_func_call_cm_runtime[compile] 3.1423ms 1.9268ms 518.9826 Ops/s 532.1211 Ops/s $\color{#d91a1a}-2.47\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.6259ms 1.9123ms 522.9173 Ops/s 531.4000 Ops/s $\color{#d91a1a}-1.60\%$
test_distributed 0.2486ms 0.1240ms 8.0641 KOps/s 7.9505 KOps/s $\color{#35bf28}+1.43\%$
test_tdmodule 68.6090μs 16.3478μs 61.1703 KOps/s 58.7542 KOps/s $\color{#35bf28}+4.11\%$
test_tdmodule_dispatch 52.5290μs 33.8422μs 29.5489 KOps/s 27.5501 KOps/s $\textbf{\color{#35bf28}+7.26\%}$
test_tdseq 35.4260μs 19.5187μs 51.2328 KOps/s 48.8280 KOps/s $\color{#35bf28}+4.93\%$
test_tdseq_dispatch 74.7600μs 39.7569μs 25.1529 KOps/s 24.5698 KOps/s $\color{#35bf28}+2.37\%$
test_instantiation_functorch 1.7537ms 1.5367ms 650.7400 Ops/s 629.1895 Ops/s $\color{#35bf28}+3.43\%$
test_instantiation_td 1.7789ms 1.1467ms 872.0914 Ops/s 856.1449 Ops/s $\color{#35bf28}+1.86\%$
test_exec_functorch 0.3234ms 0.1834ms 5.4512 KOps/s 5.4747 KOps/s $\color{#d91a1a}-0.43\%$
test_exec_functional_call 0.3242ms 0.1705ms 5.8634 KOps/s 5.7526 KOps/s $\color{#35bf28}+1.93\%$
test_exec_td 0.2282ms 0.1652ms 6.0534 KOps/s 6.1184 KOps/s $\color{#d91a1a}-1.06\%$
test_exec_td_decorator 0.5304ms 0.2185ms 4.5769 KOps/s 4.5922 KOps/s $\color{#d91a1a}-0.33\%$
test_vmap_mlp_speed[True-True] 0.8112ms 0.6415ms 1.5589 KOps/s 1.5824 KOps/s $\color{#d91a1a}-1.48\%$
test_vmap_mlp_speed[True-False] 0.9295ms 0.6402ms 1.5621 KOps/s 1.5896 KOps/s $\color{#d91a1a}-1.73\%$
test_vmap_mlp_speed[False-True] 0.5947ms 0.4957ms 2.0173 KOps/s 2.0494 KOps/s $\color{#d91a1a}-1.56\%$
test_vmap_mlp_speed[False-False] 0.6404ms 0.4994ms 2.0024 KOps/s 2.0526 KOps/s $\color{#d91a1a}-2.44\%$
test_vmap_mlp_speed_decorator[True-True] 1.2810ms 0.6255ms 1.5986 KOps/s 1.6368 KOps/s $\color{#d91a1a}-2.33\%$
test_vmap_mlp_speed_decorator[True-False] 0.8506ms 0.6248ms 1.6004 KOps/s 1.6316 KOps/s $\color{#d91a1a}-1.91\%$
test_vmap_mlp_speed_decorator[False-True] 0.9170ms 0.5174ms 1.9327 KOps/s 1.9902 KOps/s $\color{#d91a1a}-2.89\%$
test_vmap_mlp_speed_decorator[False-False] 0.7535ms 0.5174ms 1.9326 KOps/s 1.9869 KOps/s $\color{#d91a1a}-2.73\%$
test_to_module_speed[True] 1.4744ms 1.2706ms 787.0093 Ops/s 779.1970 Ops/s $\color{#35bf28}+1.00\%$
test_to_module_speed[False] 1.7253ms 1.2453ms 802.9993 Ops/s 797.9852 Ops/s $\color{#35bf28}+0.63\%$
test_tc_init 77.4350μs 42.7406μs 23.3970 KOps/s 22.8959 KOps/s $\color{#35bf28}+2.19\%$
test_tc_init_nested 0.1637ms 83.6571μs 11.9536 KOps/s 11.3049 KOps/s $\textbf{\color{#35bf28}+5.74\%}$
test_tc_first_layer_tensor 17.6530μs 1.5743μs 635.2119 KOps/s 670.2046 KOps/s $\textbf{\color{#d91a1a}-5.22\%}$
test_tc_first_layer_nontensor 27.6410μs 4.7234μs 211.7140 KOps/s 214.7977 KOps/s $\color{#d91a1a}-1.44\%$
test_tc_second_layer_tensor 28.3430μs 2.8386μs 352.2870 KOps/s 360.6663 KOps/s $\color{#d91a1a}-2.32\%$
test_tc_second_layer_nontensor 30.1560μs 6.0971μs 164.0134 KOps/s 169.6579 KOps/s $\color{#d91a1a}-3.33\%$
test_unbind 0.4697s 13.0217ms 76.7949 Ops/s 61.0307 Ops/s $\textbf{\color{#35bf28}+25.83\%}$
test_full_like 8.2763ms 6.9762ms 143.3447 Ops/s 143.0847 Ops/s $\color{#35bf28}+0.18\%$
test_zeros_like 3.0897ms 2.6976ms 370.7015 Ops/s 154.0677 Ops/s $\textbf{\color{#35bf28}+140.61\%}$
test_ones_like 3.5594ms 3.1534ms 317.1162 Ops/s 132.6648 Ops/s $\textbf{\color{#35bf28}+139.04\%}$
test_clone 5.6861ms 5.1173ms 195.4159 Ops/s 108.3076 Ops/s $\textbf{\color{#35bf28}+80.43\%}$
test_squeeze 59.7620μs 12.5986μs 79.3738 KOps/s 83.3463 KOps/s $\color{#d91a1a}-4.77\%$
test_unsqueeze 0.1548ms 91.4849μs 10.9308 KOps/s 11.1006 KOps/s $\color{#d91a1a}-1.53\%$
test_split 0.5841ms 0.1929ms 5.1836 KOps/s 5.1563 KOps/s $\color{#35bf28}+0.53\%$
test_permute 0.4502ms 0.2185ms 4.5764 KOps/s 4.6533 KOps/s $\color{#d91a1a}-1.65\%$
test_stack 32.5944ms 26.5675ms 37.6399 Ops/s 39.3647 Ops/s $\color{#d91a1a}-4.38\%$
test_cat 31.4787ms 26.1053ms 38.3064 Ops/s 38.9534 Ops/s $\color{#d91a1a}-1.66\%$
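
For readers checking the table by hand: the Change column is consistent with the relative difference between Ops and Ops on Repo HEAD, and the Improved/Worsened totals match the number of bold rows, all of which have a magnitude of at least 5%. The sketch below reproduces that arithmetic for three rows from the table. The function names and the 5% threshold are assumptions inferred from the numbers shown here, not taken from the benchmark workflow itself.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to the repo HEAD measurement."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is inferred from the bold rows (assumption)."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) copied from the CPU table; both values share a unit.
rows = {
    "test_plain_set_nested": (50.6629, 49.4665),  # KOps/s
    "test_lock_nested": (1.7392, 2.0848),         # KOps/s
    "test_zeros_like": (370.7015, 154.0677),      # Ops/s
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Rows that mix units (for example Ops/s against KOps/s) need to be converted to a common unit before applying the formula.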

github-actions bot commented Aug 9, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}36$. Worsened: $\large\color{#d91a1a}5$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change)
test_plain_set_nested 0.1250ms 13.2364μs 75.5492 KOps/s 65.8566 KOps/s $\textbf{\color{#35bf28}+14.72\%}$
test_plain_set_stack_nested 73.3020μs 13.2481μs 75.4826 KOps/s 66.7855 KOps/s $\textbf{\color{#35bf28}+13.02\%}$
test_plain_set_nested_inplace 42.7810μs 14.3277μs 69.7951 KOps/s 61.3319 KOps/s $\textbf{\color{#35bf28}+13.80\%}$
test_plain_set_stack_nested_inplace 47.4410μs 14.3021μs 69.9200 KOps/s 61.9321 KOps/s $\textbf{\color{#35bf28}+12.90\%}$
test_items 23.5900μs 2.8730μs 348.0636 KOps/s 342.9892 KOps/s $\color{#35bf28}+1.48\%$
test_items_nested 0.3817ms 0.3234ms 3.0917 KOps/s 3.0501 KOps/s $\color{#35bf28}+1.37\%$
test_items_nested_locked 0.3604ms 0.3271ms 3.0569 KOps/s 3.0550 KOps/s $\color{#35bf28}+0.06\%$
test_items_nested_leaf 94.9020μs 55.5071μs 18.0157 KOps/s 17.9417 KOps/s $\color{#35bf28}+0.41\%$
test_items_stack_nested 0.3662ms 0.3245ms 3.0816 KOps/s 3.0634 KOps/s $\color{#35bf28}+0.60\%$
test_items_stack_nested_leaf 0.1737ms 57.6431μs 17.3481 KOps/s 17.6028 KOps/s $\color{#d91a1a}-1.45\%$
test_items_stack_nested_locked 0.4028ms 0.3237ms 3.0888 KOps/s 3.0529 KOps/s $\color{#35bf28}+1.18\%$
test_keys 23.9000μs 3.4331μs 291.2834 KOps/s 292.4002 KOps/s $\color{#d91a1a}-0.38\%$
test_keys_nested 89.6620μs 55.8858μs 17.8936 KOps/s 17.7629 KOps/s $\color{#35bf28}+0.74\%$
test_keys_nested_locked 2.5901ms 61.8303μs 16.1733 KOps/s 16.0611 KOps/s $\color{#35bf28}+0.70\%$
test_keys_nested_leaf 75.7120μs 47.2920μs 21.1452 KOps/s 20.9677 KOps/s $\color{#35bf28}+0.85\%$
test_keys_stack_nested 80.9020μs 55.9039μs 17.8878 KOps/s 17.8435 KOps/s $\color{#35bf28}+0.25\%$
test_keys_stack_nested_leaf 0.2127ms 48.0721μs 20.8021 KOps/s 20.7951 KOps/s $\color{#35bf28}+0.03\%$
test_keys_stack_nested_locked 86.5620μs 61.2300μs 16.3319 KOps/s 16.4103 KOps/s $\color{#d91a1a}-0.48\%$
test_values 19.7835μs 0.8422μs 1.1874 MOps/s 1.1823 MOps/s $\color{#35bf28}+0.43\%$
test_values_nested 67.1010μs 40.6904μs 24.5758 KOps/s 24.5629 KOps/s $\color{#35bf28}+0.05\%$
test_values_nested_locked 75.2520μs 42.5573μs 23.4977 KOps/s 23.4911 KOps/s $\color{#35bf28}+0.03\%$
test_values_nested_leaf 99.6120μs 34.9466μs 28.6151 KOps/s 28.2941 KOps/s $\color{#35bf28}+1.13\%$
test_values_stack_nested 97.0120μs 41.3036μs 24.2110 KOps/s 24.0815 KOps/s $\color{#35bf28}+0.54\%$
test_values_stack_nested_leaf 60.6020μs 35.6004μs 28.0896 KOps/s 28.0298 KOps/s $\color{#35bf28}+0.21\%$
test_values_stack_nested_locked 0.1646ms 43.2372μs 23.1283 KOps/s 22.8627 KOps/s $\color{#35bf28}+1.16\%$
test_membership 1.6251μs 0.5181μs 1.9301 MOps/s 1.9659 MOps/s $\color{#d91a1a}-1.82\%$
test_membership_nested 16.5055μs 1.8679μs 535.3643 KOps/s 533.1051 KOps/s $\color{#35bf28}+0.42\%$
test_membership_nested_leaf 11.5237μs 1.8428μs 542.6513 KOps/s 554.1728 KOps/s $\color{#d91a1a}-2.08\%$
test_membership_stacked_nested 30.1300μs 1.8849μs 530.5445 KOps/s 524.7403 KOps/s $\color{#35bf28}+1.11\%$
test_membership_stacked_nested_leaf 22.5110μs 1.8918μs 528.5969 KOps/s 524.7917 KOps/s $\color{#35bf28}+0.73\%$
test_membership_nested_last 23.3910μs 2.7692μs 361.1140 KOps/s 361.7588 KOps/s $\color{#d91a1a}-0.18\%$
test_membership_nested_leaf_last 27.2500μs 2.7463μs 364.1267 KOps/s 356.9117 KOps/s $\color{#35bf28}+2.02\%$
test_membership_stacked_nested_last 33.3810μs 4.1650μs 240.0958 KOps/s 126.9178 KOps/s $\textbf{\color{#35bf28}+89.17\%}$
test_membership_stacked_nested_leaf_last 32.4810μs 4.1661μs 240.0343 KOps/s 127.6834 KOps/s $\textbf{\color{#35bf28}+87.99\%}$
test_nested_getleaf 91.3920μs 6.1048μs 163.8044 KOps/s 163.6844 KOps/s $\color{#35bf28}+0.07\%$
test_nested_get 36.3110μs 5.6836μs 175.9443 KOps/s 172.2945 KOps/s $\color{#35bf28}+2.12\%$
test_stacked_getleaf 47.4410μs 6.0234μs 166.0193 KOps/s 165.0052 KOps/s $\color{#35bf28}+0.61\%$
test_stacked_get 37.4210μs 5.7073μs 175.2135 KOps/s 175.4099 KOps/s $\color{#d91a1a}-0.11\%$
test_nested_getitemleaf 31.5210μs 6.1418μs 162.8196 KOps/s 162.4786 KOps/s $\color{#35bf28}+0.21\%$
test_nested_getitem 30.6400μs 5.7451μs 174.0610 KOps/s 172.2221 KOps/s $\color{#35bf28}+1.07\%$
test_stacked_getitemleaf 30.7910μs 6.1060μs 163.7745 KOps/s 163.2915 KOps/s $\color{#35bf28}+0.30\%$
test_stacked_getitem 35.7210μs 5.8256μs 171.6557 KOps/s 175.7970 KOps/s $\color{#d91a1a}-2.36\%$
test_lock_nested 4.6245ms 0.4189ms 2.3871 KOps/s 2.3990 KOps/s $\color{#d91a1a}-0.50\%$
test_lock_stack_nested 0.4085ms 0.3776ms 2.6481 KOps/s 2.7039 KOps/s $\color{#d91a1a}-2.06\%$
test_unlock_nested 0.7467ms 0.3541ms 2.8240 KOps/s 2.8256 KOps/s $\color{#d91a1a}-0.06\%$
test_unlock_stack_nested 0.4169ms 0.3163ms 3.1618 KOps/s 3.2479 KOps/s $\color{#d91a1a}-2.65\%$
test_flatten_speed 0.1005ms 68.7900μs 14.5370 KOps/s 14.1737 KOps/s $\color{#35bf28}+2.56\%$
test_unflatten_speed 0.3145ms 0.2741ms 3.6489 KOps/s 3.5974 KOps/s $\color{#35bf28}+1.43\%$
test_common_ops 1.5053ms 1.1991ms 833.9594 Ops/s 774.8681 Ops/s $\textbf{\color{#35bf28}+7.63\%}$
test_creation 20.9000μs 1.4828μs 674.4181 KOps/s 669.0362 KOps/s $\color{#35bf28}+0.80\%$
test_creation_empty 55.2010μs 14.1384μs 70.7295 KOps/s 57.5185 KOps/s $\textbf{\color{#35bf28}+22.97\%}$
test_creation_nested_1 51.1320μs 15.8475μs 63.1016 KOps/s 52.2314 KOps/s $\textbf{\color{#35bf28}+20.81\%}$
test_creation_nested_2 46.2910μs 18.9768μs 52.6960 KOps/s 46.0618 KOps/s $\textbf{\color{#35bf28}+14.40\%}$
test_clone 1.2965ms 29.4044μs 34.0085 KOps/s 34.8103 KOps/s $\color{#d91a1a}-2.30\%$
test_getitem[int] 94.8104ms 23.2878μs 42.9410 KOps/s 62.8846 KOps/s $\textbf{\color{#d91a1a}-31.71\%}$
test_getitem[slice_int] 0.1265ms 27.3390μs 36.5778 KOps/s 34.5945 KOps/s $\textbf{\color{#35bf28}+5.73\%}$
test_getitem[range] 0.2244ms 0.1088ms 9.1938 KOps/s 9.1292 KOps/s $\color{#35bf28}+0.71\%$
test_getitem[tuple] 0.1709ms 23.5668μs 42.4326 KOps/s 42.2708 KOps/s $\color{#35bf28}+0.38\%$
test_getitem[list] 0.2714ms 97.6893μs 10.2365 KOps/s 10.1830 KOps/s $\color{#35bf28}+0.53\%$
test_setitem_dim[int] 0.1866ms 44.2963μs 22.5752 KOps/s 22.5472 KOps/s $\color{#35bf28}+0.12\%$
test_setitem_dim[slice_int] 0.2099ms 67.0308μs 14.9185 KOps/s 14.6755 KOps/s $\color{#35bf28}+1.66\%$
test_setitem_dim[range] 0.1610ms 0.1265ms 7.9080 KOps/s 7.5296 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_setitem_dim[tuple] 0.1767ms 60.1995μs 16.6114 KOps/s 15.7025 KOps/s $\textbf{\color{#35bf28}+5.79\%}$
test_setitem 0.1915ms 41.1951μs 24.2747 KOps/s 22.8983 KOps/s $\textbf{\color{#35bf28}+6.01\%}$
test_set 0.1922ms 39.4272μs 25.3632 KOps/s 23.7530 KOps/s $\textbf{\color{#35bf28}+6.78\%}$
test_set_shared 0.3456ms 49.8045μs 20.0785 KOps/s 19.6626 KOps/s $\color{#35bf28}+2.12\%$
test_update 0.2031ms 47.7122μs 20.9590 KOps/s 18.9553 KOps/s $\textbf{\color{#35bf28}+10.57\%}$
test_update_nested 0.2286ms 56.1285μs 17.8163 KOps/s 17.0954 KOps/s $\color{#35bf28}+4.22\%$
test_update__nested 0.2501ms 61.4439μs 16.2750 KOps/s 17.2737 KOps/s $\textbf{\color{#d91a1a}-5.78\%}$
test_set_nested 0.2234ms 41.3900μs 24.1604 KOps/s 22.6466 KOps/s $\textbf{\color{#35bf28}+6.68\%}$
test_set_nested_new 0.1943ms 45.0798μs 22.1829 KOps/s 20.9633 KOps/s $\textbf{\color{#35bf28}+5.82\%}$
test_select 0.2117ms 57.8807μs 17.2769 KOps/s 16.3944 KOps/s $\textbf{\color{#35bf28}+5.38\%}$
test_select_nested 0.5079ms 42.8252μs 23.3508 KOps/s 23.8097 KOps/s $\color{#d91a1a}-1.93\%$
test_exclude_nested 91.0520μs 58.1747μs 17.1896 KOps/s 17.2558 KOps/s $\color{#d91a1a}-0.38\%$
test_empty[True] 0.3533ms 0.2390ms 4.1849 KOps/s 4.1660 KOps/s $\color{#35bf28}+0.45\%$
test_empty[False] 3.4011μs 0.7538μs 1.3266 MOps/s 1.3161 MOps/s $\color{#35bf28}+0.79\%$
test_to 0.1310ms 25.1156μs 39.8160 KOps/s 40.2259 KOps/s $\color{#d91a1a}-1.02\%$
test_to_nonblocking 0.1390ms 23.7404μs 42.1222 KOps/s 40.8327 KOps/s $\color{#35bf28}+3.16\%$
test_unbind_speed 0.3280ms 0.2797ms 3.5759 KOps/s 3.5676 KOps/s $\color{#35bf28}+0.23\%$
test_unbind_speed_stack0 0.3154ms 0.2740ms 3.6498 KOps/s 3.7407 KOps/s $\color{#d91a1a}-2.43\%$
test_unbind_speed_stack1 94.5170ms 0.7155ms 1.3976 KOps/s 1.4106 KOps/s $\color{#d91a1a}-0.92\%$
test_split 95.4272ms 2.1688ms 461.0813 Ops/s 454.9413 Ops/s $\color{#35bf28}+1.35\%$
test_chunk 96.5023ms 2.1590ms 463.1668 Ops/s 457.0549 Ops/s $\color{#35bf28}+1.34\%$
test_creation[device0] 0.3449ms 0.1244ms 8.0409 KOps/s 7.9992 KOps/s $\color{#35bf28}+0.52\%$
test_creation_from_tensor 0.4003ms 0.1275ms 7.8423 KOps/s 7.8856 KOps/s $\color{#d91a1a}-0.55\%$
test_add_one[memmap_tensor0] 0.1430ms 8.6606μs 115.4648 KOps/s 117.1629 KOps/s $\color{#d91a1a}-1.45\%$
test_contiguous[memmap_tensor0] 38.7410μs 2.1848μs 457.7039 KOps/s 464.0940 KOps/s $\color{#d91a1a}-1.38\%$
test_stack[memmap_tensor0] 0.1029ms 7.0079μs 142.6956 KOps/s 152.0219 KOps/s $\textbf{\color{#d91a1a}-6.13\%}$
test_memmaptd_index 1.0871ms 0.4209ms 2.3757 KOps/s 2.3867 KOps/s $\color{#d91a1a}-0.46\%$
test_memmaptd_index_astensor 0.9845ms 0.4659ms 2.1462 KOps/s 2.1123 KOps/s $\color{#35bf28}+1.60\%$
test_memmaptd_index_op 1.3719ms 0.9884ms 1.0118 KOps/s 962.4422 Ops/s $\textbf{\color{#35bf28}+5.13\%}$
test_serialize_model 0.1309s 0.1302s 7.6822 Ops/s 7.7465 Ops/s $\color{#d91a1a}-0.83\%$
test_serialize_model_pickle 1.3468s 1.2121s 0.8250 Ops/s 0.8244 Ops/s $\color{#35bf28}+0.07\%$
test_serialize_weights 0.1300s 0.1292s 7.7373 Ops/s 7.7597 Ops/s $\color{#d91a1a}-0.29\%$
test_serialize_weights_returnearly 0.2439s 62.6031ms 15.9737 Ops/s 16.1307 Ops/s $\color{#d91a1a}-0.97\%$
test_serialize_weights_pickle 1.3824s 1.2180s 0.8210 Ops/s 0.8217 Ops/s $\color{#d91a1a}-0.09\%$
test_reshape_pytree 0.1229ms 35.1603μs 28.4412 KOps/s 27.9174 KOps/s $\color{#35bf28}+1.88\%$
test_reshape_td 0.1295ms 41.4474μs 24.1270 KOps/s 24.3349 KOps/s $\color{#d91a1a}-0.85\%$
test_view_pytree 0.1785ms 34.6882μs 28.8282 KOps/s 28.1085 KOps/s $\color{#35bf28}+2.56\%$
test_view_td 0.1760ms 46.3988μs 21.5523 KOps/s 22.0384 KOps/s $\color{#d91a1a}-2.21\%$
test_unbind_pytree 0.1872ms 34.2824μs 29.1695 KOps/s 28.5954 KOps/s $\color{#35bf28}+2.01\%$
test_unbind_td 0.3815ms 42.9875μs 23.2626 KOps/s 23.2781 KOps/s $\color{#d91a1a}-0.07\%$
test_split_pytree 0.1000ms 46.1063μs 21.6890 KOps/s 21.7390 KOps/s $\color{#d91a1a}-0.23\%$
test_split_td 0.7066ms 56.1331μs 17.8148 KOps/s 17.2616 KOps/s $\color{#35bf28}+3.20\%$
test_add_pytree 0.2097ms 56.1150μs 17.8206 KOps/s 17.4201 KOps/s $\color{#35bf28}+2.30\%$
test_add_td 0.2359ms 86.1236μs 11.6112 KOps/s 10.3555 KOps/s $\textbf{\color{#35bf28}+12.13\%}$
test_compile_add_one_nested[tensordict-compile] 0.4031ms 0.2081ms 4.8055 KOps/s 4.7375 KOps/s $\color{#35bf28}+1.43\%$
test_compile_add_one_nested[tensordict-eager] 0.3000ms 0.1482ms 6.7494 KOps/s 6.6841 KOps/s $\color{#35bf28}+0.98\%$
test_compile_add_one_nested[pytree-compile] 0.2967ms 0.1444ms 6.9247 KOps/s 6.9577 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_add_one_nested[pytree-eager] 0.5820ms 0.1828ms 5.4697 KOps/s 5.4090 KOps/s $\color{#35bf28}+1.12\%$
test_compile_copy_nested[tensordict-compile] 0.4230ms 21.2207μs 47.1238 KOps/s 47.2906 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_copy_nested[tensordict-eager] 0.1208ms 43.3247μs 23.0815 KOps/s 22.9613 KOps/s $\color{#35bf28}+0.52\%$
test_compile_copy_nested[pytree-compile] 0.4573ms 64.3625μs 15.5370 KOps/s 15.5795 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_copy_nested[pytree-eager] 0.4734ms 50.2075μs 19.9173 KOps/s 19.8828 KOps/s $\color{#35bf28}+0.17\%$
test_compile_add_one_flat[tensordict-compile] 0.4888ms 0.3158ms 3.1669 KOps/s 3.1734 KOps/s $\color{#d91a1a}-0.21\%$
test_compile_add_one_flat[tensordict-eager] 0.6789ms 0.2069ms 4.8324 KOps/s 4.7850 KOps/s $\color{#35bf28}+0.99\%$
test_compile_add_one_flat[tensorclass-compile] 0.2803ms 0.1282ms 7.7986 KOps/s 7.8256 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_add_one_flat[tensorclass-eager] 0.5414ms 60.1752μs 16.6182 KOps/s 16.6567 KOps/s $\color{#d91a1a}-0.23\%$
test_compile_add_one_flat[pytree-compile] 0.4928ms 0.3160ms 3.1642 KOps/s 3.1791 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_add_one_flat[pytree-eager] 1.0634ms 0.6216ms 1.6088 KOps/s 1.5891 KOps/s $\color{#35bf28}+1.24\%$
test_compile_add_self_flat[tensordict-eager] 0.6474ms 0.2468ms 4.0525 KOps/s 4.0198 KOps/s $\color{#35bf28}+0.81\%$
test_compile_add_self_flat[tensordict-compile] 0.7909ms 0.3166ms 3.1589 KOps/s 3.1628 KOps/s $\color{#d91a1a}-0.12\%$
test_compile_add_self_flat[tensorclass-eager] 0.5038ms 69.7863μs 14.3294 KOps/s 14.2314 KOps/s $\color{#35bf28}+0.69\%$
test_compile_add_self_flat[tensorclass-compile] 0.2800ms 0.1290ms 7.7519 KOps/s 7.8231 KOps/s $\color{#d91a1a}-0.91\%$
test_compile_add_self_flat[pytree-eager] 0.9551ms 0.5305ms 1.8852 KOps/s 1.8680 KOps/s $\color{#35bf28}+0.92\%$
test_compile_add_self_flat[pytree-compile] 0.7690ms 0.3160ms 3.1646 KOps/s 3.1787 KOps/s $\color{#d91a1a}-0.44\%$
test_compile_copy_flat[tensordict-compile] 0.4452ms 18.3696μs 54.4379 KOps/s 55.7231 KOps/s $\color{#d91a1a}-2.31\%$
test_compile_copy_flat[tensordict-eager] 0.4885ms 27.2040μs 36.7593 KOps/s 36.8990 KOps/s $\color{#d91a1a}-0.38\%$
test_compile_copy_flat[pytree-compile] 0.4666ms 69.2668μs 14.4369 KOps/s 14.5015 KOps/s $\color{#d91a1a}-0.45\%$
test_compile_copy_flat[pytree-eager] 0.4751ms 50.6721μs 19.7347 KOps/s 19.4970 KOps/s $\color{#35bf28}+1.22\%$
test_compile_assign_and_add[tensordict-compile] 2.2880ms 0.7961ms 1.2562 KOps/s 1.1587 KOps/s $\textbf{\color{#35bf28}+8.42\%}$
test_compile_assign_and_add[tensordict-eager] 3.3780ms 3.1663ms 315.8258 Ops/s 315.8193 Ops/s $+0.00\%$
test_compile_assign_and_add[pytree-compile] 2.2628ms 0.7914ms 1.2637 KOps/s 1.1675 KOps/s $\textbf{\color{#35bf28}+8.23\%}$
test_compile_assign_and_add[pytree-eager] 3.8064ms 3.2262ms 309.9650 Ops/s 313.2393 Ops/s $\color{#d91a1a}-1.05\%$
test_compile_indexing[tensor-tensordict-compile] 0.2603ms 0.1094ms 9.1393 KOps/s 8.8513 KOps/s $\color{#35bf28}+3.25\%$
test_compile_indexing[tensor-tensordict-eager] 0.4991ms 62.7774μs 15.9293 KOps/s 15.5797 KOps/s $\color{#35bf28}+2.24\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2743ms 0.1074ms 9.3116 KOps/s 9.7533 KOps/s $\color{#d91a1a}-4.53\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2248ms 45.8154μs 21.8267 KOps/s 22.9725 KOps/s $\color{#d91a1a}-4.99\%$
test_compile_indexing[tensor-pytree-compile] 0.2938ms 0.1084ms 9.2288 KOps/s 9.6602 KOps/s $\color{#d91a1a}-4.47\%$
test_compile_indexing[tensor-pytree-eager] 0.2526ms 45.1751μs 22.1361 KOps/s 23.0284 KOps/s $\color{#d91a1a}-3.88\%$
test_compile_indexing[slice-tensordict-compile] 0.3523ms 0.1401ms 7.1373 KOps/s 7.3083 KOps/s $\color{#d91a1a}-2.34\%$
test_compile_indexing[slice-tensordict-eager] 0.2021ms 25.7267μs 38.8702 KOps/s 39.6723 KOps/s $\color{#d91a1a}-2.02\%$
test_compile_indexing[slice-tensorclass-compile] 0.6055ms 0.1308ms 7.6427 KOps/s 7.6497 KOps/s $\color{#d91a1a}-0.09\%$
test_compile_indexing[slice-tensorclass-eager] 0.4191ms 20.8558μs 47.9484 KOps/s 48.4398 KOps/s $\color{#d91a1a}-1.01\%$
test_compile_indexing[slice-pytree-compile] 0.5464ms 0.1356ms 7.3734 KOps/s 7.5845 KOps/s $\color{#d91a1a}-2.78\%$
test_compile_indexing[slice-pytree-eager] 0.4306ms 20.8162μs 48.0394 KOps/s 48.1452 KOps/s $\color{#d91a1a}-0.22\%$
test_compile_indexing[int-tensordict-compile] 0.2859ms 0.1387ms 7.2099 KOps/s 7.1625 KOps/s $\color{#35bf28}+0.66\%$
test_compile_indexing[int-tensordict-eager] 0.5119ms 24.9210μs 40.1268 KOps/s 40.6596 KOps/s $\color{#d91a1a}-1.31\%$
test_compile_indexing[int-tensorclass-compile] 0.5746ms 0.1315ms 7.6058 KOps/s 7.6093 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_indexing[int-tensorclass-eager] 0.4406ms 24.0815μs 41.5256 KOps/s 48.2875 KOps/s $\textbf{\color{#d91a1a}-14.00\%}$
test_compile_indexing[int-pytree-compile] 0.5237ms 0.1314ms 7.6098 KOps/s 7.5709 KOps/s $\color{#35bf28}+0.51\%$
test_compile_indexing[int-pytree-eager] 0.4159ms 20.7054μs 48.2966 KOps/s 48.7321 KOps/s $\color{#d91a1a}-0.89\%$
test_mod_add[eager] 0.4518ms 31.1907μs 32.0608 KOps/s 30.6893 KOps/s $\color{#35bf28}+4.47\%$
test_mod_add[compile] 0.5023ms 69.9057μs 14.3050 KOps/s 14.2486 KOps/s $\color{#35bf28}+0.40\%$
test_mod_add[compile-overhead] 0.2612ms 0.1354ms 7.3838 KOps/s 6.8628 KOps/s $\textbf{\color{#35bf28}+7.59\%}$
test_mod_wrap[eager] 0.6687ms 0.2346ms 4.2618 KOps/s 3.9115 KOps/s $\textbf{\color{#35bf28}+8.95\%}$
test_mod_wrap[compile] 0.5669ms 0.2907ms 3.4403 KOps/s 3.3696 KOps/s $\color{#35bf28}+2.10\%$
test_mod_wrap[compile-overhead] 7.4444ms 4.0440ms 247.2802 Ops/s 249.2020 Ops/s $\color{#d91a1a}-0.77\%$
test_mod_wrap_and_backward[eager] 1.5185ms 1.3405ms 746.0110 Ops/s 688.2731 Ops/s $\textbf{\color{#35bf28}+8.39\%}$
test_mod_wrap_and_backward[compile] 1.5269ms 1.3111ms 762.7236 Ops/s 698.7072 Ops/s $\textbf{\color{#35bf28}+9.16\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3703ms 0.9104ms 1.0984 KOps/s 991.2717 Ops/s $\textbf{\color{#35bf28}+10.81\%}$
test_seq_add[eager] 0.2585ms 94.5135μs 10.5805 KOps/s 10.1731 KOps/s $\color{#35bf28}+4.00\%$
test_seq_add[compile] 0.2527ms 81.8677μs 12.2148 KOps/s 12.6091 KOps/s $\color{#d91a1a}-3.13\%$
test_seq_add[compile-overhead] 0.2661ms 0.1154ms 8.6630 KOps/s 8.7997 KOps/s $\color{#d91a1a}-1.55\%$
test_seq_wrap[eager] 0.5371ms 0.3615ms 2.7664 KOps/s 2.5825 KOps/s $\textbf{\color{#35bf28}+7.12\%}$
test_seq_wrap[compile] 0.5111ms 0.3106ms 3.2193 KOps/s 3.1849 KOps/s $\color{#35bf28}+1.08\%$
test_seq_wrap[compile-overhead] 0.3944ms 0.2188ms 4.5704 KOps/s 4.5848 KOps/s $\color{#d91a1a}-0.32\%$
test_func_call_runtime[False-eager] 0.8889ms 0.7279ms 1.3739 KOps/s 1.3450 KOps/s $\color{#35bf28}+2.15\%$
test_func_call_runtime[False-compile] 0.9500ms 0.7864ms 1.2716 KOps/s 1.2662 KOps/s $\color{#35bf28}+0.42\%$
test_func_call_runtime[False-compile-overhead] 0.4806ms 0.3584ms 2.7905 KOps/s 2.7947 KOps/s $\color{#d91a1a}-0.15\%$
test_func_call_runtime[True-eager] 1.0571ms 0.8908ms 1.1226 KOps/s 1.1078 KOps/s $\color{#35bf28}+1.33\%$
test_func_call_runtime[True-compile] 0.9822ms 0.8266ms 1.2098 KOps/s 1.1931 KOps/s $\color{#35bf28}+1.41\%$
test_func_call_runtime[True-compile-overhead] 0.5343ms 0.3912ms 2.5560 KOps/s 2.5578 KOps/s $\color{#d91a1a}-0.07\%$
test_func_call_cm_runtime[False-eager] 0.8669ms 0.7181ms 1.3926 KOps/s 1.2806 KOps/s $\textbf{\color{#35bf28}+8.74\%}$
test_func_call_cm_runtime[False-compile] 1.0370ms 0.7855ms 1.2731 KOps/s 1.2558 KOps/s $\color{#35bf28}+1.38\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5075ms 0.3604ms 2.7750 KOps/s 2.7946 KOps/s $\color{#d91a1a}-0.70\%$
test_func_call_cm_runtime[True-eager] 1.1493ms 0.9900ms 1.0101 KOps/s 1.0026 KOps/s $\color{#35bf28}+0.76\%$
test_func_call_cm_runtime[True-compile] 1.0163ms 0.8513ms 1.1747 KOps/s 1.1698 KOps/s $\color{#35bf28}+0.41\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5630ms 0.4159ms 2.4043 KOps/s 2.3860 KOps/s $\color{#35bf28}+0.77\%$
test_vmap_func_call_cm_runtime[eager] 2.4852ms 2.0439ms 489.2607 Ops/s 483.3691 Ops/s $\color{#35bf28}+1.22\%$
test_vmap_func_call_cm_runtime[compile] 1.0468ms 0.8645ms 1.1568 KOps/s 1.1034 KOps/s $\color{#35bf28}+4.84\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5865ms 0.4230ms 2.3643 KOps/s 2.3802 KOps/s $\color{#d91a1a}-0.67\%$
test_distributed 2.8341ms 0.1795ms 5.5700 KOps/s 8.6588 KOps/s $\textbf{\color{#d91a1a}-35.67\%}$
test_tdmodule 24.5810μs 13.6059μs 73.4977 KOps/s 58.6015 KOps/s $\textbf{\color{#35bf28}+25.42\%}$
test_tdmodule_dispatch 0.1248ms 27.6294μs 36.1933 KOps/s 30.8807 KOps/s $\textbf{\color{#35bf28}+17.20\%}$
test_tdseq 33.6210μs 14.4020μs 69.4347 KOps/s 56.6757 KOps/s $\textbf{\color{#35bf28}+22.51\%}$
test_tdseq_dispatch 60.1510μs 29.3054μs 34.1234 KOps/s 28.0713 KOps/s $\textbf{\color{#35bf28}+21.56\%}$
test_instantiation_functorch 1.9682ms 1.8045ms 554.1852 Ops/s 538.2507 Ops/s $\color{#35bf28}+2.96\%$
test_instantiation_td 1.7655ms 1.1747ms 851.2808 Ops/s 842.6896 Ops/s $\color{#35bf28}+1.02\%$
test_exec_functorch 0.3555ms 0.2040ms 4.9018 KOps/s 4.8092 KOps/s $\color{#35bf28}+1.93\%$
test_exec_functional_call 0.3685ms 0.2062ms 4.8499 KOps/s 4.7966 KOps/s $\color{#35bf28}+1.11\%$
test_exec_td 0.3606ms 0.2076ms 4.8180 KOps/s 4.6778 KOps/s $\color{#35bf28}+3.00\%$
test_exec_td_decorator 0.9459ms 0.2490ms 4.0153 KOps/s 3.9052 KOps/s $\color{#35bf28}+2.82\%$
test_vmap_mlp_speed[True-True] 0.8854ms 0.7031ms 1.4223 KOps/s 1.4558 KOps/s $\color{#d91a1a}-2.30\%$
test_vmap_mlp_speed[True-False] 0.8288ms 0.6677ms 1.4976 KOps/s 1.4561 KOps/s $\color{#35bf28}+2.85\%$
test_vmap_mlp_speed[False-True] 0.7480ms 0.5869ms 1.7037 KOps/s 1.7522 KOps/s $\color{#d91a1a}-2.77\%$
test_vmap_mlp_speed[False-False] 0.7894ms 0.5873ms 1.7027 KOps/s 1.7395 KOps/s $\color{#d91a1a}-2.12\%$
test_vmap_mlp_speed_decorator[True-True] 0.8600ms 0.6673ms 1.4987 KOps/s 1.4899 KOps/s $\color{#35bf28}+0.59\%$
test_vmap_mlp_speed_decorator[True-False] 0.9755ms 0.6784ms 1.4741 KOps/s 1.4865 KOps/s $\color{#d91a1a}-0.83\%$
test_vmap_mlp_speed_decorator[False-True] 0.7845ms 0.5906ms 1.6931 KOps/s 1.6597 KOps/s $\color{#35bf28}+2.01\%$
test_vmap_mlp_speed_decorator[False-False] 0.7427ms 0.5811ms 1.7210 KOps/s 1.7076 KOps/s $\color{#35bf28}+0.79\%$
test_vmap_transformer_speed[True-True] 8.4595ms 8.2950ms 120.5545 Ops/s 119.1847 Ops/s $\color{#35bf28}+1.15\%$
test_vmap_transformer_speed[True-False] 8.4254ms 8.2614ms 121.0454 Ops/s 119.2929 Ops/s $\color{#35bf28}+1.47\%$
test_vmap_transformer_speed[False-True] 8.5172ms 8.0722ms 123.8824 Ops/s 122.2765 Ops/s $\color{#35bf28}+1.31\%$
test_vmap_transformer_speed[False-False] 8.3251ms 8.0825ms 123.7241 Ops/s 122.5102 Ops/s $\color{#35bf28}+0.99\%$
test_vmap_transformer_speed_decorator[True-True] 20.1908ms 19.3874ms 51.5799 Ops/s 51.2024 Ops/s $\color{#35bf28}+0.74\%$
test_vmap_transformer_speed_decorator[True-False] 20.1058ms 19.3591ms 51.6554 Ops/s 51.0241 Ops/s $\color{#35bf28}+1.24\%$
test_vmap_transformer_speed_decorator[False-True] 20.3343ms 19.2786ms 51.8710 Ops/s 51.5653 Ops/s $\color{#35bf28}+0.59\%$
test_vmap_transformer_speed_decorator[False-False] 19.3764ms 19.2352ms 51.9881 Ops/s 51.5128 Ops/s $\color{#35bf28}+0.92\%$
test_to_module_speed[True] 1.9631ms 0.9293ms 1.0761 KOps/s 1.0757 KOps/s $\color{#35bf28}+0.04\%$
test_to_module_speed[False] 1.0036ms 0.9047ms 1.1053 KOps/s 1.1009 KOps/s $\color{#35bf28}+0.40\%$
test_tc_init 66.5610μs 31.2642μs 31.9855 KOps/s 26.9122 KOps/s $\textbf{\color{#35bf28}+18.85\%}$
test_tc_init_nested 86.8310μs 60.7593μs 16.4584 KOps/s 13.0997 KOps/s $\textbf{\color{#35bf28}+25.64\%}$
test_tc_first_layer_tensor 26.7406μs 0.6749μs 1.4817 MOps/s 1.4624 MOps/s $\color{#35bf28}+1.32\%$
test_tc_first_layer_nontensor 0.1730ms 2.2214μs 450.1657 KOps/s 448.4904 KOps/s $\color{#35bf28}+0.37\%$
test_tc_second_layer_tensor 30.2955μs 1.3502μs 740.6051 KOps/s 729.1927 KOps/s $\color{#35bf28}+1.57\%$
test_tc_second_layer_nontensor 0.2054ms 2.9010μs 344.7057 KOps/s 341.5801 KOps/s $\color{#35bf28}+0.92\%$
test_unbind 0.1992s 11.2007ms 89.2798 Ops/s 91.2149 Ops/s $\color{#d91a1a}-2.12\%$
test_full_like 0.7587ms 0.5733ms 1.7444 KOps/s 1.7396 KOps/s $\color{#35bf28}+0.28\%$
test_zeros_like 0.3518ms 0.1983ms 5.0417 KOps/s 5.0410 KOps/s $\color{#35bf28}+0.01\%$
test_ones_like 0.3474ms 0.1981ms 5.0479 KOps/s 5.0466 KOps/s $\color{#35bf28}+0.03\%$
test_clone 0.5717ms 0.4151ms 2.4092 KOps/s 2.4157 KOps/s $\color{#d91a1a}-0.27\%$
test_squeeze 33.7310μs 9.8503μs 101.5197 KOps/s 102.6478 KOps/s $\color{#d91a1a}-1.10\%$
test_unsqueeze 0.2877ms 72.3038μs 13.8305 KOps/s 13.5723 KOps/s $\color{#35bf28}+1.90\%$
test_split 0.2579ms 0.1583ms 6.3182 KOps/s 6.4760 KOps/s $\color{#d91a1a}-2.44\%$
test_permute 0.3097ms 0.1749ms 5.7186 KOps/s 5.7779 KOps/s $\color{#d91a1a}-1.03\%$
test_stack 1.3490ms 0.8491ms 1.1777 KOps/s 1.1698 KOps/s $\color{#35bf28}+0.67\%$
test_cat 1.3595ms 1.2322ms 811.5864 Ops/s 811.5487 Ops/s $+0.00\%$
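
A note on reading the Change column: the reported values are consistent with the relative difference between the "Ops" column (this PR) and the "Ops on Repo HEAD" column, expressed as a percentage of the HEAD value. The sketch below reproduces two rows under that assumption. The helper name `relative_change` is hypothetical and is not taken from the benchmark tooling, whose actual implementation is not shown here.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percentage change of PR throughput relative to repo HEAD throughput.

    Assumption: this mirrors how the Change column is derived; the bot's
    real code is not part of this page.
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) copied from the table above, both in KOps/s.
rows = {
    "test_set": (25.3632, 23.7530),
    "test_distributed": (5.5700, 8.6588),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

Both columns must be in the same unit before comparing; a few rows mix KOps/s and Ops/s (for example `test_memmaptd_index_op`), so those need converting first.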

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 9, 2024
ghstack-source-id: 6589677e9635efb3bcffb7e639b69e346008b09a
Pull Request resolved: #954
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 16, 2024
ghstack-source-id: 0cd65696a91d83674212ca9a62dce02d1cabf44d
Pull Request resolved: #954
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Performance