[Performance] Remove list against list check during set #954
Open: vmoens wants to merge 3 commits into gh/vmoens/9/base from gh/vmoens/9/head
vmoens added a commit that referenced this pull request on Aug 9, 2024
ghstack-source-id: 9fb608e476b16a323ebbe8198e2ca94463007f58 Pull Request resolved: #954
facebook-github-bot added the CLA Signed label on Aug 9, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 40.1150μs | 19.7383μs | 50.6629 KOps/s | 49.4665 KOps/s | |
test_plain_set_stack_nested | 46.4980μs | 20.2945μs | 49.2744 KOps/s | 49.3889 KOps/s | |
test_plain_set_nested_inplace | 53.3890μs | 21.4868μs | 46.5402 KOps/s | 46.1589 KOps/s | |
test_plain_set_stack_nested_inplace | 53.9020μs | 21.1621μs | 47.2543 KOps/s | 46.5852 KOps/s | |
test_items | 21.8310μs | 4.0399μs | 247.5284 KOps/s | 230.5434 KOps/s | |
test_items_nested | 0.7802ms | 0.3595ms | 2.7814 KOps/s | 2.7969 KOps/s | |
test_items_nested_locked | 0.5382ms | 0.3567ms | 2.8036 KOps/s | 2.7812 KOps/s | |
test_items_nested_leaf | 0.1307ms | 68.4107μs | 14.6176 KOps/s | 14.3081 KOps/s | |
test_items_stack_nested | 0.6691ms | 0.3617ms | 2.7646 KOps/s | 2.7581 KOps/s | |
test_items_stack_nested_leaf | 0.1701ms | 68.4680μs | 14.6054 KOps/s | 13.8080 KOps/s | |
test_items_stack_nested_locked | 0.5432ms | 0.3598ms | 2.7791 KOps/s | 2.7542 KOps/s | |
test_keys | 22.4120μs | 3.6950μs | 270.6362 KOps/s | 281.3704 KOps/s | |
test_keys_nested | 0.2149ms | 0.1008ms | 9.9186 KOps/s | 9.9594 KOps/s | |
test_keys_nested_locked | 1.9710ms | 0.1049ms | 9.5312 KOps/s | 9.3676 KOps/s | |
test_keys_nested_leaf | 0.1724ms | 82.7978μs | 12.0776 KOps/s | 11.9943 KOps/s | |
test_keys_stack_nested | 0.1669ms | 0.1003ms | 9.9696 KOps/s | 9.9573 KOps/s | |
test_keys_stack_nested_leaf | 0.1462ms | 82.4668μs | 12.1261 KOps/s | 12.1244 KOps/s | |
test_keys_stack_nested_locked | 0.2150ms | 0.1065ms | 9.3907 KOps/s | 9.5643 KOps/s | |
test_values | 12.9216μs | 1.0982μs | 910.5636 KOps/s | 900.0454 KOps/s | |
test_values_nested | 0.1196ms | 72.3659μs | 13.8187 KOps/s | 13.8428 KOps/s | |
test_values_nested_locked | 0.1328ms | 72.9082μs | 13.7159 KOps/s | 13.8559 KOps/s | |
test_values_nested_leaf | 0.1362ms | 61.4192μs | 16.2816 KOps/s | 16.0987 KOps/s | |
test_values_stack_nested | 0.1284ms | 73.5232μs | 13.6012 KOps/s | 13.9013 KOps/s | |
test_values_stack_nested_leaf | 0.1131ms | 61.9023μs | 16.1545 KOps/s | 16.9245 KOps/s | |
test_values_stack_nested_locked | 0.1308ms | 73.2136μs | 13.6587 KOps/s | 13.9002 KOps/s | |
test_membership | 14.2470μs | 0.8653μs | 1.1557 MOps/s | 1.1593 MOps/s | |
test_membership_nested | 18.2350μs | 2.7370μs | 365.3684 KOps/s | 354.4081 KOps/s | |
test_membership_nested_leaf | 27.9620μs | 2.7241μs | 367.0980 KOps/s | 361.5868 KOps/s | |
test_membership_stacked_nested | 26.2900μs | 2.7221μs | 367.3620 KOps/s | 360.2826 KOps/s | |
test_membership_stacked_nested_leaf | 26.2790μs | 2.7283μs | 366.5230 KOps/s | 356.9203 KOps/s | |
test_membership_nested_last | 24.9770μs | 3.9752μs | 251.5590 KOps/s | 248.8915 KOps/s | |
test_membership_nested_leaf_last | 27.3620μs | 3.9141μs | 255.4896 KOps/s | 250.7186 KOps/s | |
test_membership_stacked_nested_last | 26.1890μs | 3.9247μs | 254.7972 KOps/s | 218.3794 KOps/s | |
test_membership_stacked_nested_leaf_last | 24.3260μs | 3.8868μs | 257.2782 KOps/s | 215.4700 KOps/s | |
test_nested_getleaf | 34.2640μs | 10.6373μs | 94.0088 KOps/s | 93.3134 KOps/s | |
test_nested_get | 31.6200μs | 10.1316μs | 98.7009 KOps/s | 98.0295 KOps/s | |
test_stacked_getleaf | 33.2630μs | 10.6685μs | 93.7343 KOps/s | 93.5372 KOps/s | |
test_stacked_get | 30.6270μs | 10.1489μs | 98.5333 KOps/s | 96.6882 KOps/s | |
test_nested_getitemleaf | 54.2020μs | 10.9607μs | 91.2352 KOps/s | 89.4542 KOps/s | |
test_nested_getitem | 30.9880μs | 10.1212μs | 98.8022 KOps/s | 97.1118 KOps/s | |
test_stacked_getitemleaf | 33.9840μs | 10.8612μs | 92.0706 KOps/s | 90.9896 KOps/s | |
test_stacked_getitem | 30.9080μs | 10.0860μs | 99.1474 KOps/s | 97.1304 KOps/s | |
test_lock_nested | 83.4084ms | 0.5750ms | 1.7392 KOps/s | 2.0848 KOps/s | |
test_lock_stack_nested | 0.6984ms | 0.4535ms | 2.2050 KOps/s | 2.2707 KOps/s | |
test_unlock_nested | 85.3088ms | 0.4933ms | 2.0273 KOps/s | 2.4705 KOps/s | |
test_unlock_stack_nested | 0.8051ms | 0.3708ms | 2.6971 KOps/s | 2.7645 KOps/s | |
test_flatten_speed | 0.1849ms | 89.9147μs | 11.1217 KOps/s | 11.3814 KOps/s | |
test_unflatten_speed | 1.0302ms | 0.4563ms | 2.1913 KOps/s | 2.1855 KOps/s | |
test_common_ops | 4.5597ms | 1.0761ms | 929.2457 Ops/s | 899.3570 Ops/s | |
test_creation | 38.5730μs | 2.1173μs | 472.2923 KOps/s | 479.8236 KOps/s | |
test_creation_empty | 45.6060μs | 16.4156μs | 60.9176 KOps/s | 57.4417 KOps/s | |
test_creation_nested_1 | 54.2920μs | 19.7473μs | 50.6399 KOps/s | 50.0200 KOps/s | |
test_creation_nested_2 | 78.8280μs | 24.0381μs | 41.6007 KOps/s | 40.6807 KOps/s | |
test_clone | 65.4330μs | 16.7967μs | 59.5357 KOps/s | 59.7133 KOps/s | |
test_getitem[int] | 1.1702ms | 16.5771μs | 60.3242 KOps/s | 61.7555 KOps/s | |
test_getitem[slice_int] | 0.1585ms | 30.9906μs | 32.2679 KOps/s | 33.1283 KOps/s | |
test_getitem[range] | 0.2089ms | 56.8147μs | 17.6011 KOps/s | 17.5401 KOps/s | |
test_getitem[tuple] | 0.1416ms | 25.7417μs | 38.8475 KOps/s | 40.9821 KOps/s | |
test_getitem[list] | 0.1787ms | 52.3391μs | 19.1062 KOps/s | 18.9491 KOps/s | |
test_setitem_dim[int] | 55.6340μs | 32.2247μs | 31.0321 KOps/s | 31.1241 KOps/s | |
test_setitem_dim[slice_int] | 0.1441ms | 62.6223μs | 15.9687 KOps/s | 16.7338 KOps/s | |
test_setitem_dim[range] | 0.1359ms | 84.1587μs | 11.8823 KOps/s | 12.1824 KOps/s | |
test_setitem_dim[tuple] | 95.7400μs | 49.1811μs | 20.3330 KOps/s | 20.6509 KOps/s | |
test_setitem | 93.9060μs | 28.2929μs | 35.3446 KOps/s | 33.7670 KOps/s | |
test_set | 84.5690μs | 27.5319μs | 36.3215 KOps/s | 35.2861 KOps/s | |
test_set_shared | 1.2832ms | 0.2112ms | 4.7353 KOps/s | 4.7759 KOps/s | |
test_update | 0.1311ms | 34.2261μs | 29.2175 KOps/s | 28.7204 KOps/s | |
test_update_nested | 0.1187ms | 43.8452μs | 22.8075 KOps/s | 22.4017 KOps/s | |
test_update__nested | 86.2620μs | 33.8968μs | 29.5013 KOps/s | 29.2102 KOps/s | |
test_set_nested | 0.1273ms | 30.3037μs | 32.9992 KOps/s | 31.9697 KOps/s | |
test_set_nested_new | 0.1330ms | 35.2767μs | 28.3473 KOps/s | 27.5469 KOps/s | |
test_select | 0.1516ms | 52.4972μs | 19.0486 KOps/s | 18.9478 KOps/s | |
test_select_nested | 0.1149ms | 59.2944μs | 16.8650 KOps/s | 16.6737 KOps/s | |
test_exclude_nested | 0.1650ms | 75.0296μs | 13.3281 KOps/s | 13.2260 KOps/s | |
test_empty[True] | 0.4762ms | 0.3167ms | 3.1578 KOps/s | 3.1281 KOps/s | |
test_empty[False] | 9.8535μs | 1.1703μs | 854.4639 KOps/s | 816.8465 KOps/s | |
test_unbind_speed | 0.6077ms | 0.3053ms | 3.2750 KOps/s | 3.4478 KOps/s | |
test_unbind_speed_stack0 | 0.4739ms | 0.2974ms | 3.3628 KOps/s | 3.5477 KOps/s | |
test_unbind_speed_stack1 | 87.2242ms | 0.8038ms | 1.2442 KOps/s | 1.5362 KOps/s | |
test_split | 87.1550ms | 2.1521ms | 464.6529 Ops/s | 476.1876 Ops/s | |
test_chunk | 3.0461ms | 1.9780ms | 505.5665 Ops/s | 466.7409 Ops/s | |
test_creation[device0] | 0.3829ms | 0.1185ms | 8.4362 KOps/s | 8.7649 KOps/s | |
test_creation_from_tensor | 0.2904ms | 0.1166ms | 8.5762 KOps/s | 8.6368 KOps/s | |
test_add_one[memmap_tensor0] | 0.1734ms | 7.3864μs | 135.3832 KOps/s | 143.0983 KOps/s | |
test_contiguous[memmap_tensor0] | 24.0450μs | 1.8929μs | 528.3024 KOps/s | 532.4148 KOps/s | |
test_stack[memmap_tensor0] | 49.0420μs | 5.8472μs | 171.0223 KOps/s | 184.8916 KOps/s | |
test_memmaptd_index | 1.0773ms | 0.3933ms | 2.5425 KOps/s | 2.6050 KOps/s | |
test_memmaptd_index_astensor | 0.9287ms | 0.4709ms | 2.1235 KOps/s | 2.1617 KOps/s | |
test_memmaptd_index_op | 91.4908ms | 1.0751ms | 930.1229 Ops/s | 1.0144 KOps/s | |
test_serialize_model | 0.1230s | 0.1158s | 8.6391 Ops/s | 8.2147 Ops/s | |
test_serialize_model_pickle | 0.4587s | 0.3899s | 2.5645 Ops/s | 2.5436 Ops/s | |
test_serialize_weights | 0.1228s | 0.1153s | 8.6764 Ops/s | 8.6851 Ops/s | |
test_serialize_weights_returnearly | 0.2398s | 0.1687s | 5.9262 Ops/s | 6.2730 Ops/s | |
test_serialize_weights_pickle | 0.9513s | 0.6515s | 1.5350 Ops/s | 2.5138 Ops/s | |
test_serialize_weights_filesystem | 0.1450s | 0.1385s | 7.2192 Ops/s | 7.1314 Ops/s | |
test_serialize_model_filesystem | 0.2341s | 0.1516s | 6.5942 Ops/s | 6.6695 Ops/s | |
test_reshape_pytree | 0.1105ms | 38.9120μs | 25.6990 KOps/s | 25.8018 KOps/s | |
test_reshape_td | 83.5260μs | 44.0732μs | 22.6895 KOps/s | 22.0447 KOps/s | |
test_view_pytree | 84.6490μs | 37.9520μs | 26.3490 KOps/s | 26.1419 KOps/s | |
test_view_td | 0.1132ms | 49.8092μs | 20.0766 KOps/s | 20.1108 KOps/s | |
test_unbind_pytree | 94.0170μs | 35.6310μs | 28.0654 KOps/s | 27.8812 KOps/s | |
test_unbind_td | 0.3375ms | 44.9787μs | 22.2327 KOps/s | 23.1235 KOps/s | |
test_split_pytree | 83.8180μs | 37.6583μs | 26.5545 KOps/s | 27.0081 KOps/s | |
test_split_td | 0.4676ms | 56.4765μs | 17.7065 KOps/s | 17.7346 KOps/s | |
test_add_pytree | 0.1006ms | 44.3564μs | 22.5447 KOps/s | 22.5632 KOps/s | |
test_add_td | 0.1484ms | 80.3130μs | 12.4513 KOps/s | 12.4171 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1342ms | 59.1002μs | 16.9204 KOps/s | 17.5723 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2949ms | 0.1746ms | 5.7279 KOps/s | 5.6538 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1317ms | 57.3965μs | 17.4227 KOps/s | 17.6222 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2580ms | 0.1394ms | 7.1732 KOps/s | 7.2673 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 63.5690μs | 21.6542μs | 46.1803 KOps/s | 46.7066 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1566ms | 66.3453μs | 15.0727 KOps/s | 14.8424 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1250ms | 75.3266μs | 13.2755 KOps/s | 13.3163 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1357ms | 67.2917μs | 14.8607 KOps/s | 14.5351 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2379ms | 0.1739ms | 5.7504 KOps/s | 5.7786 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4189ms | 0.1902ms | 5.2583 KOps/s | 5.4181 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1054ms | 47.7872μs | 20.9261 KOps/s | 21.6364 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1509ms | 68.2788μs | 14.6458 KOps/s | 13.9306 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3360ms | 0.1731ms | 5.7783 KOps/s | 5.7399 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5858ms | 0.2874ms | 3.4790 KOps/s | 3.4738 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4320ms | 0.2070ms | 4.8300 KOps/s | 4.9238 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2941ms | 0.1747ms | 5.7245 KOps/s | 5.7158 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1418ms | 62.5742μs | 15.9810 KOps/s | 16.3850 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1373ms | 49.4281μs | 20.2314 KOps/s | 20.7620 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3521ms | 0.2355ms | 4.2469 KOps/s | 4.3147 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3324ms | 0.1759ms | 5.6859 KOps/s | 5.7417 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2015ms | 0.1022ms | 9.7831 KOps/s | 9.7824 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1220ms | 57.2601μs | 17.4642 KOps/s | 17.5926 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1667ms | 78.3834μs | 12.7578 KOps/s | 12.9804 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1491ms | 70.4844μs | 14.1875 KOps/s | 14.5183 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3119ms | 0.1947ms | 5.1353 KOps/s | 5.1127 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.9035ms | 1.6716ms | 598.2164 Ops/s | 620.7128 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2674ms | 0.1933ms | 5.1729 KOps/s | 5.1863 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.8682ms | 1.1010ms | 908.2541 Ops/s | 920.9341 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5242ms | 0.4190ms | 2.3864 KOps/s | 2.3692 KOps/s | |
test_compile_assign_and_add_stack[eager] | 5.8409ms | 3.8134ms | 262.2363 Ops/s | 263.7058 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1017ms | 36.2895μs | 27.5562 KOps/s | 29.2975 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.9444ms | 47.6049μs | 21.0062 KOps/s | 21.2079 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 81.3620μs | 30.8010μs | 32.4665 KOps/s | 34.6148 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 95.2790μs | 28.3546μs | 35.2676 KOps/s | 35.4436 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 79.7800μs | 29.5814μs | 33.8050 KOps/s | 34.4073 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 78.8280μs | 28.0214μs | 35.6870 KOps/s | 36.3650 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1653ms | 74.4563μs | 13.4307 KOps/s | 13.7149 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5135ms | 27.9799μs | 35.7399 KOps/s | 37.2837 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1428ms | 68.3523μs | 14.6301 KOps/s | 14.9338 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 64.6410μs | 23.2476μs | 43.0151 KOps/s | 43.8955 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1475ms | 67.7079μs | 14.7693 KOps/s | 14.8495 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 84.5680μs | 23.2714μs | 42.9712 KOps/s | 44.5776 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1472ms | 74.1242μs | 13.4909 KOps/s | 13.8829 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9792ms | 27.4392μs | 36.4443 KOps/s | 38.1500 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1440ms | 68.2227μs | 14.6579 KOps/s | 14.9970 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 77.0350μs | 22.7952μs | 43.8688 KOps/s | 45.0027 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1301ms | 67.8075μs | 14.7476 KOps/s | 14.9349 KOps/s | |
test_compile_indexing[int-pytree-eager] | 62.5480μs | 23.2819μs | 42.9518 KOps/s | 44.8385 KOps/s | |
test_mod_add[eager] | 75.7930μs | 24.0139μs | 41.6425 KOps/s | 42.7124 KOps/s | |
test_mod_add[compile] | 0.1008ms | 38.7651μs | 25.7964 KOps/s | 26.3698 KOps/s | |
test_mod_add[compile-overhead] | 91.0810μs | 38.8575μs | 25.7351 KOps/s | 26.0766 KOps/s | |
test_mod_wrap[eager] | 0.4429ms | 0.2079ms | 4.8104 KOps/s | 4.9358 KOps/s | |
test_mod_wrap[compile] | 0.4581ms | 0.2344ms | 4.2655 KOps/s | 4.3054 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3418ms | 0.2309ms | 4.3308 KOps/s | 4.3035 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.2713ms | 10.6950ms | 93.5019 Ops/s | 92.5880 Ops/s | |
test_mod_wrap_and_backward[compile] | 11.6035ms | 10.7727ms | 92.8272 Ops/s | 93.6174 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.6803ms | 10.7957ms | 92.6295 Ops/s | 87.2737 Ops/s | |
test_seq_add[eager] | 0.1665ms | 88.6449μs | 11.2810 KOps/s | 11.6040 KOps/s | |
test_seq_add[compile] | 0.1417ms | 65.6393μs | 15.2348 KOps/s | 15.8737 KOps/s | |
test_seq_add[compile-overhead] | 0.1378ms | 63.7166μs | 15.6945 KOps/s | 15.7554 KOps/s | |
test_seq_wrap[eager] | 0.6874ms | 0.3788ms | 2.6397 KOps/s | 2.6980 KOps/s | |
test_seq_wrap[compile] | 1.3432ms | 0.2714ms | 3.6850 KOps/s | 3.6872 KOps/s | |
test_seq_wrap[compile-overhead] | 1.2683ms | 0.2679ms | 3.7331 KOps/s | 3.6760 KOps/s | |
test_func_call_runtime[False-eager] | 1.1731ms | 0.5180ms | 1.9303 KOps/s | 1.9782 KOps/s | |
test_func_call_runtime[False-compile] | 0.6940ms | 0.5009ms | 1.9965 KOps/s | 2.0122 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.9481ms | 0.5018ms | 1.9929 KOps/s | 2.0027 KOps/s | |
test_func_call_runtime[True-eager] | 1.0545ms | 0.7425ms | 1.3469 KOps/s | 1.3789 KOps/s | |
test_func_call_runtime[True-compile] | 0.7843ms | 0.5149ms | 1.9423 KOps/s | 1.9605 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.9108ms | 0.5140ms | 1.9455 KOps/s | 1.9474 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.6257ms | 0.5055ms | 1.9784 KOps/s | 2.0186 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6120ms | 0.5022ms | 1.9914 KOps/s | 1.9979 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.9074ms | 0.5003ms | 1.9988 KOps/s | 1.9817 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1836ms | 0.8527ms | 1.1728 KOps/s | 1.1882 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.2212ms | 0.7398ms | 1.3516 KOps/s | 1.3636 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.8507ms | 0.7379ms | 1.3551 KOps/s | 1.3691 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4606ms | 1.8739ms | 533.6467 Ops/s | 540.5956 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 3.1423ms | 1.9268ms | 518.9826 Ops/s | 532.1211 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.6259ms | 1.9123ms | 522.9173 Ops/s | 531.4000 Ops/s | |
test_distributed | 0.2486ms | 0.1240ms | 8.0641 KOps/s | 7.9505 KOps/s | |
test_tdmodule | 68.6090μs | 16.3478μs | 61.1703 KOps/s | 58.7542 KOps/s | |
test_tdmodule_dispatch | 52.5290μs | 33.8422μs | 29.5489 KOps/s | 27.5501 KOps/s | |
test_tdseq | 35.4260μs | 19.5187μs | 51.2328 KOps/s | 48.8280 KOps/s | |
test_tdseq_dispatch | 74.7600μs | 39.7569μs | 25.1529 KOps/s | 24.5698 KOps/s | |
test_instantiation_functorch | 1.7537ms | 1.5367ms | 650.7400 Ops/s | 629.1895 Ops/s | |
test_instantiation_td | 1.7789ms | 1.1467ms | 872.0914 Ops/s | 856.1449 Ops/s | |
test_exec_functorch | 0.3234ms | 0.1834ms | 5.4512 KOps/s | 5.4747 KOps/s | |
test_exec_functional_call | 0.3242ms | 0.1705ms | 5.8634 KOps/s | 5.7526 KOps/s | |
test_exec_td | 0.2282ms | 0.1652ms | 6.0534 KOps/s | 6.1184 KOps/s | |
test_exec_td_decorator | 0.5304ms | 0.2185ms | 4.5769 KOps/s | 4.5922 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8112ms | 0.6415ms | 1.5589 KOps/s | 1.5824 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.9295ms | 0.6402ms | 1.5621 KOps/s | 1.5896 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.5947ms | 0.4957ms | 2.0173 KOps/s | 2.0494 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6404ms | 0.4994ms | 2.0024 KOps/s | 2.0526 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2810ms | 0.6255ms | 1.5986 KOps/s | 1.6368 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8506ms | 0.6248ms | 1.6004 KOps/s | 1.6316 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9170ms | 0.5174ms | 1.9327 KOps/s | 1.9902 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7535ms | 0.5174ms | 1.9326 KOps/s | 1.9869 KOps/s | |
test_to_module_speed[True] | 1.4744ms | 1.2706ms | 787.0093 Ops/s | 779.1970 Ops/s | |
test_to_module_speed[False] | 1.7253ms | 1.2453ms | 802.9993 Ops/s | 797.9852 Ops/s | |
test_tc_init | 77.4350μs | 42.7406μs | 23.3970 KOps/s | 22.8959 KOps/s | |
test_tc_init_nested | 0.1637ms | 83.6571μs | 11.9536 KOps/s | 11.3049 KOps/s | |
test_tc_first_layer_tensor | 17.6530μs | 1.5743μs | 635.2119 KOps/s | 670.2046 KOps/s | |
test_tc_first_layer_nontensor | 27.6410μs | 4.7234μs | 211.7140 KOps/s | 214.7977 KOps/s | |
test_tc_second_layer_tensor | 28.3430μs | 2.8386μs | 352.2870 KOps/s | 360.6663 KOps/s | |
test_tc_second_layer_nontensor | 30.1560μs | 6.0971μs | 164.0134 KOps/s | 169.6579 KOps/s | |
test_unbind | 0.4697s | 13.0217ms | 76.7949 Ops/s | 61.0307 Ops/s | |
test_full_like | 8.2763ms | 6.9762ms | 143.3447 Ops/s | 143.0847 Ops/s | |
test_zeros_like | 3.0897ms | 2.6976ms | 370.7015 Ops/s | 154.0677 Ops/s | |
test_ones_like | 3.5594ms | 3.1534ms | 317.1162 Ops/s | 132.6648 Ops/s | |
test_clone | 5.6861ms | 5.1173ms | 195.4159 Ops/s | 108.3076 Ops/s | |
test_squeeze | 59.7620μs | 12.5986μs | 79.3738 KOps/s | 83.3463 KOps/s | |
test_unsqueeze | 0.1548ms | 91.4849μs | 10.9308 KOps/s | 11.1006 KOps/s | |
test_split | 0.5841ms | 0.1929ms | 5.1836 KOps/s | 5.1563 KOps/s | |
test_permute | 0.4502ms | 0.2185ms | 4.5764 KOps/s | 4.6533 KOps/s | |
test_stack | 32.5944ms | 26.5675ms | 37.6399 Ops/s | 39.3647 Ops/s | |
test_cat | 31.4787ms | 26.1053ms | 38.3064 Ops/s | 38.9534 Ops/s |
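
The Change column in the table above is empty in this capture. The following is a minimal sketch of how it could be filled in, assuming that Change means the fractional difference between Ops and Ops on Repo HEAD. The formula the benchmark bot actually uses is not shown on this page, and `parse_ops` and `relative_change` are hypothetical helper names introduced only for illustration. The sample rows are copied verbatim from the table.

```python
# Multipliers for the unit prefixes that appear in the two "Ops" columns.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '50.6629 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(row: str):
    """Return (name, fractional change of Ops versus Ops on Repo HEAD).

    Assumption: change = (ops - head) / head. This is an illustrative
    definition, not necessarily the one used by the benchmark bot.
    """
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])   # "Ops" column (this PR)
    head = parse_ops(cells[4])  # "Ops on Repo HEAD" column
    return name, (ops - head) / head


# Rows copied from the first table.
rows = [
    "test_plain_set_nested | 40.1150μs | 19.7383μs | 50.6629 KOps/s | 49.4665 KOps/s | |",
    "test_memmaptd_index_op | 91.4908ms | 1.0751ms | 930.1229 Ops/s | 1.0144 KOps/s | |",
    "test_zeros_like | 3.0897ms | 2.6976ms | 370.7015 Ops/s | 154.0677 Ops/s | |",
]

changes = dict(relative_change(r) for r in rows)
for name, change in changes.items():
    print(f"{name}: {change:+.1%}")
```

Normalizing the units first matters because a single row can mix `Ops/s` and `KOps/s`, as `test_memmaptd_index_op` does. Individual rows with a large Max relative to Mean are likely to be noisy, so a per-row change computed this way is best read as a rough indication.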

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1250ms | 13.2364μs | 75.5492 KOps/s | 65.8566 KOps/s | |
test_plain_set_stack_nested | 73.3020μs | 13.2481μs | 75.4826 KOps/s | 66.7855 KOps/s | |
test_plain_set_nested_inplace | 42.7810μs | 14.3277μs | 69.7951 KOps/s | 61.3319 KOps/s | |
test_plain_set_stack_nested_inplace | 47.4410μs | 14.3021μs | 69.9200 KOps/s | 61.9321 KOps/s | |
test_items | 23.5900μs | 2.8730μs | 348.0636 KOps/s | 342.9892 KOps/s | |
test_items_nested | 0.3817ms | 0.3234ms | 3.0917 KOps/s | 3.0501 KOps/s | |
test_items_nested_locked | 0.3604ms | 0.3271ms | 3.0569 KOps/s | 3.0550 KOps/s | |
test_items_nested_leaf | 94.9020μs | 55.5071μs | 18.0157 KOps/s | 17.9417 KOps/s | |
test_items_stack_nested | 0.3662ms | 0.3245ms | 3.0816 KOps/s | 3.0634 KOps/s | |
test_items_stack_nested_leaf | 0.1737ms | 57.6431μs | 17.3481 KOps/s | 17.6028 KOps/s | |
test_items_stack_nested_locked | 0.4028ms | 0.3237ms | 3.0888 KOps/s | 3.0529 KOps/s | |
test_keys | 23.9000μs | 3.4331μs | 291.2834 KOps/s | 292.4002 KOps/s | |
test_keys_nested | 89.6620μs | 55.8858μs | 17.8936 KOps/s | 17.7629 KOps/s | |
test_keys_nested_locked | 2.5901ms | 61.8303μs | 16.1733 KOps/s | 16.0611 KOps/s | |
test_keys_nested_leaf | 75.7120μs | 47.2920μs | 21.1452 KOps/s | 20.9677 KOps/s | |
test_keys_stack_nested | 80.9020μs | 55.9039μs | 17.8878 KOps/s | 17.8435 KOps/s | |
test_keys_stack_nested_leaf | 0.2127ms | 48.0721μs | 20.8021 KOps/s | 20.7951 KOps/s | |
test_keys_stack_nested_locked | 86.5620μs | 61.2300μs | 16.3319 KOps/s | 16.4103 KOps/s | |
test_values | 19.7835μs | 0.8422μs | 1.1874 MOps/s | 1.1823 MOps/s | |
test_values_nested | 67.1010μs | 40.6904μs | 24.5758 KOps/s | 24.5629 KOps/s | |
test_values_nested_locked | 75.2520μs | 42.5573μs | 23.4977 KOps/s | 23.4911 KOps/s | |
test_values_nested_leaf | 99.6120μs | 34.9466μs | 28.6151 KOps/s | 28.2941 KOps/s | |
test_values_stack_nested | 97.0120μs | 41.3036μs | 24.2110 KOps/s | 24.0815 KOps/s | |
test_values_stack_nested_leaf | 60.6020μs | 35.6004μs | 28.0896 KOps/s | 28.0298 KOps/s | |
test_values_stack_nested_locked | 0.1646ms | 43.2372μs | 23.1283 KOps/s | 22.8627 KOps/s | |
test_membership | 1.6251μs | 0.5181μs | 1.9301 MOps/s | 1.9659 MOps/s | |
test_membership_nested | 16.5055μs | 1.8679μs | 535.3643 KOps/s | 533.1051 KOps/s | |
test_membership_nested_leaf | 11.5237μs | 1.8428μs | 542.6513 KOps/s | 554.1728 KOps/s | |
test_membership_stacked_nested | 30.1300μs | 1.8849μs | 530.5445 KOps/s | 524.7403 KOps/s | |
test_membership_stacked_nested_leaf | 22.5110μs | 1.8918μs | 528.5969 KOps/s | 524.7917 KOps/s | |
test_membership_nested_last | 23.3910μs | 2.7692μs | 361.1140 KOps/s | 361.7588 KOps/s | |
test_membership_nested_leaf_last | 27.2500μs | 2.7463μs | 364.1267 KOps/s | 356.9117 KOps/s | |
test_membership_stacked_nested_last | 33.3810μs | 4.1650μs | 240.0958 KOps/s | 126.9178 KOps/s | |
test_membership_stacked_nested_leaf_last | 32.4810μs | 4.1661μs | 240.0343 KOps/s | 127.6834 KOps/s | |
test_nested_getleaf | 91.3920μs | 6.1048μs | 163.8044 KOps/s | 163.6844 KOps/s | |
test_nested_get | 36.3110μs | 5.6836μs | 175.9443 KOps/s | 172.2945 KOps/s | |
test_stacked_getleaf | 47.4410μs | 6.0234μs | 166.0193 KOps/s | 165.0052 KOps/s | |
test_stacked_get | 37.4210μs | 5.7073μs | 175.2135 KOps/s | 175.4099 KOps/s | |
test_nested_getitemleaf | 31.5210μs | 6.1418μs | 162.8196 KOps/s | 162.4786 KOps/s | |
test_nested_getitem | 30.6400μs | 5.7451μs | 174.0610 KOps/s | 172.2221 KOps/s | |
test_stacked_getitemleaf | 30.7910μs | 6.1060μs | 163.7745 KOps/s | 163.2915 KOps/s | |
test_stacked_getitem | 35.7210μs | 5.8256μs | 171.6557 KOps/s | 175.7970 KOps/s | |
test_lock_nested | 4.6245ms | 0.4189ms | 2.3871 KOps/s | 2.3990 KOps/s | |
test_lock_stack_nested | 0.4085ms | 0.3776ms | 2.6481 KOps/s | 2.7039 KOps/s | |
test_unlock_nested | 0.7467ms | 0.3541ms | 2.8240 KOps/s | 2.8256 KOps/s | |
test_unlock_stack_nested | 0.4169ms | 0.3163ms | 3.1618 KOps/s | 3.2479 KOps/s | |
test_flatten_speed | 0.1005ms | 68.7900μs | 14.5370 KOps/s | 14.1737 KOps/s | |
test_unflatten_speed | 0.3145ms | 0.2741ms | 3.6489 KOps/s | 3.5974 KOps/s | |
test_common_ops | 1.5053ms | 1.1991ms | 833.9594 Ops/s | 774.8681 Ops/s | |
test_creation | 20.9000μs | 1.4828μs | 674.4181 KOps/s | 669.0362 KOps/s | |
test_creation_empty | 55.2010μs | 14.1384μs | 70.7295 KOps/s | 57.5185 KOps/s | |
test_creation_nested_1 | 51.1320μs | 15.8475μs | 63.1016 KOps/s | 52.2314 KOps/s | |
test_creation_nested_2 | 46.2910μs | 18.9768μs | 52.6960 KOps/s | 46.0618 KOps/s | |
test_clone | 1.2965ms | 29.4044μs | 34.0085 KOps/s | 34.8103 KOps/s | |
test_getitem[int] | 94.8104ms | 23.2878μs | 42.9410 KOps/s | 62.8846 KOps/s | |
test_getitem[slice_int] | 0.1265ms | 27.3390μs | 36.5778 KOps/s | 34.5945 KOps/s | |
test_getitem[range] | 0.2244ms | 0.1088ms | 9.1938 KOps/s | 9.1292 KOps/s | |
test_getitem[tuple] | 0.1709ms | 23.5668μs | 42.4326 KOps/s | 42.2708 KOps/s | |
test_getitem[list] | 0.2714ms | 97.6893μs | 10.2365 KOps/s | 10.1830 KOps/s | |
test_setitem_dim[int] | 0.1866ms | 44.2963μs | 22.5752 KOps/s | 22.5472 KOps/s | |
test_setitem_dim[slice_int] | 0.2099ms | 67.0308μs | 14.9185 KOps/s | 14.6755 KOps/s | |
test_setitem_dim[range] | 0.1610ms | 0.1265ms | 7.9080 KOps/s | 7.5296 KOps/s | |
test_setitem_dim[tuple] | 0.1767ms | 60.1995μs | 16.6114 KOps/s | 15.7025 KOps/s | |
test_setitem | 0.1915ms | 41.1951μs | 24.2747 KOps/s | 22.8983 KOps/s | |
test_set | 0.1922ms | 39.4272μs | 25.3632 KOps/s | 23.7530 KOps/s | |
test_set_shared | 0.3456ms | 49.8045μs | 20.0785 KOps/s | 19.6626 KOps/s | |
test_update | 0.2031ms | 47.7122μs | 20.9590 KOps/s | 18.9553 KOps/s | |
test_update_nested | 0.2286ms | 56.1285μs | 17.8163 KOps/s | 17.0954 KOps/s | |
test_update__nested | 0.2501ms | 61.4439μs | 16.2750 KOps/s | 17.2737 KOps/s | |
test_set_nested | 0.2234ms | 41.3900μs | 24.1604 KOps/s | 22.6466 KOps/s | |
test_set_nested_new | 0.1943ms | 45.0798μs | 22.1829 KOps/s | 20.9633 KOps/s | |
test_select | 0.2117ms | 57.8807μs | 17.2769 KOps/s | 16.3944 KOps/s | |
test_select_nested | 0.5079ms | 42.8252μs | 23.3508 KOps/s | 23.8097 KOps/s | |
test_exclude_nested | 91.0520μs | 58.1747μs | 17.1896 KOps/s | 17.2558 KOps/s | |
test_empty[True] | 0.3533ms | 0.2390ms | 4.1849 KOps/s | 4.1660 KOps/s | |
test_empty[False] | 3.4011μs | 0.7538μs | 1.3266 MOps/s | 1.3161 MOps/s | |
test_to | 0.1310ms | 25.1156μs | 39.8160 KOps/s | 40.2259 KOps/s | |
test_to_nonblocking | 0.1390ms | 23.7404μs | 42.1222 KOps/s | 40.8327 KOps/s | |
test_unbind_speed | 0.3280ms | 0.2797ms | 3.5759 KOps/s | 3.5676 KOps/s | |
test_unbind_speed_stack0 | 0.3154ms | 0.2740ms | 3.6498 KOps/s | 3.7407 KOps/s | |
test_unbind_speed_stack1 | 94.5170ms | 0.7155ms | 1.3976 KOps/s | 1.4106 KOps/s | |
test_split | 95.4272ms | 2.1688ms | 461.0813 Ops/s | 454.9413 Ops/s | |
test_chunk | 96.5023ms | 2.1590ms | 463.1668 Ops/s | 457.0549 Ops/s | |
test_creation[device0] | 0.3449ms | 0.1244ms | 8.0409 KOps/s | 7.9992 KOps/s | |
test_creation_from_tensor | 0.4003ms | 0.1275ms | 7.8423 KOps/s | 7.8856 KOps/s | |
test_add_one[memmap_tensor0] | 0.1430ms | 8.6606μs | 115.4648 KOps/s | 117.1629 KOps/s | |
test_contiguous[memmap_tensor0] | 38.7410μs | 2.1848μs | 457.7039 KOps/s | 464.0940 KOps/s | |
test_stack[memmap_tensor0] | 0.1029ms | 7.0079μs | 142.6956 KOps/s | 152.0219 KOps/s | |
test_memmaptd_index | 1.0871ms | 0.4209ms | 2.3757 KOps/s | 2.3867 KOps/s | |
test_memmaptd_index_astensor | 0.9845ms | 0.4659ms | 2.1462 KOps/s | 2.1123 KOps/s | |
test_memmaptd_index_op | 1.3719ms | 0.9884ms | 1.0118 KOps/s | 962.4422 Ops/s | |
test_serialize_model | 0.1309s | 0.1302s | 7.6822 Ops/s | 7.7465 Ops/s | |
test_serialize_model_pickle | 1.3468s | 1.2121s | 0.8250 Ops/s | 0.8244 Ops/s | |
test_serialize_weights | 0.1300s | 0.1292s | 7.7373 Ops/s | 7.7597 Ops/s | |
test_serialize_weights_returnearly | 0.2439s | 62.6031ms | 15.9737 Ops/s | 16.1307 Ops/s | |
test_serialize_weights_pickle | 1.3824s | 1.2180s | 0.8210 Ops/s | 0.8217 Ops/s | |
test_reshape_pytree | 0.1229ms | 35.1603μs | 28.4412 KOps/s | 27.9174 KOps/s | |
test_reshape_td | 0.1295ms | 41.4474μs | 24.1270 KOps/s | 24.3349 KOps/s | |
test_view_pytree | 0.1785ms | 34.6882μs | 28.8282 KOps/s | 28.1085 KOps/s | |
test_view_td | 0.1760ms | 46.3988μs | 21.5523 KOps/s | 22.0384 KOps/s | |
test_unbind_pytree | 0.1872ms | 34.2824μs | 29.1695 KOps/s | 28.5954 KOps/s | |
test_unbind_td | 0.3815ms | 42.9875μs | 23.2626 KOps/s | 23.2781 KOps/s | |
test_split_pytree | 0.1000ms | 46.1063μs | 21.6890 KOps/s | 21.7390 KOps/s | |
test_split_td | 0.7066ms | 56.1331μs | 17.8148 KOps/s | 17.2616 KOps/s | |
test_add_pytree | 0.2097ms | 56.1150μs | 17.8206 KOps/s | 17.4201 KOps/s | |
test_add_td | 0.2359ms | 86.1236μs | 11.6112 KOps/s | 10.3555 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4031ms | 0.2081ms | 4.8055 KOps/s | 4.7375 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3000ms | 0.1482ms | 6.7494 KOps/s | 6.6841 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2967ms | 0.1444ms | 6.9247 KOps/s | 6.9577 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.5820ms | 0.1828ms | 5.4697 KOps/s | 5.4090 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.4230ms | 21.2207μs | 47.1238 KOps/s | 47.2906 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1208ms | 43.3247μs | 23.0815 KOps/s | 22.9613 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4573ms | 64.3625μs | 15.5370 KOps/s | 15.5795 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.4734ms | 50.2075μs | 19.9173 KOps/s | 19.8828 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4888ms | 0.3158ms | 3.1669 KOps/s | 3.1734 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.6789ms | 0.2069ms | 4.8324 KOps/s | 4.7850 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2803ms | 0.1282ms | 7.7986 KOps/s | 7.8256 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5414ms | 60.1752μs | 16.6182 KOps/s | 16.6567 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4928ms | 0.3160ms | 3.1642 KOps/s | 3.1791 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 1.0634ms | 0.6216ms | 1.6088 KOps/s | 1.5891 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.6474ms | 0.2468ms | 4.0525 KOps/s | 4.0198 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.7909ms | 0.3166ms | 3.1589 KOps/s | 3.1628 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.5038ms | 69.7863μs | 14.3294 KOps/s | 14.2314 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2800ms | 0.1290ms | 7.7519 KOps/s | 7.8231 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.9551ms | 0.5305ms | 1.8852 KOps/s | 1.8680 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.7690ms | 0.3160ms | 3.1646 KOps/s | 3.1787 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.4452ms | 18.3696μs | 54.4379 KOps/s | 55.7231 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.4885ms | 27.2040μs | 36.7593 KOps/s | 36.8990 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.4666ms | 69.2668μs | 14.4369 KOps/s | 14.5015 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.4751ms | 50.6721μs | 19.7347 KOps/s | 19.4970 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.2880ms | 0.7961ms | 1.2562 KOps/s | 1.1587 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.3780ms | 3.1663ms | 315.8258 Ops/s | 315.8193 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.2628ms | 0.7914ms | 1.2637 KOps/s | 1.1675 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.8064ms | 3.2262ms | 309.9650 Ops/s | 313.2393 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2603ms | 0.1094ms | 9.1393 KOps/s | 8.8513 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.4991ms | 62.7774μs | 15.9293 KOps/s | 15.5797 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2743ms | 0.1074ms | 9.3116 KOps/s | 9.7533 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2248ms | 45.8154μs | 21.8267 KOps/s | 22.9725 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2938ms | 0.1084ms | 9.2288 KOps/s | 9.6602 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2526ms | 45.1751μs | 22.1361 KOps/s | 23.0284 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.3523ms | 0.1401ms | 7.1373 KOps/s | 7.3083 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.2021ms | 25.7267μs | 38.8702 KOps/s | 39.6723 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.6055ms | 0.1308ms | 7.6427 KOps/s | 7.6497 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.4191ms | 20.8558μs | 47.9484 KOps/s | 48.4398 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.5464ms | 0.1356ms | 7.3734 KOps/s | 7.5845 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.4306ms | 20.8162μs | 48.0394 KOps/s | 48.1452 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2859ms | 0.1387ms | 7.2099 KOps/s | 7.1625 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5119ms | 24.9210μs | 40.1268 KOps/s | 40.6596 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.5746ms | 0.1315ms | 7.6058 KOps/s | 7.6093 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.4406ms | 24.0815μs | 41.5256 KOps/s | 48.2875 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.5237ms | 0.1314ms | 7.6098 KOps/s | 7.5709 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.4159ms | 20.7054μs | 48.2966 KOps/s | 48.7321 KOps/s | |
test_mod_add[eager] | 0.4518ms | 31.1907μs | 32.0608 KOps/s | 30.6893 KOps/s | |
test_mod_add[compile] | 0.5023ms | 69.9057μs | 14.3050 KOps/s | 14.2486 KOps/s | |
test_mod_add[compile-overhead] | 0.2612ms | 0.1354ms | 7.3838 KOps/s | 6.8628 KOps/s | |
test_mod_wrap[eager] | 0.6687ms | 0.2346ms | 4.2618 KOps/s | 3.9115 KOps/s | |
test_mod_wrap[compile] | 0.5669ms | 0.2907ms | 3.4403 KOps/s | 3.3696 KOps/s | |
test_mod_wrap[compile-overhead] | 7.4444ms | 4.0440ms | 247.2802 Ops/s | 249.2020 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5185ms | 1.3405ms | 746.0110 Ops/s | 688.2731 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5269ms | 1.3111ms | 762.7236 Ops/s | 698.7072 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3703ms | 0.9104ms | 1.0984 KOps/s | 991.2717 Ops/s | |
test_seq_add[eager] | 0.2585ms | 94.5135μs | 10.5805 KOps/s | 10.1731 KOps/s | |
test_seq_add[compile] | 0.2527ms | 81.8677μs | 12.2148 KOps/s | 12.6091 KOps/s | |
test_seq_add[compile-overhead] | 0.2661ms | 0.1154ms | 8.6630 KOps/s | 8.7997 KOps/s | |
test_seq_wrap[eager] | 0.5371ms | 0.3615ms | 2.7664 KOps/s | 2.5825 KOps/s | |
test_seq_wrap[compile] | 0.5111ms | 0.3106ms | 3.2193 KOps/s | 3.1849 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3944ms | 0.2188ms | 4.5704 KOps/s | 4.5848 KOps/s | |
test_func_call_runtime[False-eager] | 0.8889ms | 0.7279ms | 1.3739 KOps/s | 1.3450 KOps/s | |
test_func_call_runtime[False-compile] | 0.9500ms | 0.7864ms | 1.2716 KOps/s | 1.2662 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4806ms | 0.3584ms | 2.7905 KOps/s | 2.7947 KOps/s | |
test_func_call_runtime[True-eager] | 1.0571ms | 0.8908ms | 1.1226 KOps/s | 1.1078 KOps/s | |
test_func_call_runtime[True-compile] | 0.9822ms | 0.8266ms | 1.2098 KOps/s | 1.1931 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5343ms | 0.3912ms | 2.5560 KOps/s | 2.5578 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8669ms | 0.7181ms | 1.3926 KOps/s | 1.2806 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.0370ms | 0.7855ms | 1.2731 KOps/s | 1.2558 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5075ms | 0.3604ms | 2.7750 KOps/s | 2.7946 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1493ms | 0.9900ms | 1.0101 KOps/s | 1.0026 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.0163ms | 0.8513ms | 1.1747 KOps/s | 1.1698 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5630ms | 0.4159ms | 2.4043 KOps/s | 2.3860 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4852ms | 2.0439ms | 489.2607 Ops/s | 483.3691 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.0468ms | 0.8645ms | 1.1568 KOps/s | 1.1034 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5865ms | 0.4230ms | 2.3643 KOps/s | 2.3802 KOps/s | |
test_distributed | 2.8341ms | 0.1795ms | 5.5700 KOps/s | 8.6588 KOps/s | |
test_tdmodule | 24.5810μs | 13.6059μs | 73.4977 KOps/s | 58.6015 KOps/s | |
test_tdmodule_dispatch | 0.1248ms | 27.6294μs | 36.1933 KOps/s | 30.8807 KOps/s | |
test_tdseq | 33.6210μs | 14.4020μs | 69.4347 KOps/s | 56.6757 KOps/s | |
test_tdseq_dispatch | 60.1510μs | 29.3054μs | 34.1234 KOps/s | 28.0713 KOps/s | |
test_instantiation_functorch | 1.9682ms | 1.8045ms | 554.1852 Ops/s | 538.2507 Ops/s | |
test_instantiation_td | 1.7655ms | 1.1747ms | 851.2808 Ops/s | 842.6896 Ops/s | |
test_exec_functorch | 0.3555ms | 0.2040ms | 4.9018 KOps/s | 4.8092 KOps/s | |
test_exec_functional_call | 0.3685ms | 0.2062ms | 4.8499 KOps/s | 4.7966 KOps/s | |
test_exec_td | 0.3606ms | 0.2076ms | 4.8180 KOps/s | 4.6778 KOps/s | |
test_exec_td_decorator | 0.9459ms | 0.2490ms | 4.0153 KOps/s | 3.9052 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8854ms | 0.7031ms | 1.4223 KOps/s | 1.4558 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8288ms | 0.6677ms | 1.4976 KOps/s | 1.4561 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7480ms | 0.5869ms | 1.7037 KOps/s | 1.7522 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7894ms | 0.5873ms | 1.7027 KOps/s | 1.7395 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8600ms | 0.6673ms | 1.4987 KOps/s | 1.4899 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9755ms | 0.6784ms | 1.4741 KOps/s | 1.4865 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7845ms | 0.5906ms | 1.6931 KOps/s | 1.6597 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7427ms | 0.5811ms | 1.7210 KOps/s | 1.7076 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.4595ms | 8.2950ms | 120.5545 Ops/s | 119.1847 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.4254ms | 8.2614ms | 121.0454 Ops/s | 119.2929 Ops/s | |
test_vmap_transformer_speed[False-True] | 8.5172ms | 8.0722ms | 123.8824 Ops/s | 122.2765 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.3251ms | 8.0825ms | 123.7241 Ops/s | 122.5102 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.1908ms | 19.3874ms | 51.5799 Ops/s | 51.2024 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.1058ms | 19.3591ms | 51.6554 Ops/s | 51.0241 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.3343ms | 19.2786ms | 51.8710 Ops/s | 51.5653 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.3764ms | 19.2352ms | 51.9881 Ops/s | 51.5128 Ops/s | |
test_to_module_speed[True] | 1.9631ms | 0.9293ms | 1.0761 KOps/s | 1.0757 KOps/s | |
test_to_module_speed[False] | 1.0036ms | 0.9047ms | 1.1053 KOps/s | 1.1009 KOps/s | |
test_tc_init | 66.5610μs | 31.2642μs | 31.9855 KOps/s | 26.9122 KOps/s | |
test_tc_init_nested | 86.8310μs | 60.7593μs | 16.4584 KOps/s | 13.0997 KOps/s | |
test_tc_first_layer_tensor | 26.7406μs | 0.6749μs | 1.4817 MOps/s | 1.4624 MOps/s | |
test_tc_first_layer_nontensor | 0.1730ms | 2.2214μs | 450.1657 KOps/s | 448.4904 KOps/s | |
test_tc_second_layer_tensor | 30.2955μs | 1.3502μs | 740.6051 KOps/s | 729.1927 KOps/s | |
test_tc_second_layer_nontensor | 0.2054ms | 2.9010μs | 344.7057 KOps/s | 341.5801 KOps/s | |
test_unbind | 0.1992s | 11.2007ms | 89.2798 Ops/s | 91.2149 Ops/s | |
test_full_like | 0.7587ms | 0.5733ms | 1.7444 KOps/s | 1.7396 KOps/s | |
test_zeros_like | 0.3518ms | 0.1983ms | 5.0417 KOps/s | 5.0410 KOps/s | |
test_ones_like | 0.3474ms | 0.1981ms | 5.0479 KOps/s | 5.0466 KOps/s | |
test_clone | 0.5717ms | 0.4151ms | 2.4092 KOps/s | 2.4157 KOps/s | |
test_squeeze | 33.7310μs | 9.8503μs | 101.5197 KOps/s | 102.6478 KOps/s | |
test_unsqueeze | 0.2877ms | 72.3038μs | 13.8305 KOps/s | 13.5723 KOps/s | |
test_split | 0.2579ms | 0.1583ms | 6.3182 KOps/s | 6.4760 KOps/s | |
test_permute | 0.3097ms | 0.1749ms | 5.7186 KOps/s | 5.7779 KOps/s | |
test_stack | 1.3490ms | 0.8491ms | 1.1777 KOps/s | 1.1698 KOps/s | |
test_cat | 1.3595ms | 1.2322ms | 811.5864 Ops/s | 811.5487 Ops/s |
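The Change column is empty in the table above, so the relative difference between "Ops" (this PR) and "Ops on Repo HEAD" has to be computed by hand. The snippet below is a minimal sketch of one way to do that, not part of the benchmark tooling. The helper names `parse_ops` and `relative_change` are hypothetical, and it assumes a positive value means the PR run was faster than HEAD. Units have to be normalized first, since some rows mix `KOps/s` and `Ops/s`.

```python
import re

# Multipliers for the throughput units that appear in the table.
_UNIT = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '1.0984 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized throughput cell: {cell!r}")
    return float(match.group(1)) * _UNIT[match.group(2)]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, (pr - head) / head) for one pipe-separated row."""
    cells = [c.strip() for c in row.split("|")]
    # Column order: Name | Max | Mean | Ops | Ops on Repo HEAD | Change
    name, pr_ops, head_ops = cells[0], parse_ops(cells[3]), parse_ops(cells[4])
    return name, (pr_ops - head_ops) / head_ops


# Rows copied verbatim from the table above.
rows = [
    "test_mod_wrap_and_backward[compile-overhead] | 1.3703ms | 0.9104ms | 1.0984 KOps/s | 991.2717 Ops/s | |",
    "test_distributed | 2.8341ms | 0.1795ms | 5.5700 KOps/s | 8.6588 KOps/s | |",
    "test_tdmodule | 24.5810μs | 13.6059μs | 73.4977 KOps/s | 58.6015 KOps/s | |",
]

for row in rows:
    name, change = relative_change(row)
    print(f"{name}: {change:+.1%}")
```

Single-run differences of a few percent are within the noise these benchmarks typically show, so only the larger deltas are worth reading into.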
vmoens added a commit that referenced this pull request Sep 9, 2024
ghstack-source-id: 6589677e9635efb3bcffb7e639b69e346008b09a Pull Request resolved: #954
vmoens added a commit that referenced this pull request Sep 16, 2024
ghstack-source-id: 0cd65696a91d83674212ca9a62dce02d1cabf44d Pull Request resolved: #954
Labels: CLA Signed, Performance
Stack from ghstack (oldest at bottom):