Install mkl for PyTorch and libjpeg/libpng for TorchVision #8248
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8248
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit c5dcd2d with merge base b1d76c9.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Looks like you've got conda library problems, but this is clearly a PR I should accept, contingent on you seeing the llava-runner job improve.
Let me look more into that conda issue. This is rather unexpected; I have done a rebase, so the failure is legit.
Just FYI, the linking error is from this hack: https://github.com/pytorch/executorch/blob/main/.ci/docker/common/install_conda.sh#L46-L58
The run time is around 1h20m now (https://github.com/pytorch/executorch/actions/runs/13229090225/job/36927181503) vs. 2h30m before (https://github.com/pytorch/executorch/actions/runs/13213708877/job/36890417988).
eval_llama-mmlu seems to have gone from 1.9h to under 9 minutes: https://hud.pytorch.org/hud/pytorch/executorch/main/1?per_page=50&name_filter=mmlu&mergeLF=true
#8173 raised these timeouts. Now that #8248 has landed to fix #8180, we should be able to lower them again. (I'm sending this early so I don't forget; double-check the llava-runner running time.)

ghstack-source-id: cb4c1691907b8bb46a504a2d8cbc00d12b1ef4a4
ghstack-comment-id: 2648474106
Pull Request resolved: #8339
While debugging the build issue on #8322 w.r.t. mkl, I uncovered a complex interaction between #8322, #8248 (which installs mkl), and https://github.com/pytorch/pytorch/blob/main/cmake/public/mkl.cmake from PyTorch. The error is as follows:

```
CMake Error at /opt/conda/envs/py_3.10/lib/cmake/mkl/MKLConfig.cmake:744 (add_library):  <-- This file comes from conda mkl
  add_library cannot create imported target "MKL::MKL" because another target
  with the same name already exists.
Call Stack (most recent call first):
  /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/share/cmake/Caffe2/public/mkl.cmake:1 (find_package)  <-- this is from PyTorch
  /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:106 (include)
  /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  CMakeLists.txt:753 (find_package)
```

The conclusion is that, with mkl installed, there should be only one `find_package(Torch)` call, because the mkl target is defined globally while the `torch` target is only defined locally. So, this change adds an `if(NOT TARGET torch)` check to call `find_package(Torch)` only when needed.

### Testing

The change on top of #8322 looks like this: f705b01, https://github.com/pytorch/executorch/actions/runs/13278590926?pr=8399
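For illustration, the guard described above might look roughly like the sketch below. The exact surrounding CMakeLists.txt context and the `CONFIG REQUIRED` arguments are assumptions; only the `if(NOT TARGET torch)` idea comes from this comment.

```cmake
# TorchConfig.cmake transitively includes Caffe2/public/mkl.cmake, which runs
# MKLConfig.cmake from the conda mkl package and defines the GLOBAL imported
# target MKL::MKL. A second find_package(Torch) in the same build tree then
# fails with "another target with the same name already exists".
# The torch target itself is only defined in the scope that called
# find_package, so check for it before calling find_package(Torch) again.
if(NOT TARGET torch)
  find_package(Torch CONFIG REQUIRED)
endif()
```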
Fixes #8180
I should have caught earlier that we didn't have mkl here; without it, PyTorch performance on x86 CPU suffered.