Windows wheel build error #6465
Comments
Hi @divi0001 - it looks like you are using Windows, is that correct? If so, we recommend using the pre-built wheel that we provide on PyPI, so you can install via just |
Yes, it is Windows 10. When I use pip install deepspeed, it's basically the same:
× python setup.py egg_info did not run successfully.
note: This error originates from a subprocess, and is likely not a problem with pip.
× Encountered error while generating package metadata.
note: This is an issue with the package mentioned above, not pip. |
And can you confirm that, when you build from source, you set all of the following to 0?
|
DS_BUILD_GDS, those are set to 0, but only via cmd with set DS_NAME_GOES_HERE=0 |
Yes, that is correct, but you can have it all run in a single file if you run it with build_win.bat. |
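The flag handling discussed above can be sketched in Python. This is only an illustration, not the contents of build_win.bat: the flag names are the ones mentioned in this thread, the build command is the `python setup.py bdist_wheel` call quoted in the issue, and `build_env` and `build_wheel` are hypothetical helper names.

```python
import os
import subprocess
import sys

# Flag names taken from this thread; the list in build_win.bat may differ.
DISABLED_OPS = (
    "DS_BUILD_GDS",
    "DS_BUILD_FP_QUANTIZER",
    "DS_BUILD_EVOFORMER_ATTN",
    "DS_BUILD_AIO",
)


def build_env(base=None):
    """Return a copy of the environment with each listed op flag set to "0"."""
    env = dict(os.environ if base is None else base)
    for flag in DISABLED_OPS:
        env[flag] = "0"
    return env


def build_wheel(source_dir):
    """Run the wheel build in source_dir with the flags applied.

    Defined for illustration only; nothing in this sketch calls it.
    """
    return subprocess.run(
        [sys.executable, "setup.py", "bdist_wheel"],
        cwd=source_dir,
        env=build_env(),
        check=False,
    )
```

Passing the flags through `env=` keeps them scoped to the build process, which has the same effect as running `set DS_NAME_GOES_HERE=0` for each flag in the same cmd session before building.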
alright. I'm going to look for prebuilt wheel then, the one you linked didn't work (wrong platform). Thanks for the help :) |
What platform are you using? I believe it should just be the python version? Pip should pick it up if you are using python 3.11 |
python 3.10.0 |
Can you try with python 3.11 since that's what the whl is built with? |
worked with 3.11 thank you! Any chance 3.10 will be supported with this architecture in the future? |
Possibly. We would need to determine how many different configurations we will support, but we haven't decided on that yet. |
I am using python 3.11.9 on windows 10, |
@Kas1o - what cuda version? Can you also confirm the DeepSpeed version you are downloading/specifying? Please try 0.15.0, or download the whl listed here and pip install it? |
Hi @loadams Re DeepSpeed Windows wheel file. I'm on: Python Version : 3.11.10
When you manually download the version you provided from your link and install that with However, when a python script imports deepspeed, you do get the 2x following errors:
Just so you have a clear understanding of my OS: OPERATING SYSTEM: So not sure if these errors are problems in the v15 pip package? Also it may be handy to have the link to the updated whl files and manual install instructions within the readme for Windows DeepSpeed https://github.com/microsoft/DeepSpeed/blob/master/blogs/windows/08-2024/README.md Hope that helps you internally figure what needs changing or feed back here/on the readme about what we may need to do (or if we should be ignoring those errors). Thanks |
In addition to my above post, I assume the whl file is built only for Pytorch 2.3 as despite installing fine, when I use it, I do get the following error:
Though the whl being built only for PyTorch 2.3 isn't clearly stated in the readme https://github.com/microsoft/DeepSpeed/blob/master/blogs/windows/08-2024/README.md Also, it may be handy to have the front-page GitHub instructions for Windows link to that readme. https://github.com/microsoft/DeepSpeed?tab=readme-ov-file#windows Thanks |
Hi @erew123 - yes that is correct, I can update the readme with that clearer information. We also need to publish the 0.15.1 whl for Windows since that may be part of the issue as well. To make sure I understand the state of your install, you're able to install the specific whl file now, but you're getting issues with the RuntimeError on the python version, and you're having issues with those .lib files? Are you able to share the output of ds_report from that install? |
Hi @loadams Thanks for the reply! I've managed to get the ds_report remotely from my machine, though I am now away travelling for a while, so I will have limited access to my machine and will respond when I can. Just to remind you, the Torch in this machine's Windows 11/Python environment is 2.2.1, and the wheel (and the ds_report at the bottom) says the wheel is compiled for Torch 2.3 and CUDA 12.1. I know from the number of DeepSpeed wheels I've compiled over the last year that you have to compile for the specific version of Torch and CUDA... but I don't know if the
messages are because of it being a different torch version. My full machine specs are in the above post and my ds_report is below.
Thanks |
are there any wheels for 3.10? i have everything torch cuda c++ anything but still fails
|
Hi @FurkanGozukara - there are no wheels published for 3.10 yet, the latest whl we have published is 3.11. But your installation is still picking the linux whl, you would need python 3.11 and to specify this DeepSpeed version to get the whl here: https://pypi.org/project/deepspeed/0.15.0/#files |
how do i make it get windows file? can you elaborate more a pip install command perhaps? |
@loadams i need to install on python 3.10 on windows any help appreciated thank you can't you release a whl for python 3.10? for windows? |
To use the whl file I linked you would need python 3.11 - but if your system is Windows and has python 3.11, it should just be:
This should look at what your system supports and grab that whl. If you want to try building from source, from that commit, but with python 3.10, that should work as well with the |
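As a rough illustration of why pip falls back to a different file on Python 3.10, the check below tests the two conditions named in this thread: a Python 3.11 interpreter and a Windows platform. It is a simplified sketch, not pip's actual tag-matching logic, and `matches_windows_wheel` is a hypothetical helper name.

```python
import sys


def matches_windows_wheel(version_info=None, platform=None):
    """Return True when the interpreter is Python 3.11 running on Windows."""
    version_info = sys.version_info if version_info is None else version_info
    platform = sys.platform if platform is None else platform
    # sys.platform is "win32" on Windows, including 64-bit builds.
    return tuple(version_info[:2]) == (3, 11) and platform.startswith("win")


if __name__ == "__main__":
    print("Published Windows wheel should apply:", matches_windows_wheel())
```

Running it with the interpreter you intend to install into shows whether that interpreter meets both conditions.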
@loadams can you please publish python 3.10 wheel i dont know how to make |
@FurkanGozukara There are manual build instructions here https://github.com/erew123/deepspeedpatcher?tab=readme-ov-file#manual-builds-of-deepspeed-0150-and-later These were written by me, not Microsoft. I provide no support/assistance on this or the process, and I doubt Microsoft will provide you support on my instructions. However, they will guide you through a manual build, or you can try the tool there. |
@FurkanGozukara - we need to make some changes to our publishing pipeline to support this, we hope to do that soon but have to balance all the work items. We are aware of this and working on having that published. Thanks @erew123! |
@loadams we are looking forward to python 3.10 |
Microsoft Windows [Version 10.0.22631.4541]
C:\DeepSpeed>build_win.bat |
I did huge testing but not working yet :( |
@FurkanGozukara As per my manual build instructions, I state you need to build from the Visual Studio Developer console. Looking above you have just run a standard command prompt. It should look like this before you start a manual build. |
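One way to tell the two kinds of prompt apart before starting a build is sketched below. It rests on two assumptions that are not stated in this thread: that the Visual Studio Developer console sets the `VSCMD_VER` environment variable, and that the MSVC compiler `cl` is on PATH there. Both function names are hypothetical.

```python
import os
import shutil


def in_vs_developer_prompt(environ=None):
    """Heuristic: VSCMD_VER is assumed to be set by the Developer console."""
    environ = os.environ if environ is None else environ
    return "VSCMD_VER" in environ


def compiler_on_path():
    """Return True when an executable named cl can be found on PATH."""
    return shutil.which("cl") is not None


if __name__ == "__main__":
    print("Developer console detected:", in_vs_developer_prompt())
    print("cl found on PATH:", compiler_on_path())
```

If both lines print False, the session is most likely a standard command prompt.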
@erew123 thank you so much i just did as you instructed running all commands below in single administrator cmd session here my pip freeze - i have cuda paths set
|
after i fixed the setup py - which was written only for Linux by Microsoft :) i got a compile fix is like below
still compiling i hope it works. how can i verify it is working? |
it failed after working huge time i would break my teeth if it was successful :/
entire compile logs 3200 lines |
@FurkanGozukara That may be an issue of build_win.bat changing these settings:
So you may want to make sure those are set in build_win.bat. Beyond that, it may be something to do with PyTorch being 2.5 or CUDA being 12.4. That I really don't know, and MS would have to provide you assistance on that. |
@erew123 ty for reply again bat file is like this do you recommend anything else? if i compile it on older pytorch like 2.4 do you think it would work on 2.5? i mean the wheel they provide for python 11 doesnt have certain pytorch version or cuda version : and this is working with pytorch 2.5 and cuda 12.4 i tested - with python 3.11 |
@FurkanGozukara I'm honestly not a DeepSpeed expert by any means... but I did get it working/compiling on Windows a year or so ago and dug through so many unclear, misunderstood and incorrect instructions that, when I finally got it working, I decided to share my knowledge, which you will find in this post: #4729
I have since had a need to build more versions, and I have distilled my knowledge here https://github.com/erew123/deepspeedpatcher for everything I believe to be correct, but I don't consider myself an expert. I do believe something has changed in the build routine from DeepSpeed 0.15.x onwards, but I have not delved into what those changes are. I don't have time to do that. Please see a post on another project of mine that will give you an insight into why I don't have time to investigate things like this: erew123/alltalk_tts#377.
I will say I believe that from version 0.15.x onwards you MUST have the Nvidia Toolkit installed, as it appears to require access to
As for your question about whether you can build for one version and use it on another, the answer is no, you cannot. Why it is built this way I don't know, but you have to precisely match all requirements of the compiled wheel with the Python/PyTorch/CUDA environment it was built on. I explain this here https://github.com/erew123/deepspeedpatcher?tab=readme-ov-file#important-version-compatibility-information |
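The matching requirement described above can be sketched as a comparison between the versions a wheel was built for and the versions installed. Comparing at major.minor granularity is an assumption made for this illustration, the CUDA value in the example environment is made up, and `major_minor` and `mismatches` are hypothetical helper names.

```python
def major_minor(version):
    """Reduce a version string such as "2.2.1" to "2.2"."""
    return ".".join(version.split(".")[:2])


def mismatches(built_for, installed):
    """List components whose major.minor version differs from the wheel's build."""
    return [
        name
        for name, wanted in built_for.items()
        if major_minor(installed.get(name, "")) != major_minor(wanted)
    ]


# Wheel values as reported earlier in this thread (Torch 2.3, CUDA 12.1).
wheel = {"python": "3.11", "torch": "2.3", "cuda": "12.1"}
# Python and Torch as reported in the thread; the CUDA value is illustrative.
environment = {"python": "3.11.10", "torch": "2.2.1", "cuda": "12.1"}

print(mismatches(wheel, environment))  # ['torch']
```

An empty list means no mismatch was found at this granularity. It does not prove that the wheel will load.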
I am getting errors while building deepspeed wheel, i set a whole bunch of options to 0 in cmd before since they were also throwing errors it seems, listing them: DS_BUILD_GDS, DS_BUILD_FP_QUANTIZER, DS_BUILD_EVOFORMER_ATTN, DS_BUILD_AIO
Installing dskernels via pip resulted in
pip install dskernels
ERROR: Could not find a version that satisfies the requirement dskernels (from versions: none)
ERROR: No matching distribution found for dskernels
The previous python setup.py bdist_wheel resulted in:
[2024-08-30 13:42:30,907] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-08-30 13:42:31,331] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
test.c
LINK : fatal error LNK1181: cannot open input file 'aio.lib'
test.c
LINK : fatal error LNK1181: cannot open input file 'cufile.lib'
...\DeepSpeed\deepspeed\runtime\zero\linear.py:49: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
  def forward(ctx, input, weight, bias=None):
...\DeepSpeed\deepspeed\runtime\zero\linear.py:67: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
  def backward(ctx, grad_output):
W0830 13:42:33.709243 10348 torch\distributed\elastic\multiprocessing\redirects.py:28] NOTE: Redirects are currently not supported in Windows or MacOs.
DS_BUILD_OPS=1
test.c
LINK : fatal error LNK1181: cannot open input file 'aio.lib'
test.c
LINK : fatal error LNK1181: cannot open input file 'cufile.lib'
[WARNING] Filtered compute capabilities ['6.0', '6.1', '7.0']
Traceback (most recent call last):
  File "...\DeepSpeed\setup.py", line 197, in <module>
    ext_modules.append(builder.builder())
  File "...\DeepSpeed\op_builder\builder.py", line 719, in builder
    extra_link_args=self.strip_empty_entries(self.extra_ldflags()))
  File "...\DeepSpeed\op_builder\inference_cutlass_builder.py", line 74, in extra_ldflags
    import dskernels
ModuleNotFoundError: No module named 'dskernels'
For my specs/installed requirements, i have a relatively new intel CPU, RTX 4070, CUDA v.11.1 and VS22 C++ Desktop installed
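The traceback above stops at `import dskernels`. A small pre-build check like the sketch below can report such a missing module before a long compile starts. It uses only the standard library, the module list is an assumption based on this one traceback, and `missing_modules` is a hypothetical helper name.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "dskernels" is the module named in the traceback above.
    print("Missing before build:", missing_modules(["dskernels"]))
```

This only reports whether a module is importable. It does not install anything or change which ops are built.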