The image builds correctly and runs, but when I try your example command I get the following message in the logs:
[2023-07-14 13:53:18,408][saicinpainting.training.trainers.base][INFO] - BaseInpaintingTrainingModule init done
[2023-07-14 13:53:18,627][__main__][CRITICAL] - Prediction failed due to Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 803: system has unsupported display driver / cuda driver combination:
Traceback (most recent call last):
  File "bin/predict.py", line 59, in main
    model.to(device)
  File "/opt/conda/envs/object-removal/lib/python3.8/site-packages/pytorch_lightning/core/decorators.py", line 89, in inner_fn
    module = fn(self, *args, **kwargs)
  File "/opt/conda/envs/object-removal/lib/python3.8/site-packages/pytorch_lightning/utilities/device_dtype_mixin.py", line 120, in to
    return super().to(*args, **kwargs)
  File "/opt/conda/envs/object-removal/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1145, in to
    return self._apply(convert)
  File "/opt/conda/envs/object-removal/lib/python3.8/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/opt/conda/envs/object-removal/lib/python3.8/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/opt/conda/envs/object-removal/lib/python3.8/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/opt/conda/envs/object-removal/lib/python3.8/site-packages/torch/nn/modules/module.py", line 820, in _apply
    param_applied = fn(param)
  File "/opt/conda/envs/object-removal/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
  File "/opt/conda/envs/object-removal/lib/python3.8/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 803: system has unsupported display driver / cuda driver combination
Because of this, the following error occurs:
FileNotFoundError: [Errno 2] No such file or directory: '/app/object-removal/experiments/real/001/data/../lama_depth_output_real/000_mask001.png'
JAX also fails to find a GPU:
W0714 13:53:32.249409 140354252236608 xla_bridge.py:363] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
I have an RTX 4090. The driver and CUDA versions reported inside the Docker container are: Driver Version: 535.54.03, CUDA Version: 11.8
Could you please look into it? I also tried a CUDA 12.0 container as the base image. That resolves the PyTorch error, but not the JAX one: JAX still does not find the GPU.
Thank you
NVIDIA's nightly JAX containers are available here: https://github.com/NVIDIA/JAX-Toolbox with open Dockerfiles. I'd recommend starting from a base image here and adding PyTorch and other libs.
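To confirm that a given base image fixes both problems, it may help to probe each framework separately inside the container before running the full pipeline. The snippet below is a minimal sketch and not part of the repository: the function names `probe_torch` and `probe_jax` are hypothetical, and the set of platform strings checked for JAX is an assumption that may vary between JAX versions. It only reports what each framework sees and never raises, so it can run even when one of the two is broken or missing.

```python
import importlib


def probe_torch():
    """Report whether PyTorch can initialise CUDA, without raising."""
    result = {"framework": "torch", "gpu": False, "detail": ""}
    try:
        torch = importlib.import_module("torch")
    except Exception as exc:  # missing package or broken shared libraries
        result["detail"] = f"import failed: {exc}"
        return result
    try:
        # Force the lazy CUDA initialisation that failed in the traceback above.
        torch.cuda.init()
        count = torch.cuda.device_count()
        result["gpu"] = count > 0
        result["detail"] = f"{count} device(s), built for CUDA {torch.version.cuda}"
    except Exception as exc:  # e.g. Error 803, or a CPU-only build
        result["detail"] = str(exc)
    return result


def probe_jax():
    """Report which devices JAX sees, without raising."""
    result = {"framework": "jax", "gpu": False, "detail": ""}
    try:
        jax = importlib.import_module("jax")
    except Exception as exc:
        result["detail"] = f"import failed: {exc}"
        return result
    try:
        devices = jax.devices()
        # Assumption: GPU devices report one of these platform names.
        result["gpu"] = any(d.platform in ("gpu", "cuda", "rocm") for d in devices)
        result["detail"] = f"backend={jax.default_backend()}, devices={devices}"
    except Exception as exc:
        result["detail"] = str(exc)
    return result


if __name__ == "__main__":
    for report in (probe_torch(), probe_jax()):
        status = "GPU found" if report["gpu"] else "no GPU"
        print(f"{report['framework']}: {status} ({report['detail']})")
```

If PyTorch reports a GPU but JAX does not, the likely suspect is the installed `jaxlib` build rather than the driver, since the driver is evidently reachable from inside the container.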
Hello, thank you for your amazing work! I wanted to try it and followed the Docker instructions you provided here:
https://github.com/nianticlabs/nerf-object-removal/blob/main/docker/README.md