
Colab tensor size mismatch issue #157

Open
jonesmo opened this issue Jan 17, 2023 · 6 comments
Labels
previous version: concerns a previous version of RAVE; not high-priority

Comments


jonesmo commented Jan 17, 2023

Training RAVE for the first time in Colab. The cells all run successfully through resampling, but when I launch the training step, I get the following error: RuntimeError: The size of tensor a (129) must match the size of tensor b (133) at non-singleton dimension 2

I've pointed it to a directory with a total of 8.25 hours of varied-length, 16k audio files, and the parameters I'm using are sampling_rate=16000, multiband_number=16, n_signal=65538, size=default, prior=32.

Here's the full stack trace:

Sanity Checking: 0it [00:00, ?it/s]/content/miniconda/lib/python3.9/site-packages/torch/utils/data/dataloader.py:487: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
Sanity Checking DataLoader 0:   0% 0/2 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/content/drive/MyDrive/RAVE_COLLAB/train_rave.py", line 175, in <module>
    trainer.fit(model, train, val, ckpt_path=run)
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 770, in fit
    self._call_and_handle_interrupt(
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 723, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 811, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1236, in _run
    results = self._run_stage()
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1323, in _run_stage
    return self._run_train()
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1345, in _run_train
    self._run_sanity_check()
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1413, in _run_sanity_check
    val_loop.run()
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 154, in advance
    dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 128, in advance
    output = self._evaluation_step(**kwargs)
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 226, in _evaluation_step
    output = self.trainer._call_strategy_hook("validation_step", *kwargs.values())
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1765, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/content/miniconda/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 344, in validation_step
    return self.model.validation_step(*args, **kwargs)
  File "/content/drive/MyDrive/RAVE_COLLAB/rave/model.py", line 700, in validation_step
    distance = self.distance(x, y)
  File "/content/drive/MyDrive/RAVE_COLLAB/rave/model.py", line 513, in distance
    lin = sum(list(map(self.lin_distance, x, y)))
  File "/content/drive/MyDrive/RAVE_COLLAB/rave/model.py", line 503, in lin_distance
    return torch.norm(x - y) / torch.norm(x)
RuntimeError: The size of tensor a (129) must match the size of tensor b (133) at non-singleton dimension 2

Thanks for any insight or help!
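
For context, the failure above reduces to the two tensors handed to `lin_distance` having different lengths along dimension 2, so the elementwise `x - y` cannot broadcast. A minimal reproduction of just that error (the shapes are illustrative, taken from the error message, not the actual RAVE tensors):

```python
import torch

# Two 3-D tensors whose last (non-singleton) dimension disagrees,
# mimicking the 129-vs-133 mismatch from the traceback.
x = torch.randn(1, 1, 129)
y = torch.randn(1, 1, 133)

torch.norm(x - y)  # RuntimeError: The size of tensor a (129) must match the size of tensor b (133) ...
```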

@gandolfxu

I'm running into the same problem. Have you resolved it?


jonesmo commented Mar 8, 2023

> I'm running into the same problem. Have you resolved it?

Unfortunately, no.

@domkirke added the previous version (concerns a previous version of RAVE; not high-priority) label on Dec 18, 2023
@arjunbahuguna

Getting the same issue on the latest version.

@berkut0

berkut0 commented May 3, 2024

Same problem when training on a local machine.

@patrickgates

patrickgates commented May 4, 2024

+1 Same on version 2.3.1

@gwendal-lv

I had the same issue with the latest version and solved it by modifying rave/model.py (see #309).
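
I haven't checked what #309 actually changes, but the general shape of such a fix is to align the two tensors before the norm is taken. A minimal sketch, assuming the simplest approach of cropping both signals to their common length along the last dimension (the `lin_distance_cropped` helper below is hypothetical, not the code from #309):

```python
import torch

def lin_distance_cropped(x, y):
    # Crop both tensors to the shared length along the last dimension
    # so that x - y broadcasts, then compute the relative L2 distance
    # the same way rave/model.py's lin_distance does.
    n = min(x.shape[-1], y.shape[-1])
    x, y = x[..., :n], y[..., :n]
    return torch.norm(x - y) / torch.norm(x)
```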
