Since I struggled with the lack of clear and consistent documentation and therefore ran into a lot of issues when I trained a prior model with RAVE v2.3.1, I want to share my insights here.
My goal was to train a model for unconditional generation with nn~ in Max.
My setup is the following:
RAVE 2.3.1
Python 3.11.9 in a virtual conda environment
Windows machine with an RTX 3080
What did not work:
Training a v2 RAVE model and a v1 prior: no success, could not get it to work. Ran into multiple runtime errors.
Training a v2 RAVE model with msprior and using it with nn~: the training worked, but the Max example did not, and I had no clue about the different configurations and how they are supposed to be combined/used.
Training v1 RAVE and prior models with RAVE v1: could not even train the RAVE model (TypeError: fit() got an unexpected keyword argument 'ckpt_path'; I did not pass a ckpt path, by the way).
What finally worked with some modifications:
Training a v1 RAVE model with RAVE 2.3.1 and a v1 prior:
First I needed to explicitly pass the .gin config files with the --config option, since they were not found in the conda environment (see #303, #289, #259). At some point I manually copied the files to the expected locations.
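As an aside, a quick sanity check for this kind of problem is to print where pip actually installed the package; the .gin files are expected relative to that directory. This is a hypothetical helper, not part of RAVE, and it assumes the package name is rave with its configs next to the package's __init__.py:

```python
# Hypothetical helper (not part of RAVE): locate an installed package's
# directory, to see where pip placed it and where bundled data files
# such as .gin configs would be expected.
import importlib.util
import os

def find_package_dir(package_name):
    """Return the directory of an installed package, or None if absent."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)

# Example with a stdlib package; in practice you would pass "rave":
print(find_package_dir("json"))
```

Comparing that path with the location the error message complains about shows whether the files need to be copied, as I ended up doing.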
Here is the command that I used for training the rave model:
rave train --config /path/to/v1.gin --config default --db_path /path/to/preprocessed/files --name name --val_every 2500 --channels 1 --gpu 0
I used a similar command for the prior:
rave train_prior --config /path/to/prior_v1.gin --model /path/to/rave/run --db_path /path/to/preprocessed/files --name name --out_path /output/path --val_every 2500 --gpu 0
But again I ran into an error (see #281):
RuntimeError: please init Prior with either fidelity or latent_size keywords
Adding fidelity=0.95 to the prior config got rid of the error, and training finally worked.
But when I then wanted to export the combined models (as described in the README) with:
rave export --run /path/to/your/run --prior /path/to/your/prior --streaming
I got the following Python runtime error:
RuntimeError: output with shape [1, 256, 1] doesn't match the broadcast shape [4, 256, 1]
Since this happened when the export was testing against mc.nn~, which I did not plan to use anyway (__init__.py:100] Testing method prior with mc.nn~ API), I commented out the mc.nn~ testing in __init__.py in site-packages/nn_tilde. After that the prior export worked and I could use it with nn~.
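For context, that export failure is the generic in-place broadcasting rule: a result with a larger leading dimension cannot be written into a smaller destination buffer. A minimal reproduction of the same error class, using NumPy instead of PyTorch and made-up buffer names:

```python
# Illustrative only: NumPy stand-in for the PyTorch broadcast error
# raised while the exporter tested the prior against the mc.nn~ API.
import numpy as np

mono = np.zeros((1, 256, 1))   # e.g. a single-channel model output
multi = np.zeros((4, 256, 1))  # e.g. a 4-channel buffer for multichannel tests

# Writing a (4, 256, 1) result into a (1, 256, 1) destination fails,
# because the output operand cannot be broadcast up to the result shape.
try:
    np.add(multi, 0, out=mono)
except ValueError as e:
    print(e)
```

Which is why skipping the multichannel (mc.nn~) test path sidesteps the error for a mono model.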
I hope this can be helpful for someone!
It is really a pity that there is so much confusion and such a steep learning curve using RAVE (even as a developer). Looking at all the open issues and the despair it causes for people who want to use the codebase, I don't understand the lack of motivation to deal with this. Don't get me wrong, I am very happy and thankful that RAVE is open source and that this great work is available to the public, but still, I don't get why there is so little effort put into providing consistent documentation with working examples (see #278, #299, #300). Or at least being transparent about why it does not happen. I would not complain if this was a small project people are doing in their free time, but RAVE is a funded project at a renowned institution. Just my two cents after a lot of frustration...
Hi, thank you for your information. However, I'm still facing an error saying it is unable to open "default.gin", although I already specify the path to v1.gin. Here's my command: rave train --config rave/configs/v1.gin --config default --db_path dataset_new/ --name mymodel --val_every 2500 --channels 1 --gpu 0
I would appreciate it if you could provide some advice on this. Thank you!