Common Imports Fix and Readme Update to fix RuntimeError in trainer.fit() #216

scorixear · 2023-03-24T10:03:09Z

Running the example code with the current package produces the following errors:

cannot import name 'DeepSpeedPlugin' from 'pytorch_lightning.plugins' - aitextgen.py line 14
cannot import name 'ProgressBarBase' from 'pytorch_lightning.callbacks.progress' - train.py line 13
cannot import name '_TPU_AVAILABLE' from 'pytorch_lightning.utilities' - train.py line 14 - fixed in "update pytorch-lightning requirement to >= 1.8.0" (#202)
RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase. - aitextgen.py line 752
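One way to tolerate renamed imports across library versions is to try several locations in order. The sketch below is only an illustration of that pattern, not what this PR does (the PR changes the import statements directly); `import_first` is a hypothetical helper name.

```python
import importlib


def import_first(candidates):
    """Return the first attribute that can be imported from (module, name) pairs.

    `candidates` is an iterable of (module_name, attribute_name) tuples,
    tried in order. Hypothetical helper, not part of aitextgen.
    """
    failures = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed and try the next one.
            failures.append(f"{module_name}.{attr_name}: {exc}")
    raise ImportError(
        "none of the candidates could be imported: " + "; ".join(failures)
    )
```

For the progress bar, the candidates would presumably be `("pytorch_lightning.callbacks.progress", "ProgressBarBase")` followed by `("pytorch_lightning.callbacks.progress", "ProgressBar")`; whether the two classes are interchangeable for aitextgen is exactly the open question in this PR.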

The RuntimeError suggests wrapping the user code in a main function, as hinted here: https://discuss.pytorch.org/t/runtimeerror-an-attempt-has-been-made-to-start-a-new-process-before-the-current-process-has-finished-its-bootstrapping-phase/145462

But I cannot confirm whether this fixes the issue, as the current code does not progress at all (this might also be because ProgressBar is not the correct replacement for ProgressBarBase).
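A minimal sketch of the main-guard pattern, using only the standard library. It assumes the error comes from worker processes being started with the "spawn" method, which re-imports the launching script; `square` and `main` are placeholder names, and in an aitextgen script the call to `ai.train(...)` would sit where the pool is used here.

```python
import multiprocessing as mp


def square(x: int) -> int:
    """Work done in a child process; must be defined at module level."""
    return x * x


def main(values=(1, 2, 3)) -> list:
    # "spawn" starts fresh interpreters that re-import this module,
    # so nothing that launches processes may run at import time.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, list(values))


if __name__ == "__main__":
    # Only the original process enters this branch; re-imported copies skip it.
    print(main())
```

Without the guard, each spawned child would execute the top-level code again and try to start its own workers, which is the situation the RuntimeError message describes.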

Would love to have your input on whether these changes actually work!

scorixear · 2023-03-24T11:18:08Z

After around 1 hour of training, the program finished correctly, although the progress bar seems to be broken.

vjarora1978 · 2023-03-25T16:15:28Z

Getting this error while executing the example

scorixear · 2023-03-25T16:24:07Z

Getting this error while executing the example

Yes, I get the same error. I will investigate what's up.

scorixear · 2023-03-25T16:46:39Z

Getting this error while executing the example

Seems like ProgressBarBase contained the "loss" tensor in version 1.8.6, but it got removed in ProgressBar in version 2.0.0 (the latest release of PyTorch Lightning).

I replaced the metrics lookup with the loss value from outputs. This doesn't affect the training code at all; it only concerns how the progress bar displays the current and average loss.
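A rough sketch of reading the loss from the batch outputs instead of the metrics dict. It assumes `outputs` is either a dict with a "loss" entry or the loss value itself, and that the average is an exponential moving average; `LossTracker` and its `smoothing` parameter are hypothetical names, not the code in train.py.

```python
class LossTracker:
    """Tracks the current and smoothed average loss for display purposes."""

    def __init__(self, smoothing: float = 0.01):
        self.smoothing = smoothing  # weight given to the newest loss value
        self.average = None

    def update(self, outputs):
        # Accept either {"loss": value} or a bare loss value.
        loss = outputs["loss"] if isinstance(outputs, dict) else outputs
        # Tensors expose .item(); plain numbers are converted directly.
        current = float(loss.item()) if hasattr(loss, "item") else float(loss)
        if self.average is None:
            self.average = current
        else:
            self.average = (
                self.smoothing * current + (1 - self.smoothing) * self.average
            )
        return current, self.average
```

A progress bar callback could call `update(outputs)` once per training batch and print the returned pair, without touching the training step itself.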

fictionFanKazuki · 2023-05-02T14:55:54Z

This is a really helpful pull request, thanks a lot! However, I still get an error about the kwarg "gpus" being unknown in pytorch_lightning's argparse.py. "gpus" seems to be passed to the Trainer object in train.py. Could you help?

TypeError                                 Traceback (most recent call last)
<ipython-input-11-341925ca7a1c> in <cell line: 1>()
----> 1 ai.train(file_name,
      2          line_by_line=False,
      3          from_cache=False,
      4          num_steps=3000,
      5          generate_every=300,

1 frames
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/argparse.py in insert_env_defaults(self, *args, **kwargs)
     67 
     68         # all args were already moved to kwargs
---> 69         return fn(self, **kwargs)
     70 
     71     return cast(_T, insert_env_defaults)

TypeError: Trainer.__init__() got an unexpected keyword argument 'gpus'

scorixear · 2023-05-07T21:01:04Z

@fictionFanKazuki

Hm, not sure how to reproduce this.
I have changed the "gpus" argument to "num_nodes" in my latest commit. Maybe you aren't using the latest one?
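For reference, PyTorch Lightning 2.x selects GPUs through the `accelerator` and `devices` arguments of `Trainer`, while `num_nodes` counts machines. The sketch below shows that alternative mapping as a plain dict transformation; it is not what the commit does, and `modernize_trainer_kwargs` is a hypothetical helper name.

```python
def modernize_trainer_kwargs(kwargs: dict) -> dict:
    """Return a copy of Trainer kwargs without the removed `gpus` argument."""
    updated = dict(kwargs)
    gpus = updated.pop("gpus", None)
    if gpus:
        # A non-zero value (for example 1, 2 or -1) requested GPU training.
        updated.setdefault("accelerator", "gpu")
        updated.setdefault("devices", gpus)
    # gpus=0 or gpus=None meant "no GPU", so nothing is added in that case.
    return updated
```

The result could then be passed on as `Trainer(**modernize_trainer_kwargs(params))`, assuming the remaining keys are still accepted by the installed version.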

Otherwise, there is probably a newer version of pytorch_lightning with more breaking changes. But I would need to know which version you have installed and/or the full stack trace, as I cannot decipher where the utilities function was called from.
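A small sketch for collecting the installed versions to paste into a report, using only the standard library. The default package list is an assumption about what is relevant here, and `installed_version` and `version_report` are hypothetical helper names.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_report(packages=("pytorch-lightning", "torch", "aitextgen")):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in version_report().items():
        print(f"{name}: {version}")
```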

On my machine, with pytorch_lightning 2.0.0, it works. I will push a restricted requirements.txt shortly.

Vectorrent · 2023-07-03T02:18:45Z

Thanks for this! I merged these fixes into my custom fork of AITextGen, and it allowed me to upgrade to PL v2.0.4 successfully!

scorixear added 2 commits March 24, 2023 10:23

changed imports for DeepSpeedPlugin and ProgressBarBase

c7d5a66

changed pytroch_lightning trainer gpus arg to num_nodes; updated readme to reflect training wrap in __main__

9ce07e9

mpsparrow mentioned this pull request Mar 25, 2023

Error while using the Google collab : cannot import name 'DeepSpeedPlugin' from 'pytorch_lightning.plugins' #215

Open

fixed metrics not containing loss, reverted to outputs

959accc

scorixear mentioned this pull request Mar 25, 2023

ImportError: cannot import name 'ProgressBarBase' from 'pytorch_lightning.callbacks.progress' #214

Open

scorixear mentioned this pull request Apr 12, 2023

still getting the same error. #218

Open

scorixear added 2 commits May 7, 2023 23:06

restricted requirements to be two-version exact

f0223b9

fixed wrong fire version and setup.py

4032109

obviyus reviewed May 15, 2023

reverted to 0.5.0 fire version

86e4be7

Vectorrent mentioned this pull request Jul 3, 2023

Change 'ProgressBarBase' to 'ProgressBar' so it works on the (currently) latest PyTorch Lightning version #227

Open

13rac1 mentioned this pull request Oct 6, 2023

getting error while importing aitextgen #233

Closed
