How do you specify model with CLI interface? #201

Open
rmc135 opened this issue Dec 17, 2022 · 0 comments

I can't find any documentation of the flags for the CLI, in particular how to specify the model.

I looked through aitextgen.py for some hints. Using "--tf_gpt2" kind of works:

aitextgen generate --tf_gpt2=124M --prompt "I believe in unicorns because" --to_file False

The model is downloaded, but then execution bombs out with the error:

ValueError: The following `model_kwargs` are not used by the model: ['tf_gpt2'] (note: typos in the generate arguments will also show up in this list)

My guess is that aitextgen recognises the flag (for example, --tf_gpt2=355M downloads the 355M model as expected, so it's not just using a default), but then passes all CLI parameters unchanged to transformers, which raises an error on the unknown argument.

Is the fix as simple as having the CLI remove aitextgen-specific kwargs entries such as tf_gpt2 before calling transformers' generate(), or does more need to be done?

I'm not a Python programmer, so apologies for the clunky understanding. Thank you.


Edit: adding a line that removes 'tf_gpt2' from kwargs just before the call to self.model.generate seems to have done the trick, but I'm unsure whether this is the best approach.

        # Drop the aitextgen-only flag so it is not forwarded to transformers' generate().
        kwargs.pop('tf_gpt2', None)
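
A more general form of the workaround above might separate every wrapper-only key from the arguments forwarded to generation, instead of deleting one key by hand. The following is only a sketch under assumptions: `WRAPPER_ONLY_KEYS`, `split_kwargs`, and `fake_generate` are hypothetical names made up for illustration, not part of aitextgen or transformers, and `fake_generate` merely imitates the strict keyword validation seen in the error message.

```python
# Hypothetical set of keys meant for the wrapper only; 'tf_gpt2' is the one
# from this issue. None of these names come from aitextgen itself.
WRAPPER_ONLY_KEYS = {"tf_gpt2"}


def split_kwargs(kwargs, wrapper_only=WRAPPER_ONLY_KEYS):
    """Return (wrapper_kwargs, generate_kwargs) without mutating the input."""
    wrapper_kwargs = {k: v for k, v in kwargs.items() if k in wrapper_only}
    generate_kwargs = {k: v for k, v in kwargs.items() if k not in wrapper_only}
    return wrapper_kwargs, generate_kwargs


def fake_generate(**kwargs):
    """Stand-in for a strict generate(): rejects any key it does not know."""
    known = {"max_length", "temperature"}
    unused = [k for k in kwargs if k not in known]
    if unused:
        raise ValueError(
            f"The following `model_kwargs` are not used by the model: {unused}"
        )
    return "ok"


# Arguments as they might arrive from the command line (illustrative values).
cli_kwargs = {"tf_gpt2": "124M", "max_length": 50}

wrapper_kwargs, generate_kwargs = split_kwargs(cli_kwargs)
print(wrapper_kwargs)                    # keys consumed by the wrapper
print(generate_kwargs)                   # keys safe to forward
print(fake_generate(**generate_kwargs))  # no ValueError once filtered
```

Returning two new dicts rather than mutating `kwargs` in place keeps the model-selection value available to the wrapper while ensuring it never reaches the generation call.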