Too many tokens #4
It's a limitation of the model (2048 tokens), not of the trial key. Length is indeed a limitation of this approach.

Note that the 6354 number in the message does not refer to your records; it refers to the number of tokens in the naming prompt. Here's an example of how that looks. Say we're naming these two clusters: Topically starts by naming cluster 0. To do that: Then it does the same with the rest of the clusters.

Two ways to get around the length limitation:
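To make the construction concrete, here is a minimal sketch of how a per-cluster naming prompt could be assembled; the GENERIC_PROMPT text, the helper name build_naming_prompt, and the sample records are illustrative assumptions, not the library's actual code:

```python
# Illustrative sketch only: the generic prompt text and helper name are
# assumptions, not copied from topically/prompts/prompts.py.

GENERIC_PROMPT = (
    "The following are samples of texts from a cluster, "
    "followed by a short name for the cluster.\n\n"
)

def build_naming_prompt(sample_docs):
    # The sampled records are concatenated onto the generic prompt, so every
    # record adds its own tokens to the prompt that is sent to the model.
    samples_block = "\n".join(f"- {doc}" for doc in sample_docs)
    return GENERIC_PROMPT + samples_block + "\n\nCluster name:"

# The 2048-token limit applies to this assembled prompt, not to the dataset size.
prompt = build_naming_prompt(["first sampled record ...", "second sampled record ..."])
print(prompt)
```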
Thanks for the detailed explanation. I understand the 2048-token limitation, but for some reason I am still getting the error even when I throttle the sample length down. Let me see if I'm understanding the flow correctly, for example:
So where
In other words, will I hit the 2048 limit when the length of the sampled records × the number of samples > 2048? If so, how many samples are concatenated? Hope this is clear.
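To make the arithmetic concrete, here is a rough back-of-the-envelope check; the token counts below are illustrative assumptions, and a real check would use the model's own tokenizer:

```python
# Rough token-budget check with assumed numbers; a real check would count
# tokens with the model's tokenizer.
MODEL_LIMIT = 2048

generic_prompt_tokens = 400    # assumed size of the fixed few-shot prompt
num_samples = 10               # records sampled per cluster (the default, per the reply below)
avg_tokens_per_record = 600    # assumed average length of one sampled record

prompt_tokens = generic_prompt_tokens + num_samples * avg_tokens_per_record
print(prompt_tokens, "tokens; over the limit:", prompt_tokens > MODEL_LIMIT)
```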
Clear! The samples are concatenated to the first (generic) prompt in this file: https://github.com/cohere-ai/sandbox-topically/blob/main/topically/prompts/prompts.py. It shows a few examples of the kinds of names clusters can get. I believe that with the new Command model we should be able to use shorter prompts and fit in more examples.
10 is the default value. You can change that by passing
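As a usage sketch (modeled on the repo's README pattern, so the exact call may differ), reducing the number of sampled documents could look like this; the num_sample_docs keyword is hypothetical and stands in for the parameter whose name is cut off above:

```python
# Sketch only: num_sample_docs is a hypothetical keyword standing in for the
# actual (cut-off) parameter that controls how many records are sampled per
# cluster. Fewer samples keeps the naming prompt under the 2048-token limit.
import pandas as pd
from topically import Topically

df = pd.DataFrame({
    "text": ["first document ...", "second document ...", "third document ..."],
    "cluster": [0, 0, 1],
})

app = Topically("YOUR_COHERE_API_KEY")
df["topic_name"], topic_names = app.name_topics(
    (df["text"], df["cluster"]), num_sample_docs=5
)
```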
Hi @jalammar,
With my dataset I'm getting
Is this because Cohere can't handle more than 2048 tokens, or is it a limitation of the freebie key I'm using? My toy data is 6354 records.
Thanks!