Better user experience for llamacpp-server #10012
PierreCarceller started this conversation in Ideas
Replies: 1 comment · 5 replies
Hello!
I'm a llamacpp-server user, in particular of the OpenAI-compatible APIs. In a dream world, I would like something a bit like what you can do with vLLM.
I can think of two alternative solutions that are a little less practical but easier to set up.

Solution 1: Create a /message-format endpoint that would apply the chat template to the list of messages sent, then use the /completion endpoint that already exists (a rough sketch of this flow follows).
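A minimal client-side sketch of what Solution 1 could look like. The proposed /message-format endpoint does not exist today, so its request and response shape (an OpenAI-style messages array in, a rendered prompt field out) are assumptions; the /completion call with prompt, n_predict, and the content response field is the existing llama-server API.

```python
import requests

BASE_URL = "http://localhost:8080"  # default llama-server address

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# Step 1: ask the proposed (not yet existing) /message-format endpoint to
# apply the model's chat template to the message list.
fmt = requests.post(f"{BASE_URL}/message-format", json={"messages": messages})
fmt.raise_for_status()
prompt = fmt.json()["prompt"]  # hypothetical response field

# Step 2: send the rendered prompt to the existing /completion endpoint.
out = requests.post(f"{BASE_URL}/completion", json={"prompt": prompt, "n_predict": 128})
out.raise_for_status()
print(out.json()["content"])
```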
Solution 2: Let the client format the message list on its own (with Jinja, for example) before using the /completion endpoint. But in this case, the server must provide access to the information needed to do the job on the client side (the BOS token, the EOS token, etc.); a client-side sketch of this follows.
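And a minimal sketch of Solution 2, assuming the client already knows the model's chat template and special tokens. They are hard-coded here as a ChatML-style Jinja template and an <|im_end|> stop token; those hard-coded pieces are exactly the information the server would need to expose for this to work with any model.

```python
import requests
from jinja2 import Template

BASE_URL = "http://localhost:8080"  # default llama-server address

# Hand-written ChatML-style template; in practice the template and the
# special tokens are what the server would need to expose to the client.
CHAT_TEMPLATE = Template(
    "{% for m in messages %}"
    "<|im_start|>{{ m.role }}\n{{ m.content }}<|im_end|>\n"
    "{% endfor %}"
    "<|im_start|>assistant\n"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# Render the prompt on the client side...
prompt = CHAT_TEMPLATE.render(messages=messages)

# ...and send it to the existing /completion endpoint.
resp = requests.post(
    f"{BASE_URL}/completion",
    json={"prompt": prompt, "n_predict": 128, "stop": ["<|im_end|>"]},
)
resp.raise_for_status()
print(resp.json()["content"])
```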
I hope I haven't missed any important information.

Reply:
Let me know if you need more help or if you have a specific example that you would like demonstrated.