Skip to content

koboldcpp-1.69.1

Compare
Choose a tag to compare
@LostRuins LostRuins released this 01 Jul 06:44
· 970 commits to concedo since this release

koboldcpp-1.69.1

  • Fixed an issue when selecting ubatch, which should now correctly match the blasbatchsize
  • Added separator tokens when selecting multiple images with LLaVA. Unfortunately, the model still tends to get mixed up and confused when working with multiple images in the same request.
  • Added a set of premade Chat Completions adapters selectable in the GUI launcher (thanks @henk717) which provide an easy instruct templates for various models and formats, should you want to use third party OpenAI based (chat completion) frontends along with KoboldCpp. This can help you override the instruct format even if the frontend does not directly support it. For more information on --chatcompletionsadapter see the wiki.
  • Allow inserting an extra added forced positive or forced negative prompt for stable diffusion (set add_sd_prompt and add_sd_negative_prompt in a loaded adapter).
  • Switched over the KoboldCpp Colab to use precompiled linux binaries, it starts and run much faster now. The Huggingface Tiefighter Space example has also been updated likewise (thanks @henk717) . Lastly, added information about using KoboldCpp in RunPod at https://koboldai.org/runpodcpp/
  • Fixed some utf decode errors.
  • Added tensor split GUI launcher input field for Vulkan.
  • Merged fixes and improvements from upstream, including the improved mmq with int8 tensor core support and gemma 2 features have been merged.
  • Updated Kobold Lite chatnames stopper for instruct mode. Also, Kobold Lite can now fall back to an alternative API or endpoint URL if the connection fails, you may attempt to reconnect using the OpenAI API instead, or to use a different URL.

1.69.1 - Merged the fixes for gemma 2 and IQ mmvq

To use, download and run the koboldcpp.exe, which is a one-file pyinstaller.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you have a newer Nvidia GPU, you can use the CUDA 12 version koboldcpp_cu12.exe (much larger, slightly faster).
If you're using Linux, select the appropriate Linux binary file instead (not exe).
If you're using AMD, you can try koboldcpp_rocm at YellowRoseCx's fork here

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
and then once loaded, you can connect like this (or use the full koboldai client):
http://localhost:5001

For more information, be sure to run the program from command line with the --help flag.