
Add a base path for ollama #25

Open
oppenheimer- opened this issue Oct 14, 2024 · 3 comments

Comments

@oppenheimer-

oppenheimer- commented Oct 14, 2024

Would it be possible to add a base path for Ollama, please?

Maybe similar to Smart Second Brain or other plugins that use Ollama.


The route "/api/tags" delivers all the models needed to populate the list.

Originally posted by @oppenheimer- in #5 (comment)


The full API docs are here:
Ollama API docs

  1. The GET /api/tags endpoint provides basic information, e.g.:
{
   "models": [
      {
         "name": "qwen2.5-coder:7b-instruct",
         "model": "qwen2.5-coder:7b-instruct",
         "modified_at": "2024-10-08T08:59:00+02:00",
         "size": 4683087590,
         "digest": "87098ba7390d43e0f8d615776bc7c4372c9e568c436bc1933f93832f9cf09b84",
         "details": {
            "parent_model": "",
            "format": "gguf",
            "family": "qwen2",
            "families": [
               "qwen2"
            ],
            "parameter_size": "7.6B",
            "quantization_level": "Q4_K_M"
         }
      }
   ]
}
  2. A subsequent POST request, curl http://localhost:11434/api/show -d '{ "name": "llama3.2" }', will reveal the required information (a sketch using both endpoints follows below):
{
  "modelfile": "# Modelfile generated by \"ollama show\"\n# To build a new Modelfile based on this one, replace the FROM line with:\n# FROM llava:latest\n\nFROM /Users/matt/.ollama/models/blobs/sha256:200765e1283640ffbd013184bf496e261032fa75b99498a9613be4e94d63ad52\nTEMPLATE \"\"\"{{ .System }}\nUSER: {{ .Prompt }}\nASSISTANT: \"\"\"\nPARAMETER num_ctx 4096\nPARAMETER stop \"\u003c/s\u003e\"\nPARAMETER stop \"USER:\"\nPARAMETER stop \"ASSISTANT:\"",
  "parameters": "num_keep                       24\nstop                           \"<|start_header_id|>\"\nstop                           \"<|end_header_id|>\"\nstop                           \"<|eot_id|>\"",
  "template": "{{ if .System }}<|start_header_id|>system<|end_header_id|>\n\n{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>\n\n{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ .Response }}<|eot_id|>",
  "details": {
    "parent_model": "",
    "format": "gguf",
    "family": "llama",
    "families": [
      "llama"
    ],
    "parameter_size": "8.0B",
    "quantization_level": "Q4_0"
  },
  "model_info": {
    "general.architecture": "llama",
    "general.file_type": 2,
    "general.parameter_count": 8030261248,
    "general.quantization_version": 2,
    "llama.attention.head_count": 32,
    "llama.attention.head_count_kv": 8,
    "llama.attention.layer_norm_rms_epsilon": 0.00001,
    "llama.block_count": 32,
    "llama.context_length": 8192, // this is what you were looking for
    "llama.embedding_length": 4096,
    "llama.feed_forward_length": 14336,
    "llama.rope.dimension_count": 128,
    "llama.rope.freq_base": 500000,
    "llama.vocab_size": 128256,
    "tokenizer.ggml.bos_token_id": 128000,
    "tokenizer.ggml.eos_token_id": 128009,
    "tokenizer.ggml.merges": [],            // populates if `verbose=true`
    "tokenizer.ggml.model": "gpt2",
    "tokenizer.ggml.pre": "llama-bpe",
    "tokenizer.ggml.token_type": [],        // populates if `verbose=true`
    "tokenizer.ggml.tokens": []             // populates if `verbose=true`
  }
}

If I'm not mistaken, the context_length can be looked up using the model family name (e.g. llama.context_length).
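For reference, a minimal TypeScript sketch of both calls. The `baseUrl` parameter is the hypothetical base path being requested here (not an existing Caret setting), and the response shapes are the ones shown above:

// Sketch only: `baseUrl` stands in for the requested (hypothetical) base path setting.
interface OllamaTagsResponse {
  models: Array<{
    name: string;
    model: string;
    details: { family: string; parameter_size: string; quantization_level: string };
  }>;
}

// GET {baseUrl}/api/tags — list the locally available models.
async function listOllamaModels(baseUrl = "http://localhost:11434"): Promise<string[]> {
  const res = await fetch(`${baseUrl}/api/tags`);
  if (!res.ok) throw new Error(`GET /api/tags failed: ${res.status}`);
  const body = (await res.json()) as OllamaTagsResponse;
  return body.models.map((m) => m.name);
}

// POST {baseUrl}/api/show — read the context length for a given model.
// The key is prefixed with the architecture (e.g. "llama.context_length"),
// so it is looked up via "general.architecture", falling back to details.family.
async function getContextLength(model: string, baseUrl = "http://localhost:11434"): Promise<number | undefined> {
  const res = await fetch(`${baseUrl}/api/show`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ name: model }),
  });
  if (!res.ok) throw new Error(`POST /api/show failed: ${res.status}`);
  const body = (await res.json()) as {
    details: { family: string };
    model_info: Record<string, unknown>;
  };
  const arch = (body.model_info["general.architecture"] as string) ?? body.details.family;
  return body.model_info[`${arch}.context_length`] as number | undefined;
}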

@jcollingj
Owner

Howdy @oppenheimer-

I totally get the request here. The holdup is that there is some functionality in Caret that can only be done with models that support function calling. Unfortunately, I don't think the API here provides that information.

I think the easiest solution would be to allow you to import the models and then have a UI for marking if they support function calling or not.

No current ETA on when I'll be able to get around to that. Going to leave this open for now.
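For illustration, the import flow could store something like the following. Field names are hypothetical, not existing Caret settings:

// Hypothetical shape for an imported model entry: the Ollama API does not
// report function-calling support, so the user toggles it in the settings UI.
interface ImportedOllamaModel {
  name: string;                      // e.g. "qwen2.5-coder:7b-instruct"
  provider: "ollama";
  contextWindow?: number;            // from /api/show, if available
  supportsFunctionCalling: boolean;  // defaults to false; user adjusts after import
}

function importOllamaModel(name: string, contextWindow?: number): ImportedOllamaModel {
  return { name, provider: "ollama", contextWindow, supportsFunctionCalling: false };
}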

@oppenheimer-
Author

Maybe there's a way to implement a 'preflight' check with Ollama when users save options or select a model, which would verify compatibility. However, I recommend a simpler approach: let users learn through trial and error which models work for their setup. This avoids unnecessary complexity in the implementation. Then give simple advice on known-working models (Phi 4, Llama 3.2/3.3, etc.).
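If the preflight route were ever taken, one rough sketch could look like this. It assumes Ollama's /api/chat rejects a `tools` payload for models without tool support; this behavior is not verified against every Ollama version:

// Rough sketch of a preflight capability probe. Assumption: /api/chat returns
// a non-OK response when a model that lacks tool support is sent a `tools` array.
async function supportsFunctionCalling(model: string, baseUrl = "http://localhost:11434"): Promise<boolean> {
  const res = await fetch(`${baseUrl}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model,
      stream: false,
      messages: [{ role: "user", content: "ping" }],
      tools: [{
        type: "function",
        function: { name: "noop", description: "no-op probe", parameters: { type: "object", properties: {} } },
      }],
    }),
  });
  return res.ok; // non-OK suggests the model rejected the tools payload
}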

@jcollingj
Owner

I'm not particularly keen on users having to trial-and-error their way to figuring out which models work for which functionality in Caret. I think an import flow threads the needle best.

Caret can import most of the data from Ollama or other providers, and then the user just has to adjust a couple of settings for each model they want to import.

I might take a look at this over the next few weeks. But this would also be a great first issue if anyone from the community wants to take a crack at it.
