Muxing Support for Model Routing #566

Closed
lukehinds opened this issue Jan 13, 2025 · 11 comments

Comments

@lukehinds
Contributor

lukehinds commented Jan 13, 2025

Introduce support for "Muxing" in CodeGate, enabling users to route specific types of logic to different large language models (LLMs) based on file type, individual file, and workspace. This would provide cost savings (a common complaint from users is the expense of tokens and how to optimise those costs), along with a way of isolating use to specific models (based on privacy, security, and protection of IP).

It would also enable context switching (the context window is the amount of text a language model can consider at one time when generating responses), whereby we could switch models during a long, protracted prompt/response session.

For example:

  • Per Workspace: A user can set an LLM choice for the entire workspace.
  • Per File Type: Documentation files (e.g., .md, .rst) could be processed by a free, local model, while source code files (e.g., .py, .js) are handled by an advanced model like Claude Sonnet 3.5.
  • Per File: Users could specify models for particular files within a project based on complexity or other criteria.

Why is this feature important?

  1. Granular Control: Developers can tune model usage based on file type or project needs.
  2. Resource Efficiency: Allocate lightweight or local models to simpler tasks, while reserving advanced models for more complex challenges, providing more cost efficiency.
  3. Workspace Integration: Aligns with the workflow of managing multiple workspaces.

NOTE: As always, start small and simple, and validate. The following acts as a guideline for where this could lead.

Possible Solution

Expand CodeGate’s functionality to include:

  1. Repository-Level Configuration:

    • Enable users to set model preferences for an entire workspace.
    • Example: Repository A -> local_model, Repository B -> advanced_model.
  2. File-Type-Based Routing:

    • Route tasks based on file extensions or MIME types.
    • Example: .md -> local_model, .py -> claude_sonnet_3.5.
  3. Per-File Routing:

    • Allow specific files in a workspace to be linked to a designated model.
    • Example: config.py -> local_model_v2.
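The three levels above could be combined with a simple precedence rule. The following is a minimal sketch only, assuming the most specific rule wins (per-file, then file type, then the workspace default); the `MuxRules` class, its field names, and the precedence order are illustrative assumptions, not CodeGate's actual schema or behaviour. Model names are taken from the examples above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class MuxRules:
    """Hypothetical routing table for one workspace (not CodeGate's real schema)."""

    default_model: str                                            # workspace-level choice
    by_extension: dict[str, str] = field(default_factory=dict)   # e.g. ".md" -> model
    by_file: dict[str, str] = field(default_factory=dict)        # e.g. "config.py" -> model

    def resolve(self, filename: str | None) -> str:
        """Return the model for a file; the most specific matching rule wins."""
        if filename is None:
            # No file information available: fall back to the workspace default.
            return self.default_model
        path = PurePosixPath(filename)
        # Per-file rules are matched on the base name in this sketch.
        if path.name in self.by_file:
            return self.by_file[path.name]
        # Otherwise try the file extension, then the workspace default.
        return self.by_extension.get(path.suffix.lower(), self.default_model)


rules = MuxRules(
    default_model="advanced_model",
    by_extension={".md": "local_model", ".py": "claude_sonnet_3.5"},
    by_file={"config.py": "local_model_v2"},
)

print(rules.resolve("docs/README.md"))
print(rules.resolve("src/config.py"))
```

Keeping the lookup as plain dictionary accesses keeps routing cheap per request, which matters for the latency concern raised below.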

Challenges and Considerations

  • Extensibility: It's likely that a lot of features will sit on top of muxing, and some synergies will be mapped to pipelines.
  • Latency and Performance: Optimize routing logic to prevent delays in large or multi-repository setups.
  • User Experience: Provide clear documentation and possibly a UI component for managing these configurations.
@lukehinds
Contributor Author

Moving this to ready, as I think we can at least start prototyping and thinking about this, albeit we should land #454 first to really make it useful.

@jtenniswood

Filter by file: can we support wildcards (e.g. doc*.txt), as well as specific file names?
Would we also want to know the directory structure too?

@jtenniswood

Would we want to specify a version of a model, or something broader (e.g. claude_sonnet vs claude_sonnet_3.5)?
What would happen if 3.5 wasn't available?

@jtenniswood

Would users select their repo, or just type it in (e.g. can we see a list, or is it just manual)?

@jtenniswood

Would this be on a global level or per workspace?

@jhrozek
Contributor

jhrozek commented Jan 18, 2025

Filter by file: can we support wildcards (e.g. doc*.txt), as well as specific file names?

I think it would be nice, I can envision being OK to send tests/database.sql but not prod/database.sql (or anything under prod)

Would we also want to know the directory structure too?

We would want to, but I don't think it is easy. Tagging @aponcedeleonch and @JAORMX, who were looking into this. I suspect it's not fully possible until codegate is hooked into the IDE, but I don't know 100%.
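As a rough illustration of the wildcard idea, here is a minimal sketch using Python's standard `fnmatch` module. The rule list, the `route` helper, and the convention that `None` means "do not send this file" are all hypothetical. It also assumes the relative path of the file is known, which, as noted above, may not be possible until codegate is hooked into the IDE.

```python
from fnmatch import fnmatchcase

# Hypothetical rules, checked in order; the first matching pattern wins.
# A model of None means "do not send this file to any model".
RULES = [
    ("prod/*", None),                # anything under prod/ is never sent
    ("tests/*.sql", "local_model"),  # test fixtures may go to a local model
    ("doc*.txt", "local_model"),     # wildcard on the file name
]


def route(path, default="advanced_model"):
    """Return the model for a relative path, or None if it must not be sent."""
    for pattern, model in RULES:
        # fnmatchcase is case-sensitive; its "*" also matches "/" characters,
        # so "prod/*" covers nested directories as well.
        if fnmatchcase(path, pattern):
            return model
    return default


print(route("tests/database.sql"))
print(route("prod/database.sql"))
```

Ordering the deny rule first means a broad block such as `prod/*` cannot be overridden by a later, more permissive pattern.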

@jhrozek
Contributor

jhrozek commented Jan 18, 2025

(Similar to the other reply, this is my 2 öre, not a canonical reply)

Would we want to specify a version of a model, or something broader (e.g. claude_sonnet vs claude_sonnet_3.5)?

This is a question about model aliases, right?

What would happen if 3.5 wasn't available?

I think we could start with just being specific - in other words, just require the full model name to be typed. Many providers have aliases anyway, e.g. Anthropic supports claude-3-5-sonnet-latest. OpenAI has the same. The "major versions" are usually different enough that I would let the user type the version.

But I can also see that codegate having some logic to pick the best model might be good, e.g. "for simple requests, use anthropic haiku, for more complex changes use anthropic sonnet, for very complex tasks use anthropic opus".
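A very rough sketch of what such tiering logic might look like follows. Everything here is an assumption for illustration: the tier names are placeholders rather than real model identifiers, and prompt length is only a crude stand-in for "complexity".

```python
# Placeholder tier names, not real model identifiers.
# Each entry is (maximum prompt length in characters, model tier).
TIERS = [
    (500, "anthropic-haiku"),    # simple requests
    (4000, "anthropic-sonnet"),  # more complex changes
]
LARGEST = "anthropic-opus"       # very complex tasks


def pick_model(prompt: str) -> str:
    """Pick the smallest tier whose length limit covers the prompt."""
    size = len(prompt)
    for limit, model in TIERS:
        if size <= limit:
            return model
    return LARGEST


print(pick_model("Rename this variable."))
```

A real heuristic would likely need more signal than length (for example, the number of files touched), but the shape of the lookup would stay the same.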

@JAORMX
Contributor

JAORMX commented Jan 18, 2025

We would want to, but I don't think it is easy. Tagging @aponcedeleonch and @JAORMX, who were looking into this. I suspect it's not fully possible until codegate is hooked into the IDE, but I don't know 100%.

That's right, codegate can't know unless it's actually hooked into the IDE, which it isn't.

@jtenniswood

Image

@aponcedeleonch
Contributor

The work on this issue has finished. The next enhancement is tracked in #1059
