Repositories list
149 repositories
Llama-3.2-3B-Instruct
Public
Mistral-7B-Instruct-v0.3
Public
- Deploying a GPT-Neo model with Dynamic Batching, where concurrent requests are grouped into a single forward pass on the fly to improve GPU utilization. <metadata> collections: ["Dynamic Batching","HF Transformers"] </metadata>
- Deploy TinyLlama-1.1B in GGUF format with vLLM, using vLLM's optimized backend to serve the quantized model efficiently, accelerating prompt processing and reducing memory usage (a minimal vLLM sketch follows this group). <metadata> gpu: A100 | collections: ["Using NFS Volumes", "vLLM"] </metadata>
- Open-source chatbot fine-tuned from LLaMA on 70K ShareGPT conversations, optimized for research and conversational tasks. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
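The TinyLlama entry above pairs a GGUF-quantized checkpoint with vLLM's offline engine. Below is a minimal sketch of that pattern, not the template's actual configuration: the local .gguf path and the tokenizer repo are assumptions, following the usual vLLM convention of loading quantized weights from the GGUF file while taking the tokenizer from the original model repo.

```python
# Minimal sketch: serving a GGUF-quantized TinyLlama with vLLM's offline engine.
# The .gguf path and tokenizer repo are assumptions for illustration only.
from vllm import LLM, SamplingParams

llm = LLM(
    model="./tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf",   # local GGUF file (assumed path)
    tokenizer="TinyLlama/TinyLlama-1.1B-Chat-v1.0",   # tokenizer from the original repo
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Summarize what GGUF quantization does."], params)
print(outputs[0].outputs[0].text)
```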
Zephyr-7B-Streaming
Public template
Zephyr-7B model with Server-Sent Events (SSE) enables real-time streaming, delivering efficient, interactive updates for chat-based applications. <metadata> gpu: A100 | collections: ["Streaming LLMs", "SSE Events"] </metadata>
whisper-large-v3
Public template
Flagship speech recognition model for English, delivering state-of-the-art transcription accuracy across diverse audio scenarios. <metadata> gpu: T4 | collections: ["CTranslate2"] </metadata>
- A small language model delivering robust text generation, reasoning, and instruction following with efficient long-context comprehension. <metadata> gpu: T4 | collections: ["vLLM","Batch Input Processing"] </metadata>
- Bark text-to-speech with Server-Sent Events (SSE) streams real-time audio, providing interactive and efficient updates for chat-based applications (a minimal SSE streaming sketch follows this group). <metadata> gpu: A100 | collections: ["SSE Events"] </metadata>
- A transformer-based text-to-audio model that generates highly realistic, multilingual speech as well as other audio, including music, background noise, and simple sound effects. <metadata> gpu: T4 | collections: ["HF Transformers"] </metadata>
- Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted to the Hugging Face Transformers format.
- A State-Of-The-Art coder LLM, tailored for instruction-based tasks, particularly in code generation, reasoning, and repair. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
- A state-of-the-art model that segments and labels audio recordings by accurately distinguishing different speakers. <metadata> gpu: T4 | collections: ["HF Transformers"] </metadata>
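The Zephyr and Bark entries above both stream output to clients over Server-Sent Events. Below is a minimal, framework-level sketch of an SSE endpoint, assuming FastAPI and a placeholder generate_chunks() generator standing in for real model output; it only illustrates the `data: ...` event framing, not the templates' actual handlers.

```python
# Minimal SSE streaming sketch with FastAPI. generate_chunks() is a placeholder
# standing in for real model output (tokens for an LLM, audio chunks for TTS).
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

def generate_chunks(prompt: str):
    # Placeholder: a real handler would yield pieces of the model's response here.
    for piece in ("Streaming ", "responses ", "arrive ", "chunk ", "by ", "chunk."):
        yield piece

@app.get("/stream")
def stream(prompt: str):
    def event_stream():
        for chunk in generate_chunks(prompt):
            # SSE frames each event as a "data: ..." line followed by a blank line.
            yield f"data: {chunk}\n\n"
        yield "data: [DONE]\n\n"
    return StreamingResponse(event_stream(), media_type="text/event-stream")
```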
Llama-2-7b-hf
Public template
A 7B-parameter model fine-tuned for dialogue with supervised learning and RLHF, supporting a 4k-token context length. <metadata> gpu: A10 | collections: ["HF Transformers"] </metadata>
- A 7B instruction-tuned language model that excels at following detailed prompts and performing a wide variety of natural language processing tasks. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
DreamShaper
Public template
A ControlNet model designed for Stable Diffusion, providing brightness adjustment for colorizing or recoloring images. <metadata> gpu: T4 | collections: ["Diffusers"] </metadata>
- A mixture-of-experts, instruction-tuned variant of Phi-3.5, delivering efficient, context-aware responses across diverse language tasks. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
- A GPTQ-quantized version of a 13B fine-tuned model, optimized for dialogue use cases (a minimal GPTQ loading sketch follows this group). <metadata> gpu: T4 | collections: ["HF Transformers","GPTQ"] </metadata>
- A 7B autoregressive language model by Mistral AI, optimized for efficient text generation and robust reasoning. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
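The GPTQ entry above serves a quantized checkpoint through Hugging Face Transformers. Below is a minimal loading sketch, assuming the optimum/auto-gptq integration is installed and using TheBloke/Llama-2-13B-chat-GPTQ as an example checkpoint; the template's actual model id may differ.

```python
# Minimal sketch: loading a GPTQ-quantized checkpoint with Transformers.
# Assumes the optimum + auto-gptq (or gptqmodel) integration is installed;
# the model id is an example, not necessarily the template's checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Llama-2-13B-chat-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Explain GPTQ quantization in one sentence.", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```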
RealVisXL_V4.0_Lightning
Public template
A lightweight, accelerated variant of RealVisXL V4.0, engineered for real-time, high-quality image generation with enhanced efficiency. <metadata> gpu: T4 | collections: ["Diffusers"] </metadata>
SDXL-Lightning
Public template
A lightning-fast text-to-image generation model that generates high-quality 1024px images in a few steps. <metadata> gpu: T4 | collections: ["Diffusers"] </metadata>
- A text-to-image model by Stability AI, renowned for generating high-quality, diverse images from text prompts (a minimal Diffusers sketch appears at the end of this list). <metadata> gpu: T4 | collections: ["Diffusers"] </metadata>
Vicuna-13b-8k
Public template
A GPTQ-quantized, 13-billion-parameter uncensored language model with an extended 8K context window, designed for dynamic, high-performance conversational tasks. <metadata> gpu: T4 | collections: ["GPTQ"] </metadata>
Codellama-34B
Public template
A 34B-parameter, Python-specialized model for advanced code synthesis, completion, and understanding. <metadata> gpu: A100 | collections: ["AWQ"] </metadata>
- A 34B-parameter, Python-specialized model for advanced code synthesis, deployed with vLLM for efficient inference. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
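Several entries above (the ControlNet brightness model, RealVisXL, SDXL-Lightning, and the Stability AI text-to-image model) ship through the Diffusers collection. Below is a minimal text-to-image sketch with Diffusers, using the public SDXL base checkpoint as an assumed stand-in; the Lightning variants additionally load distilled weights and few-step schedulers, which this sketch does not show.

```python
# Minimal Diffusers text-to-image sketch. The checkpoint is an assumed stand-in;
# Lightning-style templates would also load distilled weights and a few-step scheduler.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "a watercolor painting of a lighthouse at dawn",
    num_inference_steps=30,
).images[0]
image.save("lighthouse.png")
```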