The Phi models are the most capable and cost-effective Small Language Models (SLMs) available, outperforming models of the same size and the next size up across a variety of language, reasoning, coding, audio, vision, and math benchmarks. This release expands the selection of high-quality models, giving customers more practical choices for composing and building generative AI applications.
The Phi family started with Phi-1 for Python code generation, continued with Phi-1.5 and Phi-2 for text and chat completion, and then added Phi-3-mini/small/medium-instruct and Phi-3.5/4-mini-instruct. It has since expanded to Phi-3/3.5-vision for vision, Phi-4 for strong reasoning, Phi-3.5-MoE for mixture-of-experts (MoE), and now the fully multimodal Phi-4-multimodal. Because they are trained on high-quality datasets, these models can reach benchmark results comparable to those of models with far more parameters.
Model Card | Parameters | Coding | Text/Chat Completion | Advanced Reasoning | Vision | Audio | MoE |
---|---|---|---|---|---|---|---|
Phi-1 | 1.3B | YES | NO | NO | NO | NO | NO |
Phi-1.5 | 1.3B | YES | YES | NO | NO | NO | NO |
Phi-2 | 2.7B | YES | YES | NO | NO | NO | NO |
Phi-3-mini-4k-instruct / Phi-3-mini-128k-instruct | 3.8B | YES | YES | NO | NO | NO | NO |
Phi-3-small-8k-instruct / Phi-3-small-128k-instruct | 7B | YES | YES | NO | NO | NO | NO |
Phi-3-medium-4k-instruct / Phi-3-medium-128k-instruct | 14B | YES | YES | NO | NO | NO | NO |
Phi-3-vision-instruct | 4.2B | YES | YES | NO | YES | NO | NO |
Phi-3.5-mini-instruct | 3.8B | YES | YES | NO | NO | NO | NO |
Phi-3.5-MoE-instruct | 42B | YES | YES | NO | NO | NO | YES |
Phi-3.5-vision-128k-instruct | 4.2B | YES | YES | NO | YES | NO | NO |
Phi-4 | 14B | YES | YES | YES | NO | NO | NO |
Phi-4-mini | 3.8B | YES | YES | YES | NO | NO | NO |
Phi-4-multimodal | 5.6B | YES | YES | YES | YES | YES | NO |

Customer Need | Task | Start with | More Details |
---|---|---|---|
Need a model that simply summarizes a thread of messages | Conversation summarization | Phi-3 / 3.5 text model | The deciding factor is that the customer has a well-defined, straightforward language task |
A free math tutor app for kids | Math and reasoning | Phi-3 / 3.5 / 4 text models | Because the app is free, the customer wants a solution that does not incur recurring costs |
Self-patrol car camera | Vision analysis | Phi-3 / 3.5-vision or Phi-4-multimodal | Needs a solution that can run on the edge without an internet connection |
Wants to build an AI-based travel booking agent | Complex planning, function calling, and orchestration | GPT models | Needs the ability to plan, call APIs to gather information, and execute |
Wants to build a copilot for their employees | RAG, multiple domains, complex and open-ended | GPT models + Phi family | An open-ended scenario needs broader world knowledge, so a larger model is better suited. If the knowledge content is chunked for retrieval, an SLM may be sufficient for parts of the workload |
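
As a rough illustration of how the capability table above can guide model choice, the following Python sketch encodes a few of its rows and returns the models that cover a set of required capabilities, smallest first. The `PhiModel` class, the `candidates` helper, and the capability labels (`"coding"`, `"chat"`, and so on) are hypothetical names invented for this example; they are not part of any Phi tooling or SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhiModel:
    name: str
    params_b: float          # parameter count in billions, as listed in the table
    capabilities: frozenset  # hypothetical labels for the table's YES columns


# A subset of rows copied from the capability table (not the full family).
MODELS = [
    PhiModel("Phi-1", 1.3, frozenset({"coding"})),
    PhiModel("Phi-3.5-mini-instruct", 3.8, frozenset({"coding", "chat"})),
    PhiModel("Phi-3.5-MoE-instruct", 42, frozenset({"coding", "chat", "moe"})),
    PhiModel("Phi-4", 14, frozenset({"coding", "chat", "reasoning"})),
    PhiModel("Phi-4-mini", 3.8, frozenset({"coding", "chat", "reasoning"})),
    PhiModel(
        "Phi-4-multimodal",
        5.6,
        frozenset({"coding", "chat", "reasoning", "vision", "audio"}),
    ),
]


def candidates(required):
    """Return models whose capabilities cover `required`, smallest first."""
    needed = frozenset(required)
    matches = [m for m in MODELS if needed <= m.capabilities]
    return sorted(matches, key=lambda m: m.params_b)


if __name__ == "__main__":
    for model in candidates({"chat", "reasoning"}):
        print(f"{model.name} ({model.params_b}B)")
```

Sorting by parameter count reflects the "start small" guidance in the scenario table: pick the smallest model that covers the task, and move to a larger model only when the scenario calls for broader world knowledge or complex orchestration.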