
Model Request: DeepScaleR-1.5B-Preview #668

Open

loretoparisi opened this issue Feb 17, 2025 · 0 comments

Comments

@loretoparisi
Add support for Berkeley's DeepScaleR-1.5B-Preview.
Reason: benchmarks show this 1.5B model reaching o1-preview-level performance on several benchmarks, and its size makes it suitable to run on-device.

Regarding the model's weights:

  • Total parameters: 1.77B, precision: FP32, file size: 7 GB
  • Quantization: we quantized to FP16, yielding 3.3 GB weight files (see the sketch below).
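A minimal sketch of the FP32 → FP16 conversion using Hugging Face Transformers; the repo id `agentica-org/DeepScaleR-1.5B-Preview` and the output path are assumptions, substitute the actual checkpoint location:

```python
# Sketch only: cast the FP32 checkpoint to FP16 for on-device use
# (roughly halves the on-disk size, ~7 GB -> ~3.3 GB).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "agentica-org/DeepScaleR-1.5B-Preview"  # assumed repo id

# Load the FP32 weights, then cast to half precision.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)
model = model.half()

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Save the FP16 weights and tokenizer to a local directory.
model.save_pretrained("deepscaler-1.5b-fp16", safe_serialization=True)
tokenizer.save_pretrained("deepscaler-1.5b-fp16")
```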

In more detail:

  • Base model: fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B
  • Dataset: approximately 40,000 unique problem-answer pairs compiled from the AIME, AMC, Omni-MATH, and Still datasets, including data-processing steps
  • Reward function: binary (1 for correct answers, 0 for incorrect or improperly formatted answers); see the sketch after this list
  • Training: RL with GRPO in three phases:
    Phase 1: 8K context length (8 samples per prompt)
    Phase 2: 16K context length (16 samples per prompt)
    Phase 3: 24K context length
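For clarity, a minimal sketch of such a binary reward; the `\boxed{...}` answer extraction and the exact string match are assumptions about the formatting convention, not the project's actual implementation:

```python
# Sketch of a binary reward: 1 for a correct, properly formatted answer, 0 otherwise.
import re

def binary_reward(completion: str, gold_answer: str) -> int:
    """Return 1 if the completion contains a \\boxed{...} answer matching gold_answer, else 0."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0  # improperly formatted: no boxed answer found
    predicted = match.group(1).strip()
    return 1 if predicted == gold_answer.strip() else 0
```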

(Benchmark results image attached in the original issue.)
