neo-3

This is the 1B-A90M-Base model. Check out the 3B-A400M-Base model and the post-trained version of this model.

The neo-3-1B-A90M and 3B-A400M models are the successors to neo-2-345M-C1 and C2, featuring:

  • Pre-trained and post-trained models
  • Context windows of 8K tokens for the 1B and 32K tokens for the 3B
  • The first ever neo reasoning model
  • Mixtral-style Mixture-of-Experts (MoE) architecture

The post-trained models will be released on January 3rd, 2026; the pre-trained models were released on December 31st, 2025. A technical report is in the works, but the training data is fully available for replication.

This model is released under the MIT license.
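The card doesn't include a usage snippet, so here is a minimal sketch of loading the base model with Hugging Face transformers. It assumes the checkpoint loads through the standard AutoModelForCausalLM path; if neo-3 ships a custom MoE implementation, you may need to pass trust_remote_code=True.

```python
# Minimal sketch: loading the base model for plain text completion.
# Assumes the repo id from this card and a transformers-compatible
# architecture; a custom MoE implementation may need trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aquiffoo/neo-3-1B-A90M-Base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Base (pre-trained) model: raw completion, no chat template.
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```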

Evaluations for base models

Performance

| Model | MMLU | HellaSwag | PIQA | ARC (avg) | GSM8K | Avg. |
|---|---|---|---|---|---|---|
| neo-3-1B-A90M | 32.7 | 52.3 | 63.4 | 42.1 | 2.2 | 38.54 |
| neo-3-3B-A400M | 42.1 | 59.9 | 67.5 | 50.6 | 5.7 | 45.16 |
| SmolLM2-360M | 32.7 | 52.3 | 63.4 | 42.1 | 2.2 | 38.54 |
| Gemma 3 270M | 26.5 | 40.9 | 67.7 | 43.4 | 1.1 | 35.92 |
| Qwen3-0.6B-Base | 44.0 | 55.3 | 60.9 | 52.4 | 49.7 | 52.46 |
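The card doesn't state which harness or few-shot settings produced these numbers. As a hedged sketch, scores like these are commonly reproduced with EleutherAI's lm-evaluation-harness; the task names below are assumptions, not the card's documented setup.

```python
# Hedged sketch: benchmarking the 1B base model with lm-evaluation-harness.
# Task names and default shot counts are assumptions, not the documented setup.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=aquiffoo/neo-3-1B-A90M-Base,dtype=float16",
    tasks=["mmlu", "hellaswag", "piqa", "arc_challenge", "gsm8k"],
)
for task, metrics in results["results"].items():
    print(task, metrics)
```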

Efficiency

| Model | Active params | Training tokens | Avg. per B active params | Avg. per T training tokens |
|---|---|---|---|---|
| neo-3-1B-A90M | 120M | 1.2T | 321.17 | 32.12 |
| neo-3-3B-A400M | 380M | 1.2T | 118.84 | 37.63 |
| SmolLM2-360M | 362M | 4T | 106.46 | 9.64 |
| Gemma 3 270M | 268M | 6T | 134.03 | 5.99 |
| Qwen3-0.6B-Base | 642M | 36T | 81.72 | 1.46 |
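The two efficiency columns are the average benchmark score divided by active parameters (in billions) and by training tokens (in trillions). A small Python check against the neo-3-1B-A90M row:

```python
# Efficiency columns from the table above: average benchmark score
# normalized by active parameters (B) and by training tokens (T).
def efficiency(avg_score: float, active_params_b: float, train_tokens_t: float):
    return avg_score / active_params_b, avg_score / train_tokens_t

# neo-3-1B-A90M row: 38.54 avg, 120M active (0.12B), 1.2T tokens.
per_b_active, per_t_tokens = efficiency(38.54, 0.12, 1.2)
print(round(per_b_active, 2), round(per_t_tokens, 2))  # 321.17 32.12
```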


Task list

  • Train neo-3-Base models
  • Train neo-3 Instruct/Thinking models
  • Release neo-3-Base models
  • Release neo-3 Instruct/Thinking models
  • Publish neo-3 Technical Report
  • Train neo-3-VL Base model
  • Train neo-3-VL Instruct model
  • Release neo-3-VL Base model
  • Release neo-3-VL Instruct model
  • Publish neo-3-VL Technical Report
  • Train neo-3.1-Base models
  • Train neo-3.1 Instruct/Thinking/VL-Instruct models
  • Release neo-3.1-Base models
  • Release neo-3.1 Instruct/Thinking/VL-Instruct models
  • Publish neo-3.1 Technical Report