OctoThinker-Llama-1B Family

updated Jul 6, 2025

What makes a base language model suitable for reinforcement learning (RL)? Through controlled experiments, we identify the key factors and then leverage them to scale up mid-training.

  • OctoThinker/OctoThinker-1B-Long-Base

    Text Generation • 1B • Updated Jul 6, 2025 • 7

  • OctoThinker/OctoThinker-1B-Hybrid-Base

    Text Generation • 1B • Updated Jul 6, 2025 • 4

  • OctoThinker/OctoThinker-1B-Short-Base

    Text Generation • 1B • Updated Jul 6, 2025 • 8

  • OctoThinker/OctoThinker-1B-Long-Zero

    Text Generation • 1B • Updated Jul 6, 2025 • 6

  • OctoThinker/OctoThinker-1B-Hybrid-Zero

    Text Generation • 1B • Updated Jul 6, 2025 • 3

  • OctoThinker/OctoThinker-1B-Short-Zero

    Text Generation • 1B • Updated Jul 6, 2025 • 2
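
The six repositories above follow one naming pattern (Long, Hybrid, or Short, each as a Base or Zero checkpoint). The following is a minimal sketch of enumerating those repository IDs and loading one of them. It assumes the checkpoints load through the standard `transformers` auto classes, which this page does not confirm; the helper names `repo_ids` and `load` are hypothetical and exist only for illustration.

```python
from itertools import product

ORG = "OctoThinker"
VARIANTS = ("Long", "Hybrid", "Short")
STAGES = ("Base", "Zero")


def repo_ids():
    """Return the six repository IDs listed in this collection, in page order."""
    return [
        f"{ORG}/OctoThinker-1B-{variant}-{stage}"
        for stage, variant in product(STAGES, VARIANTS)
    ]


def load(repo_id):
    """Download and load one checkpoint.

    Assumption: the checkpoint is compatible with AutoModelForCausalLM.
    Requires `transformers`, `torch`, and network access, so the import
    is kept local to this function.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model


if __name__ == "__main__":
    for repo_id in repo_ids():
        print(repo_id)
```

Calling `load(repo_ids()[0])` would fetch `OctoThinker/OctoThinker-1B-Long-Base`; the enumeration itself runs offline.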