DeepSeek-R1-Zero
From the paper "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning": a model trained to reason purely through reinforcement learning, without supervised fine-tuning on reasoning traces.