Notes by Lex Toumbourou

DeepSeek-R1-Zero

Jan 21, 2025 permanent

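DeepSeek-R1-Zero is the model from the DeepSeek-R1 paper ("DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning") that was trained to reason with reinforcement learning alone, without supervised fine-tuning on reasoning traces.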