NotesByLex.com


DeepSeek-R1-Zero

Jan 21, 2025

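DeepSeek-R1-Zero is a model from the paper "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning". It was trained to reason through reinforcement learning alone, without first being fine-tuned on explicit reasoning traces.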