paper - NotesByLex.com

paper

Decomposing LLM-Judge Scores Into Yes/No Questions

Jul 26, 2026 paper LLMJudge LargeLanguageModels

An LLM-judge approach that brings interpretability and actionability to your scores.

Heavy Thinking: A Test-Time Scaling Pattern for Hard Problems

May 17, 2026 paper AgenticReasoning TestTimeScaling AgentSkills

Now we have GPT Pro at home.

LLMs Corrupt Your Documents When You Delegate

May 09, 2026 paper AgenticReasoning LimitationsofLLMs

A large-scale study on long-horizon document tasks.

OpenGame: Open Agentic Coding for Games

Apr 29, 2026 paper AgenticReasoning GameDevelopment

An agentic framework for end-to-end game creation.

Self-Generated Agent Context Files Don't Help Either

Feb 26, 2026 paper AgenticReasoning SoftwareEngineering AIAgents

Self-generated agent context files don't help.

Self-Generated Skills Don't Help

Feb 21, 2026 paper AgenticReasoning SoftwareEngineering

Curated skills boost agent performance by 16 points; self-generated ones don't help at all.

Generative Modelling via Drifting

Feb 11, 2026 paper GenerativeModelling ImageGeneration

A new paradigm for single-step generative modelling.
