reference/papers category - Notes by Lex

Articles in the reference/papers subcategory


Learning to Reason without External Rewards

May 28, 2025 reference/papers ReinforcementLearning RewardModeling LargeLanguageModels

aka Self-Confidence is All You Need

Vibe-Coding Mathematical Discoveries

May 18, 2025 reference/papers AgenticReasoning LargeLanguageModels EvolutionaryAlgorithms

Using evolutionary algorithms with LLM-coding agents

NoProp: Training Neural Networks Without Back-Propagation or Forward-Propagation

May 15, 2025 reference/papers MachineLearning

an alternative to backprop that trains each layer locally

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

May 12, 2025 reference/papers ReinforcementLearning ReasoningModels LargeLanguageModels

learning to reason without any human-annotated data

Playing Atari with Deep Reinforcement Learning

May 05, 2025 reference/papers ReinforcementLearning GamePlayingAI

a classic paper applying neural networks to RL for game playing

Large Language Models are Zero-Shot Reasoners (May 2022)

Jan 08, 2025 reference/papers LargeLanguageModels PromptingTechniques

improve zero-shot prompt performance of LLMs by adding “Let’s think step by step” before each answer

Neural Machine Translation by Jointly Learning to Align and Translate (Sep 2014)

Oct 28, 2024 reference/papers AttentionMechanism

improve encoder-decoder alignment with an attention mechanism

Thinking LLMs: General Instruction Following with Thought Generation (Oct 2024)

Oct 16, 2024 reference/papers AgenticReasoning System2Prompting

a prompting and fine-tuning method that enables LLMs to engage in a "thinking" process before generating responses

Mixtral of Experts (Jan 2024)

Oct 15, 2024 reference/papers

a Sparse Mixture of Experts (SMoE) language model

Evaluation of OpenAI o1: Opportunities and Challenges of AGI

Oct 14, 2024 reference/papers AgenticReasoning LargeLanguageModels

a comprehensive evaluation of o1-preview across many tasks and domains
