SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks
Benchmarking Agent Skills
an alternative to backpropagation that trains each layer locally
learn to reason without any human-annotated data
a classic paper applying neural networks to RL for game playing
improve zero-shot prompt performance of LLMs by adding “Let’s think step by step” before each answer
improve encoder-decoder alignment with an attention mechanism
a prompting and fine-tuning method that enables LLMs to engage in a "thinking" process before generating responses