Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation

Optimising computation at the token level
Experiments with multi-turn, character-consistent editing
A few comparisons of Google's Imagen 4 vs OpenAI's gpt-image-1
An alternative training method to backprop that does local layer learning
Learning to reason without any human-annotated data
A classic paper applying neural networks to RL for game playing