All Notes
-
Spanning Tree
a connected subgraph of a connected graph that contains all vertices but no cycles
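One way to make the definition concrete is a breadth-first traversal that keeps only the edge used to first reach each vertex. This is a minimal sketch, assuming an undirected, connected graph stored as an adjacency-list dictionary; the function name `bfs_spanning_tree` and the sample graph are illustrative, not taken from the note.

```python
from collections import deque


def bfs_spanning_tree(graph, root):
    """Return the edges of one spanning tree of a connected undirected graph.

    `graph` maps each vertex to a list of its neighbours (assumed format).
    """
    visited = {root}
    tree_edges = []
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            # Keep only the edge that first discovers v, so no cycle can form.
            if v not in visited:
                visited.add(v)
                tree_edges.append((u, v))
                queue.append(v)
    return tree_edges


# Hypothetical 4-vertex graph that contains cycles (A-B-C and B-C-D).
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}
edges = bfs_spanning_tree(graph, "A")
print(edges)
```

A spanning tree of a graph with n vertices always has n - 1 edges, which is a quick sanity check on the result.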
-
Bucket Sort
a distribution-based sorting algorithm that divides elements into buckets, sorts each bucket, and concatenates the results
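The bucket idea can be sketched in a few lines. This is one possible version, assuming numeric input and equal-width buckets spanning the range from the minimum to the maximum value; the `num_buckets` default and the use of Python's built-in `sorted` inside each bucket are choices made here for illustration.

```python
def bucket_sort(values, num_buckets=5):
    """Sort numbers by distributing them into equal-width buckets."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if lo == hi:
        # All elements are equal, so there is nothing to distribute.
        return list(values)
    width = (hi - lo) / num_buckets
    buckets = [[] for _ in range(num_buckets)]
    for x in values:
        # Clamp so the maximum value lands in the last bucket.
        idx = min(int((x - lo) / width), num_buckets - 1)
        buckets[idx].append(x)
    result = []
    for bucket in buckets:
        # Buckets are already ordered relative to each other,
        # so sorting each one and concatenating sorts the whole list.
        result.extend(sorted(bucket))
    return result


data = [0.42, 0.32, 0.33, 0.52, 0.37, 0.47, 0.51]
print(bucket_sort(data))
```

The approach tends to work best when the values are spread fairly evenly, because then each bucket stays small.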
-
Temperature Scaling
a parameter that controls how confident softmax predictions are: lower temperatures sharpen the distribution, higher ones flatten it
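The effect is easy to see numerically. The sketch below divides the logits by a temperature before applying softmax; the function name and the example logits are assumptions chosen for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature (temperature must be positive)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtract the maximum for numerical stability; the result is unchanged.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical model outputs
sharp = softmax_with_temperature(logits, temperature=0.5)
plain = softmax_with_temperature(logits, temperature=1.0)
flat = softmax_with_temperature(logits, temperature=2.0)
print(sharp, plain, flat)
```

With a temperature below 1 the top class receives more probability mass, and with a temperature above 1 the distribution moves toward uniform.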
-
Few-Shot Knowledge-Distillation
routes LLM tasks to cheaper or more powerful models based on task novelty
-
Large Language Models are Zero-Shot Reasoners (May 2022)
improves zero-shot prompt performance of LLMs by adding “Let’s think step by step” before each answer
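As a rough illustration, the trigger phrase can be appended to the answer slot of a question/answer prompt. The helper name `zero_shot_cot_prompt` and the sample question are hypothetical, and no model call is made here; the sketch only builds the prompt string.

```python
TRIGGER = "Let's think step by step."


def zero_shot_cot_prompt(question):
    """Build a zero-shot prompt whose answer begins with the trigger phrase."""
    # The model is expected to continue the text after the trigger phrase.
    return f"Q: {question}\nA: {TRIGGER}"


prompt = zero_shot_cot_prompt("If I have 3 apples and eat one, how many remain?")
print(prompt)
```

The resulting string would be sent to whichever LLM is in use, and the model's continuation then contains the intermediate reasoning followed by an answer.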
-
Neural Machine Translation by Jointly Learning to Align and Translate (Sep 2014)
improves encoder/decoder alignment with an attention mechanism
-
Thinking LLMs: General Instruction Following with Thought Generation (Oct 2024)
a prompting and fine-tuning method that enables LLMs to engage in a "thinking" process before generating responses
-
Mixtral of Experts (Jan 2024)
a Sparse Mixture of Experts (SMoE) language model
-
Evaluation of OpenAI o1: Opportunities and Challenges of AGI
a comprehensive evaluation of o1-preview across many tasks and domains