Merkle Tree
a tree data structure where each leaf holds the hash of a data block and each non-leaf node holds the hash of its child nodes
a subgraph of a connected graph that connects all vertices and contains no cycles
a distribution-based sorting algorithm that works by dividing elements into buckets
a parameter that scales the logits before softmax, controlling how peaked (confident) the predicted distribution is
Routes LLM tasks to cheaper or more powerful models based on task novelty.
an approach to utilising LLMs that involves multi-state interactions.
A data visualization that uses squares on a 2D grid to represent proportions.
The specific self-attention formulation from the Transformer paper, distinguished by dividing the dot-product scores by the square root of the key dimension before applying softmax.
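
The scaled dot-product attention definition above can be illustrated with a minimal sketch in plain Python. The function names (`softmax`, `scaled_dot_product_attention`) and the list-of-lists representation of Q, K, and V are assumptions made for illustration. The sketch leaves out masking, batching, and multiple heads.

```python
import math


def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    # Q: list of query vectors, K: list of key vectors, V: list of value vectors.
    # Keys and values are paired by position; queries and keys share dimension d_k.
    d_k = len(K[0])
    scale = 1.0 / math.sqrt(d_k)  # the scaling that gives the technique its name
    output = []
    for q in Q:
        # Scaled dot product of this query with every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )
    return output
```

A query that scores equally against every key yields the plain average of the value vectors, for example `scaled_dot_product_attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])`.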