Self-Attention

Jan 30, 2024 permanent

Self-Attention, sometimes called intra-attention, is an attention mechanism that represents each token in an input sequence as a weighted average of the token representations in that same sequence, with the weights based on how relevant the tokens are to each other.
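As a rough sketch of the weighted-average idea, the representation of one token can be written as below. The symbols are assumptions introduced here for illustration, not notation from this note: $x_j$ is the representation of token $j$, $y_i$ is the new representation of token $i$, $\alpha_{ij}$ is the weight token $i$ places on token $j$, and $n$ is the sequence length.

```latex
y_i = \sum_{j=1}^{n} \alpha_{ij} \, x_j,
\qquad \alpha_{ij} \ge 0,
\qquad \sum_{j=1}^{n} \alpha_{ij} = 1
```

Because the weights for each token are non-negative and sum to one, every $y_i$ is a weighted average of the sequence's token representations.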

The most common implementation of self-attention comes from the Transformer architecture: Scaled Dot-Product Attention.
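To make this concrete, here is a minimal single-head sketch of scaled dot-product self-attention in NumPy. It assumes no masking and no batching, and it uses random projection matrices in place of learned ones. The names `self_attention`, `softmax`, `w_q`, `w_k`, and `w_v` are chosen for illustration and do not come from any particular library.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    # Queries, keys and values are all projections of the same sequence,
    # which is what makes this *self*-attention.
    q = x @ w_q  # (seq_len, d_k)
    k = x @ w_k  # (seq_len, d_k)
    v = x @ w_v  # (seq_len, d_v)

    d_k = q.shape[-1]
    # Dot-product similarity between every pair of tokens, scaled by sqrt(d_k).
    scores = q @ k.T / np.sqrt(d_k)  # (seq_len, seq_len)

    # Each row becomes a set of non-negative weights that sum to 1.
    weights = softmax(scores, axis=-1)

    # Each output row is a weighted average of the value vectors.
    return weights @ v, weights


# Illustrative sizes and random weights (assumptions, not trained parameters).
rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
x = rng.normal(size=(seq_len, d_model))
w_q = rng.normal(size=(d_model, d_k))
w_k = rng.normal(size=(d_model, d_k))
w_v = rng.normal(size=(d_model, d_k))

output, weights = self_attention(x, w_q, w_k, w_v)
print(output.shape)   # one output vector per input token
print(weights.shape)  # one weight per pair of tokens
```

Each row of `weights` shows how much one token attends to every token in the sequence, including itself.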


Notes by Lex Toumbourou
