topic connectionist ★ seed

Transformer Architecture (2017)

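Vaswani et al.'s 'Attention Is All You Need' (2017) replaced recurrence with self-attention. Because all positions in a sequence are processed in parallel rather than one step at a time, training scaled to much larger models and datasets, and the architecture became the foundation of most modern large-scale AI models.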
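
A minimal sketch of single-head scaled dot-product self-attention, to show why no recurrence is needed: every position attends to every other position through plain matrix products. It uses only the Python standard library; the helper names (softmax, matmul, self_attention) and the weight matrices w_q, w_k, w_v are illustrative assumptions, and multi-head projection, masking, and positional encodings are left out.

```python
import math

def softmax(row):
    # Subtract the max for numerical stability before exponentiating.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain nested-list matrix product: (n x k) times (k x m) gives (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    # x holds one embedding per token; w_q, w_k, w_v are hypothetical
    # projection matrices that would be learned in a real model.
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    # Each row of weights says how much one token attends to every token.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights
```

Nothing here loops over time steps: the score matrix for the whole sequence comes from one product, which is the part that parallelizes well on hardware.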

#transformer #self-attention #vaswani #2017

Sub-topics

BERT (2018) concept

Google's Bidirectional Encoder Representations from Transformers (2018). Pre-trained with masked language modeling and next-sentence prediction, it achieved state-of-the-art results on 11 NLP tasks.
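
A simplified sketch of how masked language modeling prepares a training example: some tokens are hidden and the model is asked to predict them from context on both sides. The function name mask_tokens and its parameters are assumptions for illustration. The 15% rate follows the BERT paper, but this sketch always substitutes [MASK], whereas BERT sometimes uses a random token or leaves the token unchanged.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    # Returns (masked_tokens, labels). labels[i] holds the original token
    # where a mask was applied and None elsewhere, so a loss would only be
    # computed on the masked positions.
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels
```

For example, mask_tokens("the cat sat on the mat".split(), mask_prob=0.5) returns the sentence with a random subset of its words replaced by [MASK], plus the hidden words as prediction targets.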

GPT-1 (2018) concept

OpenAI's first Generative Pre-trained Transformer (2018). Demonstrated that unsupervised pre-training on unlabeled text followed by supervised fine-tuning on each task could achieve strong NLP performance.

GPT-3 (2020) concept

OpenAI's 175-billion-parameter model (2020) demonstrated few-shot learning: it could perform new tasks from a handful of examples given in the prompt, without any weight updates. Showed that scaling language models can yield qualitatively new capabilities.
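
A small sketch of what a few-shot prompt looks like in practice: the task is specified entirely by demonstrations placed in the input text, and the model is expected to continue the pattern. The function name build_few_shot_prompt and the Input:/Output: layout are assumptions for illustration, not a format prescribed by OpenAI.

```python
def build_few_shot_prompt(instruction, examples, query):
    # examples is a list of (input, output) pairs shown to the model
    # as demonstrations; the final Output: line is left blank so the
    # model's continuation serves as the answer.
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)
```

Calling build_few_shot_prompt("Translate English to French.", [("cheese", "fromage"), ("bread", "pain")], "apple") yields a prompt with two worked examples followed by the unanswered query.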

GPT-4 (2023) concept

OpenAI's multimodal model, released March 2023. Accepts text and image inputs and produces text, scored around the 90th percentile on a simulated bar exam in OpenAI's reported evaluations, and demonstrates broad reasoning abilities.

Vision Transformer / ViT (2020) concept

Dosovitskiy et al. (2020) applied a standard transformer directly to sequences of image patches, showing that pure attention without convolutions can match or beat CNNs on image classification when pre-trained on sufficiently large datasets.
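
A minimal sketch of the patching step, assuming a single-channel image stored as a nested list: the image is cut into non-overlapping square patches and each patch is flattened into a vector, giving a sequence a transformer can consume like tokens. The function name image_to_patches is hypothetical, and the learned linear projection, class token, and position embeddings used in the paper are omitted.

```python
def image_to_patches(image, patch_size):
    # image is a list of rows; both sides must divide evenly by patch_size.
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be multiples of patch_size")
    patches = []
    # Walk the grid row-major, one patch per (top, left) corner.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches

# A 4x4 image with 2x2 patches becomes a sequence of 4 vectors of length 4.
image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
patches = image_to_patches(image, 2)
# patches[0] == [0, 1, 4, 5]
```

The sequence length grows with the number of patches, so a larger patch size trades spatial detail for a shorter, cheaper sequence.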