LoRA: Low-Rank Adaptation of Large Language Models

LoRA is a parameter-efficient fine-tuning technique that injects small trainable low-rank adapter matrices into a frozen pretrained model.

What Happened

In June 2021, the LoRA paper (Hu et al.) proposed adapting large pretrained models by adding small, trainable low-rank matrices alongside the original weights, which remain frozen during fine-tuning.

Why It Matters

LoRA became a practical enabler for broad community fine-tuning and rapid iteration—especially for large transformer models—by reducing memory and compute barriers.

Technical Details

LoRA decomposes the weight update into two low-rank factors: instead of learning a full update ΔW for a weight matrix W of shape d × k, it learns ΔW = BA, where B is d × r and A is r × k with rank r much smaller than d and k. The adapted layer computes h = Wx + BAx, only A and B are trained, and the trainable-parameter count drops from d·k to r(d + k), recovering most of the quality of full fine-tuning at a fraction of the memory and compute cost.
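A minimal NumPy sketch of the idea follows. This is an illustration under our own assumptions (names like `d_in`, `rank`, and `alpha` are ours, not from an official implementation); it mirrors the paper's setup where A is initialized with small random values, B is initialized to zero so the update starts at zero, and the update is scaled by alpha / r.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, rank = 64, 64, 4        # rank << dimension
alpha = 8.0                          # scaling hyperparameter (assumed value)

W0 = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, rank))               # trainable, zero init: update starts at 0

def lora_forward(x):
    # h = W0 x + (alpha / r) * B A x -- only A and B would receive gradients
    return W0 @ x + (alpha / rank) * (B @ (A @ x))

x = rng.standard_normal(d_in)

# With B = 0 at initialization, the adapted model matches the base model exactly.
assert np.allclose(lora_forward(x), W0 @ x)

# Trainable parameters: r*(d_in + d_out) for LoRA vs d_in*d_out for full fine-tuning.
print(d_in * d_out, rank * (d_in + d_out))  # 4096 512
```

At deployment the product BA can be merged into W0 (W = W0 + (alpha/r)·BA), so the adapted model adds no inference latency over the base model.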