What Happened
In May 2023, the QLoRA paper proposed a fine-tuning approach that substantially reduces memory requirements by training small low-rank adapters on top of a frozen, 4-bit quantized base model.
Why It Matters
QLoRA lowered the barrier for community fine-tuning and experimentation on large open models, enabling more development on consumer and mid-range GPUs.
Technical Details
QLoRA combines frozen, quantized base weights (stored in the 4-bit NF4 format, with double quantization of the quantization constants) with trainable low-rank adapters in higher precision. The paper pairs this with paged optimizers to absorb memory spikes during training, so that only the small adapter matrices receive gradients and optimizer state.
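The core arithmetic can be sketched in a few lines. This is a deliberately simplified illustration, not the paper's exact NF4 scheme: the base weight is crudely absmax-quantized to 4-bit signed integers and frozen, while a low-rank adapter pair (A, B) holds the only trainable parameters. All variable names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 64, 64, 8

# Frozen base weight, crudely quantized to a 4-bit signed range (-8..7)
# via absmax scaling. Real QLoRA uses the NF4 data type instead.
W = rng.standard_normal((d_in, d_out)).astype(np.float32)
scale = np.abs(W).max() / 7.0
W_q = np.clip(np.round(W / scale), -8, 7).astype(np.int8)

def dequantize(W_q, scale):
    # Dequantize on the fly for the forward pass; the int8 storage
    # (holding 4-bit values) is what saves memory.
    return W_q.astype(np.float32) * scale

# Trainable low-rank adapter. B starts at zero so the adapted model
# initially reproduces the quantized base model exactly.
A = rng.standard_normal((d_in, rank)).astype(np.float32) * 0.01
B = np.zeros((rank, d_out), dtype=np.float32)
alpha = 16.0
scaling = alpha / rank  # standard LoRA scaling convention

def forward(x):
    # Frozen quantized base path plus full-precision adapter path.
    return x @ dequantize(W_q, scale) + (x @ A @ B) * scaling

x = rng.standard_normal((4, d_in)).astype(np.float32)
y = forward(x)
print(y.shape)  # → (4, 64)
```

Because B is initialized to zero, the adapter contributes nothing at step 0 and training perturbs the base model smoothly; only A and B (rank × (d_in + d_out) values) need gradients and optimizer state, rather than the full d_in × d_out weight matrix.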