Batch Normalization

Ioffe and Szegedy introduce Batch Normalization, improving training stability and speed for deep networks.

Architecture

What Happened

In 2015, Batch Normalization was proposed as a method to normalize intermediate activations during training, reducing sensitivity to initialization and learning-rate choices.

Why It Matters

BatchNorm became a standard component in many deep learning pipelines, enabling deeper networks and faster convergence, and helping make training large models more practical across tasks.

Technical Details

BatchNorm normalizes each feature of a layer's input using the mini-batch mean and variance, then applies learned scale (gamma) and shift (beta) parameters to preserve representational capacity. This typically improves gradient flow and allows more aggressive learning rates; at inference time, running averages of the batch statistics replace the per-batch estimates.
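The training-time forward pass can be sketched in a few lines of numpy. This is a minimal illustration of the normalize-then-scale-and-shift computation, not the paper's full algorithm (it omits the running-average statistics used at inference):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch-normalize a (batch, features) array during training.

    Each feature is normalized with the mini-batch mean and variance,
    then rescaled by learned parameters gamma (scale) and beta (shift).
    """
    mean = x.mean(axis=0)            # per-feature mean over the batch
    var = x.var(axis=0)              # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # eps avoids division by zero
    return gamma * x_hat + beta

# With gamma = 1 and beta = 0, the output has (near) zero mean and
# unit variance for each feature.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=(64, 8))
out = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))
```

Because gamma and beta are learned, the network can recover the original activation distribution if that turns out to be optimal, so normalization never reduces what the layer can represent.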