What Happened
Google DeepMind released Gemini 1.5 Pro, featuring a context window of up to 1 million tokens, a massive leap from the typical 32K–128K context lengths available at the time. The model could process approximately 1 hour of video, 11 hours of audio, over 30,000 lines of code, or 700,000 words in a single prompt while maintaining strong performance across the entire context.
Why It Matters
The 1 million token context window was a paradigm shift for how AI models could be used. Instead of chunking and summarizing documents, users could feed entire books, codebases, or video recordings directly into the model. This enabled new use cases like full-repository code analysis, long-form video understanding, and comprehensive document QA without retrieval-augmented generation. The window, later expanded to 2 million tokens in a research preview, set a new standard for long-context handling.
Technical Details
- Architecture: Mixture-of-Experts (MoE) Transformer, which uses sparse activation so only a subset of parameters is active for each token, improving efficiency
- Context window: 1 million tokens standard, 2 million tokens in research preview
- "Needle in a haystack" performance: >99% recall across the full 1M token context for retrieving embedded facts
- Efficiency: Despite the massive context, maintained near-perfect retrieval and strong reasoning performance
- Multimodal long-context: Could process long video and audio natively, not just text
- Deployment: Available via Google AI Studio and Vertex AI
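The sparse activation idea behind the MoE architecture can be illustrated with a minimal routing sketch. This is not Gemini's actual architecture (which is unpublished); the layer sizes, router, and top-k choice here are illustrative assumptions showing only the core mechanism: a router scores all experts, but just the top-k expert weight matrices are ever multiplied for a given token.

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Route a token to its top-k experts; only those experts run.

    x       : (d,) token representation
    gate_w  : (n_experts, d) router weights
    experts : list of n_experts weight matrices, each (d, d)
    """
    logits = gate_w @ x                       # router score for every expert
    top = np.argsort(logits)[-k:]             # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the selected experts only
    # Sparse activation: only k of the n_experts matrices are used for this token.
    out = sum(w * (experts[i] @ x) for w, i in zip(weights, top))
    return out, top

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.normal(size=d)
gate_w = rng.normal(size=(n_experts, d))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]

y, active = moe_layer(x, gate_w, experts, k=2)
print(len(active), n_experts)  # 2 of 16 experts were activated for this token
```

The efficiency win is that compute per token scales with k, not with the total number of experts, so total parameter count can grow without a matching growth in inference cost.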
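The "needle in a haystack" result above comes from a simple style of evaluation: a single target fact is buried at a controlled depth inside long filler context, and the model is asked to retrieve it. A minimal sketch of how such a prompt is constructed and scored follows; the filler text, needle, and substring-match scoring rule are illustrative assumptions, not the exact published benchmark.

```python
# Sketch of a needle-in-a-haystack probe: hide one fact at a chosen depth
# in long filler context, then check whether the model's answer recovers it.

FILLER = "The grass is green. The sky is blue. The sun is bright. "
NEEDLE = "The secret passphrase is 'mango-42'."
QUESTION = "What is the secret passphrase?"

def build_haystack(n_filler_sentences, depth_fraction):
    """Insert NEEDLE at depth_fraction (0.0 = start, 1.0 = end) of the context."""
    sentences = [FILLER] * n_filler_sentences
    pos = int(depth_fraction * n_filler_sentences)
    sentences.insert(pos, NEEDLE + " ")
    return "".join(sentences) + "\n" + QUESTION

def score(model_answer):
    """Recall check: did the answer contain the needle's payload?"""
    return "mango-42" in model_answer

# Sweep context length and needle depth to map recall across the window.
prompt = build_haystack(n_filler_sentences=1000, depth_fraction=0.5)
print(len(prompt))  # one retrievable fact buried mid-context
```

In practice the harness sweeps many (context length, depth) pairs and reports recall per cell; Gemini 1.5 Pro's reported >99% recall means the needle was found at nearly every depth up to the full 1M-token window.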