What Happened
OpenAI released Codex, a large language model fine-tuned on publicly available code from GitHub. Codex was the model powering GitHub Copilot and was also made available through the OpenAI API, allowing developers to build applications that translate natural language instructions into code across dozens of programming languages.
Why It Matters
Codex demonstrated that large language models could be specialized for code generation with striking effectiveness. It solved 28.8% of problems from the HumanEval benchmark (a set of 164 hand-written Python programming problems) on a single attempt (pass@1), and 72.3% when allowed 100 samples per problem. This established code generation as a key capability of large models and spurred development of code-focused AI tools across the industry.
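The pass@k metric behind these numbers measures the probability that at least one of k samples for a problem passes the unit tests. Computing it naively from k raw samples is high-variance, so the Codex paper instead draws n ≥ k samples, counts the c that pass, and applies an unbiased combinatorial estimator. A minimal sketch (the paper's reference implementation uses a numerically stable product form; this direct version assumes n is small enough for exact binomial coefficients):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: probability that at least one of k
    samples drawn without replacement from n total (c correct) passes.

    pass@k = 1 - C(n - c, k) / C(n, k)
    """
    if n - c < k:
        # Fewer incorrect samples than k: every draw of k must include
        # at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 2 samples, 1 correct -> a single draw passes half the time.
print(pass_at_k(2, 1, 1))  # 0.5
```

Averaging this quantity over all benchmark problems yields the reported pass@1 and pass@100 figures.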
Technical Details
- Architecture: GPT-3 variant (12B parameters) fine-tuned on 159GB of Python code from GitHub
- Proficiency: Most capable in Python, but functional in JavaScript, Go, Perl, PHP, Ruby, Swift, TypeScript, Shell, and more
- Benchmark: Solved 28.8% of HumanEval problems in a single pass (vs. GPT-3's 0%)
- API: Released as a private beta API enabling natural language to code translation
- Capabilities: Code completion, code generation from docstrings, code explanation, bug fixing, and translating between programming languages
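The docstring-to-code workflow above is exactly what HumanEval exercises: the model is prompted with a function signature plus docstring and must complete the body, which is then graded by hidden unit tests rather than text match. An illustrative problem in that style (the function and test here are invented for illustration, not an actual benchmark item):

```python
# Prompt given to the model: signature and docstring only.
def sum_of_squares(nums):
    """Return the sum of the squares of the numbers in nums.

    >>> sum_of_squares([1, 2, 3])
    14
    """
    # Completion the model must generate:
    return sum(x * x for x in nums)

# Grading is functional: a held-out unit test checks behavior,
# so any correct implementation passes regardless of style.
assert sum_of_squares([1, 2, 3]) == 14
```

Functional grading is what makes multi-sample metrics like pass@100 meaningful: each of the 100 generated completions can be checked automatically, and the problem counts as solved if any one of them passes.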