source: "raw/articles/github-karpathyautoresearch-ai-agents-running-research-on-single-gpu-nanochat-tr.md"

Summary: Autoresearch - AI Agents for Autonomous ML Research

TL;DR: Karpathy's project gives AI agents a small LLM training setup to experiment autonomously overnight, modifying code and hyperparameters to improve model performance within fixed 5-minute training budgets.

Key Points

  • AI agents autonomously modify train.py (which contains the full GPT model, optimizer, and training loop), while humans edit the program.md instruction files that direct the agents
  • Fixed 5-minute wall-clock training budget per experiment, allowing ~12 experiments/hour or ~100 overnight
  • Uses validation bits per byte (val_bpb) as the optimization metric - lower is better, and because it is independent of vocabulary size, runs with different tokenizers or architectures can be compared fairly
  • Built on a simplified single-GPU implementation of the nanochat training code
  • Requires a single NVIDIA GPU (tested on H100), Python 3.10+, and the uv package manager
  • The agent edits only one file (train.py); prepare.py handles data preparation and utilities and is not to be modified
  • Self-contained design with no external dependencies beyond PyTorch
  • Points to platform-specific forks for macOS, Windows, and AMD systems
  • Designed for overnight autonomous research: "You wake up in the morning to a log of experiments and (hopefully) a better model"
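
To illustrate why bits per byte does not depend on vocabulary size, here is a minimal sketch of the conversion from cross-entropy loss to bits per byte. It is not the repository's evaluation code: the function name `bits_per_byte` and the numbers are hypothetical, and it assumes the loss is summed in nats (natural-log cross-entropy) over the same validation text.

```python
import math


def bits_per_byte(total_loss_nats: float, total_bytes: int) -> float:
    """Convert summed cross-entropy (in nats) over a text into bits per byte.

    Dividing by ln(2) turns nats into bits; dividing by the byte count of the
    underlying text (not the token count) removes the tokenizer from the unit.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_loss_nats / (math.log(2) * total_bytes)


# Hypothetical example: the same 1000-byte text under two tokenizers.
# Tokenizer A: 250 tokens at a mean loss of 4*ln(2) nats per token.
# Tokenizer B: 500 tokens at a mean loss of 2*ln(2) nats per token.
loss_a = 250 * 4 * math.log(2)
loss_b = 500 * 2 * math.log(2)

bpb_a = bits_per_byte(loss_a, 1000)
bpb_b = bits_per_byte(loss_b, 1000)
```

The per-token losses differ by a factor of two, yet both runs score the same bits per byte, which is the property that lets an agent change the tokenizer or architecture without making the metric incomparable.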

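The fixed wall-clock budget can be sketched as a loop that keeps taking training steps until the time runs out, together with the arithmetic behind the experiments-per-hour figure. This is an illustrative sketch, not code from train.py: `train_for_budget`, `step_fn`, and `experiments_per_window` are hypothetical names, the 8-hour night is an assumption, and per-run overhead such as startup or compilation is ignored.

```python
import time
from typing import Callable


def train_for_budget(step_fn: Callable[[int], float], budget_seconds: float) -> tuple[int, float]:
    """Call step_fn(step_index) repeatedly until the wall-clock budget is spent.

    Returns the number of completed steps and the loss from the last step.
    """
    start = time.monotonic()  # monotonic clock is unaffected by system time changes
    steps = 0
    last_loss = float("nan")
    while time.monotonic() - start < budget_seconds:
        last_loss = step_fn(steps)
        steps += 1
    return steps, last_loss


def experiments_per_window(window_hours: float, budget_minutes: float = 5.0) -> int:
    """Number of back-to-back fixed-budget runs that fit in a time window."""
    return int(window_hours * 60 // budget_minutes)


# Stand-in for a real optimizer step: the "loss" simply decays with the step index.
steps_done, final_loss = train_for_budget(lambda i: 1.0 / (i + 1), budget_seconds=0.05)

per_hour = experiments_per_window(1)   # 12 runs of 5 minutes fit in one hour
per_night = experiments_per_window(8)  # 96 runs in an assumed 8-hour night
```

Because every run gets the same time rather than the same number of steps, a change that makes each step slower has to earn its keep through a lower val_bpb within the same budget.
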
Images/Figures

  • img-0.png: Teaser image showing progress visualization (referenced as progress.png in repository)

Related Concepts