source: "raw/articles/github-web-arena-xwebarena-infinity-an-approach-to-utomatically-generating-brows.md"

Summary: WebArena-Infinity - Automated Browser Environment Generation

TL;DR: WebArena-Infinity is an automated system that uses multi-agent coordination to generate realistic web applications with verifiable tasks for training and evaluating browser agents, replacing manual environment construction with scalable automation.

Key Points

  • Uses a multi-agent system that coordinates coding agents (which have privileged access) with browser agents (which interact through the UI) to generate environments
  • Generates environments from real-world artifacts such as product manuals and workflows
  • Produces high-fidelity applications, difficulty-graded tasks, and programmatic verifiers
  • Supports 8+ agent models including GPT-4o, Claude Sonnet, Gemini variants, and computer-use agents
  • Requires Python 3.12+, uv package manager, and at least one API key (OpenAI, Anthropic, Google, etc.)
  • The full pipeline runs the Claude Code CLI with the --dangerously-skip-permissions flag for autonomous code generation
  • Ships with pre-built environments ready for immediate evaluation
  • Supports AWS deployment for parallel multi-environment generation using EC2 instances
  • Includes base AMI creation with pre-installed dependencies (Python, Node.js, Claude CLI, Playwright, Chromium)
  • Results are stored in S3, with a browsable HTML dashboard for cross-environment analysis
  • Environments are self-contained web apps with realistic seed data and UI interactions
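
The stated requirements (Python 3.12+, the uv package manager, and at least one API key) can be checked before running anything. The following is a minimal sketch of such a pre-flight check, not part of WebArena-Infinity itself: the function name check_prerequisites and the environment variable names in ASSUMED_KEY_VARS are assumptions chosen for illustration, since the summary only says that some provider key is needed.

```python
import os
import shutil
import sys

# Assumed variable names: the summary only says "at least one API key"
# (OpenAI, Anthropic, Google, etc.) and does not name the variables.
ASSUMED_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def check_prerequisites(version=None, env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready.

    The parameters exist so the check can be exercised without touching
    the real interpreter version, environment, or PATH.
    """
    version = tuple(sys.version_info[:2]) if version is None else version
    env = os.environ if env is None else env
    problems = []

    # Requirement from the summary: Python 3.12 or newer.
    if version < (3, 12):
        problems.append(f"Python 3.12+ required, found {version[0]}.{version[1]}")

    # Requirement from the summary: the uv package manager must be installed.
    if which("uv") is None:
        problems.append("uv package manager not found on PATH")

    # Requirement from the summary: at least one provider API key.
    if not any(env.get(name) for name in ASSUMED_KEY_VARS):
        problems.append("no API key set (checked: " + ", ".join(ASSUMED_KEY_VARS) + ")")

    return problems
```

Calling check_prerequisites() with no arguments inspects the current interpreter, PATH, and environment. It does not check for the Claude Code CLI, which the summary lists only for the full generation pipeline.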

Images and Figures

  • img-0.gif: Browser agents navigating generated environments (trajectory visualization)
  • img-1.png: Blog post badge/link
  • img-2.png: Environment hub badge/link
  • img-3.png: Dataset badge/link
  • img-4.png: GitHub repository badge/link