source: "raw/articles/github-web-arena-xwebarena-infinity-an-approach-to-utomatically-generating-brows.md"
Summary: WebArena-Infinity - Automated Browser Environment Generation
TL;DR: WebArena-Infinity is an automated system that uses multi-agent coordination to generate realistic web applications with verifiable tasks for training and evaluating browser agents, replacing manual environment construction with scalable automation.
Key Points
- Uses a multi-agent system that coordinates coding agents (privileged access) with browser agents (UI interaction) to generate environments
- Generates environments from real-world artifacts such as product manuals and workflows
- Produces high-fidelity applications, difficulty-graded tasks, and programmatic verifiers
- Supports 8+ agent models including GPT-4o, Claude Sonnet, Gemini variants, and computer-use agents
- Requires Python 3.12+, uv package manager, and at least one API key (OpenAI, Anthropic, Google, etc.)
- Full pipeline uses Claude Code CLI with the --dangerously-skip-permissions flag for autonomous code generation
- Ships with pre-built environments ready for immediate evaluation
- Supports AWS deployment for parallel multi-environment generation using EC2 instances
- Includes base AMI creation with pre-installed dependencies (Python, Node.js, Claude CLI, Playwright, Chromium)
- Results stored in S3 with browsable HTML dashboard for cross-environment analysis
- Environments are self-contained web apps with realistic seed data and UI interactions
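The setup requirements in the list above (Python 3.12+, uv, and at least one API key) can be checked before running anything. The sketch below is not part of the project. The function name `check_prerequisites` is hypothetical, and the environment variable names are assumptions based on common SDK conventions; the repository may read different ones.

```python
import os
import shutil
import sys

# Assumed variable names; the project may expect different ones.
ASSUMED_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def check_prerequisites(env=None, python_version=None, uv_found=None):
    """Return a list of problems; an empty list means the basics are in place.

    All arguments default to values read from the running system and exist
    only so the check can be exercised with fake inputs.
    """
    if env is None:
        env = os.environ
    if python_version is None:
        python_version = sys.version_info[:2]
    if uv_found is None:
        uv_found = shutil.which("uv") is not None

    problems = []
    if tuple(python_version) < (3, 12):
        problems.append("Python 3.12 or newer is required")
    if not uv_found:
        problems.append("uv was not found on PATH")
    if not any(env.get(name) for name in ASSUMED_KEY_VARS):
        problems.append("no API key is set (checked: " + ", ".join(ASSUMED_KEY_VARS) + ")")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("missing:", problem)
```

Returning a list of problems rather than raising on the first one lets a setup script report everything that is missing in a single pass.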
Concepts Covered
- Multi-Agent Systems — coordinated coding and browser agents for environment generation
- Browser Automation — agent evaluation through UI interactions
- Reinforcement Learning — environments optimized for RL training
- Web Application Generation — automated creation of realistic web apps from documentation
- Task Verification — programmatic verifiers for generated tasks
- AWS Infrastructure — scalable cloud deployment with EC2 and S3
- Computer Vision Agents — coordinate-based interaction models
- Environment Benchmarking — standardized evaluation frameworks for browser agents
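To make the idea of difficulty-graded tasks with programmatic verifiers concrete, here is a minimal sketch of one plausible shape. It assumes a verifier can read the application's final state as a plain dictionary. The `Task` class, the state schema, the ticket example, and the difficulty labels are all hypothetical and are not taken from the WebArena-Infinity codebase.

```python
from dataclasses import dataclass
from typing import Any, Callable

State = dict[str, Any]  # hypothetical snapshot of the app's data after an episode


@dataclass(frozen=True)
class Task:
    task_id: str
    difficulty: str          # assumed labels such as "easy" or "hard"
    instruction: str         # natural-language goal given to the browser agent
    verify: Callable[[State], bool]  # programmatic check on the final state


def ticket_42_closed(state: State) -> bool:
    """Hypothetical verifier: passes if ticket 42 ends up with status 'closed'."""
    return any(
        t.get("id") == 42 and t.get("status") == "closed"
        for t in state.get("tickets", [])
    )


def success_rate_by_difficulty(tasks, final_states):
    """Run each task's verifier on its final state and aggregate per difficulty."""
    totals, passed = {}, {}
    for task in tasks:
        state = final_states.get(task.task_id, {})  # missing state counts as failure
        totals[task.difficulty] = totals.get(task.difficulty, 0) + 1
        if task.verify(state):
            passed[task.difficulty] = passed.get(task.difficulty, 0) + 1
    return {level: passed.get(level, 0) / count for level, count in totals.items()}
```

Because the verifier checks state rather than the agent's click sequence, any trajectory that reaches the correct end state passes, which is the property that makes such checks usable as a reward signal.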
Images and Figures
- img-0.gif: Browser agents navigating generated environments (trajectory visualization)
- img-1.png: Blog post badge/link
- img-2.png: Environment hub badge/link
- img-3.png: Dataset badge/link
- img-4.png: GitHub repository badge/link