source: "raw/articles/autowebworld-synthesizing-infinite-verifiable-web-environments-via-finite-state-.md"

Summary: AutoWebWorld: Synthesizing Infinite Verifiable Web Environments via Finite State Machines

TL;DR: A framework that models synthetic web environments as Finite State Machines (FSMs) so that GUI training trajectories can be generated and verified programmatically at scale. Trajectories cost about $0.04 each, and a model trained on them reaches a 27.42% success rate on WebVoyager.

Key Points

  • Proposes AutoWebWorld, a framework that models web environments as FSMs, which makes verification intrinsic to the environment and removes the need for external judges
  • Synthesized 11,663 verified GUI trajectories across 29 websites at $0.04 per trajectory (vs. $0.15-$1.00 for existing methods)
  • Achieves 27.42% success rate on WebVoyager benchmark, outperforming baselines trained on datasets orders of magnitude larger
  • Uses a four-step process: FSM generation, web environment synthesis, BFS trajectory collection, and execution-based filtering
  • Training data contains only ~16K steps, yet shows a clear scaling trend: performance improves consistently as synthetic data volume increases
  • Average trajectory length of 21.9 steps exceeds that of existing datasets (6.9-12.1 steps)
  • Multi-agent FSM generation uses GPT-5.1 with validator-driven loops for quality assurance
  • Coding agents (Gemini3-Pro) translate FSMs into executable Vue.js websites
  • BFS exploration over FSM state graphs ensures shortest paths and systematic coverage
  • Released 29 diverse web environments spanning the commerce, productivity, media, health, and communication domains
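
The BFS trajectory collection and execution-based filtering steps listed above can be illustrated with a minimal sketch. It assumes an FSM stored as a plain dictionary mapping each state to its available actions. The toy states, the action names, and the helpers bfs_shortest_trajectories and replay are hypothetical and not taken from the paper; in particular, replaying against the FSM itself stands in for executing each trajectory in the synthesized website.

```python
from collections import deque

# Hypothetical toy FSM (not from the paper): state -> {action: next_state}.
TOY_FSM = {
    "home": {"click_login": "login_form", "click_search": "search_page"},
    "login_form": {"submit_credentials": "dashboard", "click_back": "home"},
    "search_page": {"click_back": "home"},
    "dashboard": {"click_new_repo": "repo_form"},
    "repo_form": {"submit_repo": "repo_created"},
    "repo_created": {},
}


def bfs_shortest_trajectories(fsm, start):
    """Map every reachable state to one shortest action sequence from start."""
    paths = {start: []}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for action, next_state in fsm[state].items():
            # BFS visits states in order of distance, so the first path
            # found to a state is a shortest one.
            if next_state not in paths:
                paths[next_state] = paths[state] + [action]
                queue.append(next_state)
    return paths


def replay(fsm, start, actions):
    """Execute actions step by step; return the final state, or None if a step is invalid."""
    state = start
    for action in actions:
        if action not in fsm[state]:
            return None
        state = fsm[state][action]
    return state


trajectories = bfs_shortest_trajectories(TOY_FSM, "home")

# Filtering stand-in: keep a trajectory only if replaying it reaches its goal state.
verified = {
    goal: actions
    for goal, actions in trajectories.items()
    if replay(TOY_FSM, "home", actions) == goal
}
```

Because the goal state of each trajectory is known from the FSM, the check in the filtering step is a simple equality test rather than a judgment by an external model, which is the sense in which verification is intrinsic.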

Concepts Covered

Images and Figures

  • Figure 1: Comparison flowchart showing traditional vs. AutoWebWorld trajectory collection approaches
  • Figure 2: Four-step AutoWebWorld generation process diagram
  • Figure 3: Example verified GUI trajectory for GitHub repository creation
  • Figure 4: Scaling curves showing performance improvements on WebVoyager and Online-Mind2Web
  • Figure 5: Ablation study showing importance of grounding data in GRPO training
  • Figures 6-7: Case study trajectory examples from synthesized websites

Related Concepts