Web Application State Serialization

Summary: Methods for capturing and representing the current state of web applications, enabling automated agents, testing systems, and analysis tools to understand and interact with web UIs. These approaches range from pixel-based screenshots to structured DOM representations, each with distinct trade-offs for token efficiency and semantic richness.

Overview

Web application state serialization involves converting the dynamic state of a web interface into a format that can be processed by automated systems. Traditional approaches rely on visual snapshots (screenshots) with grounded elements marked by bounding boxes or overlays. However, DOM-based serialization offers advantages including better semantic understanding, precise element targeting through CSS selectors, and elimination of image preprocessing overhead.
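
To illustrate what targeting through CSS selectors can look like, the sketch below derives a selector from an element's tag name and attributes. The function name css_selector and its priority order (id, then classes, then name) are assumptions made for this example, and it assumes ids and class names are plain identifiers that need no CSS escaping.

```python
def css_selector(tag, attrs):
    """Build a simple CSS selector from a tag name and an attribute dict.

    Hypothetical helper: assumes id/class values need no CSS escaping.
    """
    element_id = attrs.get("id")
    if element_id:
        # An id is expected to be unique on the page, so it is preferred.
        return f"{tag}#{element_id}"

    classes = attrs.get("class", "").split()
    if classes:
        # Chain every class: class="nav link" becomes .nav.link
        return tag + "".join(f".{name}" for name in classes)

    name = attrs.get("name")
    if name:
        # Fall back to an attribute selector, common for form fields.
        return f'{tag}[name="{name}"]'

    # Nothing distinguishing is available; the bare tag may match many nodes.
    return tag


print(css_selector("button", {"id": "go", "class": "primary"}))
print(css_selector("a", {"class": "nav link"}))
```

A selector like this only identifies an element uniquely when the chosen attribute is itself unique on the page, which is why the id is tried first.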

The core challenge is the size of DOM snapshots, which can reach roughly 1 million tokens, compared with about 1,000 tokens for a GUI screenshot. This gap of three orders of magnitude makes raw DOM serialization impractical for LLM-based interaction systems with strict token limits.
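
The size gap can be made concrete with a rough budget check. The sketch below is not a real tokenizer: the 4-characters-per-token ratio is an assumed rule of thumb, and the names estimate_tokens and fits_budget are hypothetical.

```python
import math


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate; chars_per_token is an assumed heuristic."""
    return math.ceil(len(text) / chars_per_token)


def fits_budget(text, budget_tokens, chars_per_token=4.0):
    """Return True if the serialized state is estimated to fit the budget."""
    return estimate_tokens(text, chars_per_token) <= budget_tokens


# A 4 MB HTML string stands in for a raw DOM snapshot.
raw_dom = "x" * 4_000_000
print(estimate_tokens(raw_dom))        # about one million under this heuristic
print(fits_budget(raw_dom, 10_000))    # far over a 10,000-token budget
```

In practice the model's own tokenizer should replace the heuristic, since HTML markup tends to tokenize differently from prose.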

Key Details

Serialization Methods:

  • GUI Snapshots: Screenshot-based with visual grounding cues, typically ~1,000 tokens
  • DOM Snapshots: Full HTML serialization, up to 1,000,000 tokens
  • Downsampled DOM: Algorithmically reduced while preserving UI features, ~1,000-10,000 tokens

DOM Downsampling Techniques:

  • Container Elements: Hierarchical merging based on depth ratios to preserve structural relationships
  • Content Elements: Translation to concise Markdown representation for readability
  • Interactive Elements: Preserved unchanged to enable direct programmatic targeting
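
The three rules above can be sketched with Python's standard html.parser module. This is a simplified illustration under stated assumptions, not the source's algorithm: the depth-ratio merging is replaced by a plain "keep every Nth container level" rule, the tag sets are illustrative, the input is assumed to be well-formed, and the names DomDownsampler, downsample, and keep_every are hypothetical.

```python
import re
from html.parser import HTMLParser

INTERACTIVE = {"a", "button", "input", "select", "textarea"}
CONTENT = {"h1": "# ", "h2": "## ", "h3": "### ", "p": "", "li": "- "}
CONTAINERS = {"body", "div", "section", "nav", "main", "form", "ul", "ol"}
VOID = {"input"}            # interactive tags that have no closing tag
SKIP = {"script", "style"}  # text inside these is dropped


class DomDownsampler(HTMLParser):
    def __init__(self, keep_every=2):
        super().__init__(convert_charrefs=True)
        self.keep_every = keep_every  # assumed stand-in for depth ratios
        self.lines, self.buf, self.prefix = [], [], ""
        self.kept = []       # one bool per open container: was it emitted?
        self.indent = 0
        self.skipping = False

    def _flush(self):
        # Emit buffered text as one indented line, with any Markdown prefix.
        text = "".join(self.buf).strip()
        if text:
            self.lines.append("  " * self.indent + self.prefix + text)
        self.buf, self.prefix = [], ""

    def handle_starttag(self, tag, attrs):
        if tag in SKIP:
            self.skipping = True
        elif tag in INTERACTIVE:
            # Keep the start tag exactly as it appeared in the source.
            self.buf.append(self.get_starttag_text())
        elif tag in CONTENT:
            self._flush()
            self.prefix = CONTENT[tag]
        elif tag in CONTAINERS:
            self._flush()
            # Merge levels: only every keep_every-th container is emitted.
            keep = len(self.kept) % self.keep_every == 0
            self.kept.append(keep)
            if keep:
                self.lines.append("  " * self.indent + f"<{tag}>")
                self.indent += 1

    def handle_endtag(self, tag):
        if tag in SKIP:
            self.skipping = False
        elif tag in INTERACTIVE and tag not in VOID:
            self.buf.append(f"</{tag}>")
        elif tag in CONTENT:
            self._flush()
        elif tag in CONTAINERS and self.kept:
            self._flush()
            if self.kept.pop():
                self.indent -= 1
                self.lines.append("  " * self.indent + f"</{tag}>")

    def handle_data(self, data):
        if not self.skipping:
            # Collapse whitespace runs in text only; tags stay untouched.
            self.buf.append(re.sub(r"\s+", " ", data))


def downsample(html, keep_every=2):
    parser = DomDownsampler(keep_every)
    parser.feed(html)
    parser.close()
    parser._flush()
    return "\n".join(parser.lines)


sample = """
<body>
  <div class="wrap">
    <div class="inner">
      <h1>Sign in</h1>
      <p>Enter your   details.</p>
      <input id="user" type="text">
      <button id="go" class="primary">Log in</button>
    </div>
  </div>
</body>
"""
print(downsample(sample))
```

With keep_every=2, the wrapper div is merged away while the body and the inner div remain, the heading and paragraph become Markdown lines, and the input and button keep their original markup so their ids can still be used for targeting. Attributes on kept containers are dropped in this sketch, a choice made for brevity.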

Performance Characteristics:

  • Downsampled DOM achieves 67-73% success rates, versus 65% for the grounded GUI baseline
  • Hierarchy emerges as the most valuable UI feature for LLM interpretation
  • Image input provides minimal value; text-only approaches perform nearly as well

Technical Considerations:

Relationships

Sources