Side Effect Detection

Summary: The process of identifying unintended consequences that emerge from agent actions during task execution. This involves distinguishing between primary task outcomes and secondary effects that may compromise system reliability or user safety.

Overview

Side Effect Detection is the part of Computer Use Agents evaluation that identifies unintended consequences of agent actions beyond the primary task objective. An agent can complete its assigned goal and still cause harmful or unexpected side effects that compromise system integrity, user privacy, or operational safety.

This detection process becomes particularly important in Trajectory Verification systems, where evaluators must assess not only whether an agent achieved its primary objective but also whether it caused collateral damage in the process. The Microsoft Research Universal Verifier system exemplifies this approach by incorporating side effect detection into its structured evaluation framework.
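One way to picture this two-part assessment is a verdict record that keeps goal achievement and side effects as separate fields, so a completed task cannot hide collateral damage. The sketch below is illustrative only: `TrajectoryVerdict`, its fields, and the example file path are hypothetical names chosen here, not the Universal Verifier's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrajectoryVerdict:
    """Hypothetical verdict record (assumed structure, for illustration)."""

    goal_achieved: bool
    # Free-text descriptions of unintended changes observed in the trajectory.
    side_effects: List[str] = field(default_factory=list)

    @property
    def clean_success(self) -> bool:
        # Clean success requires the goal to be met AND no recorded side effects.
        return self.goal_achieved and not self.side_effects


# The agent finished the task but also removed a file it was never asked to touch.
verdict = TrajectoryVerdict(
    goal_achieved=True,
    side_effects=["deleted unrelated file notes.txt"],
)
print(verdict.goal_achieved, verdict.clean_success)
```

Keeping the two judgments in separate fields means an evaluator can report "goal met, but with collateral damage" instead of collapsing both into a single pass/fail bit.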

Key Details

  • Separation from Primary Goals: Side effects are distinct from main task failures and require separate evaluation criteria in Rubric Design
  • False Positive Mitigation: Detection systems that properly distinguish intended actions from unintended consequences are reported to reduce false positive rates from above 45% to 1-8%
  • Process vs Outcome Integration: Process vs Outcome Rewards frameworks must account for side effects in both execution quality and final goal achievement
  • Hallucination Connection: Side effects are often surfaced by Hallucination Detection, because agents may misrepresent or fabricate their actual impact on a system
  • Screenshot Evidence: Screenshot Context Management enables verification of side effects by providing visual evidence of unintended system state changes
  • Controllable vs Uncontrollable: Detection systems must distinguish between side effects caused by agent actions and changes caused by environmental factors beyond the agent's control
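
Several of the points above (separating intended from unintended changes, and controllable from uncontrollable ones) can be sketched as a diff between system state snapshots taken before and after a trajectory. This is a minimal sketch under stated assumptions, not a description of any particular verifier: the snapshot format (a flat dict of resource name to value), the `SideEffect` record, and the `detect_side_effects` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class SideEffect:
    key: str
    before: Optional[str]  # None means the resource did not exist before
    after: Optional[str]   # None means the resource no longer exists
    controllable: bool     # True if the agent acted on this resource


def detect_side_effects(
    before: Dict[str, str],
    after: Dict[str, str],
    expected_keys: Set[str],
    agent_touched_keys: Set[str],
) -> List[SideEffect]:
    """Return state changes that the task did not call for (assumed logic)."""
    findings = []
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old == new:
            continue  # unchanged resource
        if key in expected_keys:
            continue  # intended change: part of the primary task outcome
        # Unexpected change: attribute it to the agent only if the agent
        # acted on this resource; otherwise treat it as environmental.
        findings.append(SideEffect(key, old, new, key in agent_touched_keys))
    return findings


before = {"report.docx": "v1", "notes.txt": "draft", "clock": "10:00"}
after = {"report.docx": "v2", "clock": "10:05"}

findings = detect_side_effects(
    before,
    after,
    expected_keys={"report.docx"},                    # the task was to edit the report
    agent_touched_keys={"report.docx", "notes.txt"},  # taken from the action log
)
for f in findings:
    print(f.key, f.before, f.after, "agent" if f.controllable else "environment")
```

In this example the edit to `report.docx` is skipped as the intended outcome, the deletion of `notes.txt` is flagged as an agent-caused side effect, and the clock change is flagged but attributed to the environment. Filtering out expected changes before flagging anything is the step that keeps intended actions from being counted as false positives.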

Relationships

Sources