Abstraction Error
Summary: A metric measuring the maximum difference in optimal Q-values between states that share identical concept representations in reinforcement learning. It quantifies how well concept-based state abstractions preserve the decision structure of the underlying Markov Decision Process, serving as the primary optimization criterion for automated concept selection algorithms that maintain performance guarantees.
Overview
Abstraction error is a fundamental metric in Concept-Based Models and State Abstraction theory that measures how well decision structure is preserved when states are grouped by their concept representations. It is formally defined as:
ε(g_c) = max_{s,s': g_c(s)=g_c(s')} max_a |Q*(s,a) - Q*(s',a)|
Where:
- g_c is the concept predictor function mapping states to concept representations
- s and s' are states with identical concept representations under g_c
- Q*(s,a) is the optimal Q-value for taking action a in state s
The abstraction error captures the worst-case difference in optimal Q-values between any two states grouped together by the concept predictor. This directly implements the Decision-Relevant Concepts principle: states requiring different optimal actions should keep distinct concept representations so that decision structure is preserved. An abstraction error of zero means that all states sharing a concept representation have identical optimal Q-values for every action, so no two states that require different optimal actions are grouped together.
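The definition above can be sketched in a few lines of Python for a small tabular problem. This is an illustrative sketch, not the source's implementation: the function name `abstraction_error`, the `q_star` array layout, and the representation of concepts as hashable tuples are all assumptions made for the example. It relies on the fact that, for a fixed action, the largest pairwise gap within a group equals the group's maximum minus its minimum.

```python
from collections import defaultdict

import numpy as np


def abstraction_error(q_star, concepts):
    """Worst-case gap in optimal Q-values among states sharing a concept vector.

    q_star:   array-like of shape (n_states, n_actions), assumed to hold Q*.
    concepts: one hashable concept representation per state (hypothetical g_c output).
    """
    q_star = np.asarray(q_star, dtype=float)

    # Group state indices by their concept representation.
    groups = defaultdict(list)
    for state, concept in enumerate(concepts):
        groups[concept].append(state)

    worst = 0.0
    for states in groups.values():
        block = q_star[states]  # shape: (group_size, n_actions)
        # Per action, the largest pairwise gap is max - min within the group.
        gap = (block.max(axis=0) - block.min(axis=0)).max()
        worst = max(worst, float(gap))
    return worst


# Toy example: states 0 and 1 share a concept vector, state 2 is separate.
q = [[1.0, 0.0], [0.9, 0.2], [0.0, 1.0]]
error = abstraction_error(q, [(0,), (0,), (1,)])
```

In this toy example `error` is 0.2, driven by the second action in the group containing states 0 and 1. Giving every state its own concept vector yields an error of zero, while collapsing all three states into one group raises it to 1.0.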

Key Details
- Performance Guarantees: Provides a theoretical bound on policy performance loss, V^{π*}(s) - V^{π_c*}(s) ≤ 2ε/(1-γ)², where π* is the optimal policy, π_c* is the optimal concept-based policy, ε is the abstraction error, and γ is the discount factor
- Optimization Target: The Decision-Relevant Selection (DRS) algorithm directly minimizes abstraction error through Mixed-Integer Linear Programming formulations, selecting concept sets automatically without manual domain expertise
- Computational Complexity: While concept selection to minimize abstraction error is NP-hard in general, environmental constraints limit the effective state space, making practical solutions tractable
- Measurement Requirements: Computing exact abstraction error requires knowledge of optimal Q-values, making it primarily useful for algorithm design rather than runtime evaluation
- Empirical Validation: Lower abstraction error consistently correlates with better policy performance across diverse environments including CartPole, MiniGrid, Pong, Boxing, and healthcare glucose management
- Predictor Dependencies: Final performance depends on both minimizing abstraction error during concept selection and achieving high accuracy in concept prediction during deployment
- Extension to Imperfect Predictors: The DRS-log algorithm extends abstraction error minimization to scenarios with probabilistic concept predictors using separation constraints
- Automatic Recovery: By minimizing abstraction error, the DRS algorithm automatically recovers manually curated concept sets while matching or exceeding their performance
- Intervention Effectiveness: Decision-relevant concepts selected by minimizing abstraction error improve Test-Time Intervention effectiveness by 40-87% across environments compared to baseline selection methods
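To make the selection objective and the performance bound concrete, here is a minimal Python sketch under stated assumptions. It does not reproduce the DRS Mixed-Integer Linear Programming formulation; it swaps in plain exhaustive search over concept subsets, which is only viable for tiny problems. The names `grouped_error`, `select_concepts`, and `value_loss_bound`, and the toy Q-values and binary concept matrix, are hypothetical and chosen for illustration.

```python
from itertools import combinations

import numpy as np


def grouped_error(q_star, concept_matrix, subset):
    """Abstraction error when states are grouped by the concepts in `subset`."""
    q_star = np.asarray(q_star, dtype=float)
    groups = {}
    for state, row in enumerate(concept_matrix):
        key = tuple(row[j] for j in subset)
        groups.setdefault(key, []).append(state)
    # Per group and action, the largest pairwise gap is max - min.
    return max(
        float((q_star[states].max(axis=0) - q_star[states].min(axis=0)).max())
        for states in groups.values()
    )


def select_concepts(q_star, concept_matrix, k):
    """Exhaustive search (not the MILP): best size-k subset by abstraction error."""
    n_concepts = len(concept_matrix[0])
    best = min(
        combinations(range(n_concepts), k),
        key=lambda subset: grouped_error(q_star, concept_matrix, subset),
    )
    return best, grouped_error(q_star, concept_matrix, best)


def value_loss_bound(eps, gamma):
    """Right-hand side of the bound: 2 * eps / (1 - gamma)^2."""
    return 2.0 * eps / (1.0 - gamma) ** 2


# Toy data: 4 states, 2 actions, 3 candidate binary concepts.
q_star = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
concept_matrix = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]]

best, eps = select_concepts(q_star, concept_matrix, k=1)
bound = value_loss_bound(eps, gamma=0.9)
```

With this toy data, concept 0 alone separates the states that prefer different actions, so it is selected with an abstraction error of 0.1, and the bound evaluates to roughly 20 at γ = 0.9. The size of that number relative to the Q-values shows how loose the worst-case bound can be as γ approaches 1.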

Relationships
- Decision-Relevant Concepts — Abstraction error operationalizes the core principle that concepts should distinguish states requiring different optimal actions
- State Abstraction — Provides theoretical foundation for abstraction error as a quality metric for approximate state representations in reinforcement learning
- Q-Distance — Related metric measuring Q-value differences between states, with abstraction error focusing specifically on concept-grouped states
- Mixed-Integer Linear Programming — Enables tractable optimization formulations for minimizing abstraction error with combinatorial concept selection constraints
- Concept-Based Models — Abstraction error serves as the primary evaluation metric for concept quality in interpretable reinforcement learning architectures
- Test-Time Intervention — Lower abstraction error amplifies the effectiveness of human corrections to concept predictions by ensuring interventions target decision-relevant distinctions
- Markov Decision Processes — Abstraction error measures how well concept representations preserve the underlying MDP structure for optimal decision making
- Feature Selection — Concept selection algorithms use abstraction error rather than traditional feature importance metrics, focusing on decision preservation rather than prediction accuracy
- Interpretable Reinforcement Learning — Abstraction error bridges the gap between interpretability goals and performance guarantees in explainable RL systems
- Approximate State Abstraction — Abstraction error quantifies the quality of approximate abstractions by measuring decision structure preservation
- Policy Optimization — Provides performance bounds connecting abstraction error to final policy quality and value function preservation
Sources
- sources/selecting-decision-relevant-concepts-in-reinforcement-learning — Primary source defining abstraction error and demonstrating its central role in the Decision-Relevant Selection algorithm for automatic concept selection with theoretical performance bounds