Feature Selection

Summary: The process of selecting relevant input features for machine learning models to improve performance, reduce overfitting, and enhance interpretability. In reinforcement learning contexts, this extends to selecting decision-relevant concepts that preserve optimal decision-making while providing human-understandable representations.

Overview

Feature selection is a fundamental preprocessing step in machine learning that involves choosing a subset of relevant features from the original feature set. The goal is to remove irrelevant, redundant, or noisy features that may degrade model performance or interpretability. Traditional approaches include filter methods (statistical measures), wrapper methods (model-based evaluation), and embedded methods (built into the learning algorithm).
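
As a rough illustration of the filter family, the sketch below scores each feature by the absolute Pearson correlation with the target and keeps the top k. The feature names, the data, and the helper names pearson and filter_select are all hypothetical, and correlation is only one of many statistics a filter method might use.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def filter_select(columns, target, k):
    """Rank features by |correlation with target| and keep the top k names."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: one clean feature, one noisy feature, one constant.
columns = {
    "signal": [1.0, 2.0, 3.0, 4.0, 5.0],
    "noisy_signal": [1.0, 3.0, 2.0, 5.0, 4.0],
    "constant": [7.0, 7.0, 7.0, 7.0, 7.0],
}
target = [2.0, 4.0, 6.0, 8.0, 10.0]

selected, scores = filter_select(columns, target, k=2)
print(selected)
```

A filter like this never trains the downstream model, which makes it cheap but blind to feature interactions. Wrapper and embedded methods trade more computation for that awareness.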

In the context of Interpretable Reinforcement Learning, feature selection takes the form of concept selection: choosing the human-understandable concepts that are most relevant for decision-making. The Decision-Relevant Selection (DRS) algorithm automates this choice by selecting concepts that minimize Abstraction Error while preserving decision-making performance.

The key insight is that features or concepts are relevant if removing them would cause the model to confuse inputs that require different outputs or actions. This principle connects feature selection to State Abstraction theory, where states requiring the same optimal action can be grouped together without loss of performance.
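
The relevance test described above can be sketched on a toy problem. The example assumes a tabular setting where every state is a dictionary of binary concept values and the optimal action for each state is already known. The concept names and the rule "push toward the side the pole leans" are made up for illustration and are not taken from the source.

```python
from collections import defaultdict
from itertools import product


def conflicting_groups(states, actions, kept):
    """Groups of states that look identical under `kept` concepts
    but require different optimal actions."""
    groups = defaultdict(set)
    for state, action in zip(states, actions):
        key = tuple(state[c] for c in kept)
        groups[key].add(action)
    return {key: acts for key, acts in groups.items() if len(acts) > 1}


def is_relevant(states, actions, concepts, concept):
    """A concept is relevant if dropping it merges states with different actions."""
    remaining = [c for c in concepts if c != concept]
    return bool(conflicting_groups(states, actions, remaining))


# Hypothetical concepts; only the first one determines the optimal action.
concepts = ["pole_leans_right", "cart_moves_right", "background_bright"]
states = [dict(zip(concepts, values)) for values in product([0, 1], repeat=3)]
actions = ["right" if s["pole_leans_right"] else "left" for s in states]

relevance = {c: is_relevant(states, actions, concepts, c) for c in concepts}
print(relevance)
```

Dropping pole_leans_right collapses states that need opposite actions, so it is flagged as relevant. Dropping either of the other two concepts merges only states that share an optimal action, which is the state-abstraction condition the paragraph above refers to.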

Key Details

Computational Complexity: The concept selection problem is proven NP-hard, so exact selection is intractable in general and approximation algorithms are needed in practice
Performance Guarantees: DRS provides theoretical bounds showing that concept-based policies using decision-relevant concepts achieve near-optimal performance
Empirical Validation: DRS improves test-time intervention effectiveness by 40-87% across diverse environments including CartPole, MiniGrid, Pong, Boxing, and real-world glucose management
Automation Benefits: DRS can automatically recover manually curated concept sets while matching or exceeding their performance, reducing the need for costly expert curation
Measurement Metric: Uses Q-Distance to measure how concept removal affects the ability to distinguish between states requiring different optimal actions
Practical Implementation: Utilizes Mixed Integer Linear Programming for tractable approximation of the optimal concept selection
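
To make the selection problem concrete, here is a minimal brute-force sketch. It is not the DRS algorithm: the error measure abstraction_loss is a stand-in for Q-Distance (whose exact definition is not given here), and exhaustive search replaces the Mixed Integer Linear Programming formulation. States are assumed to be tuples of binary concept values with known per-action Q-values, and all names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations, product


def abstraction_loss(states, q_values, kept):
    """Value lost when states that agree on `kept` concepts must share one action.
    A stand-in error measure, not the Q-Distance from the source."""
    groups = defaultdict(list)
    for state, q in zip(states, q_values):
        groups[tuple(state[i] for i in kept)].append(q)
    loss = 0.0
    for qs in groups.values():
        n_actions = len(qs[0])
        # Best single action for the whole group, by summed Q-value.
        best_shared = max(sum(q[a] for q in qs) for a in range(n_actions))
        # Value achievable if every state could pick its own best action.
        best_individual = sum(max(q) for q in qs)
        loss += best_individual - best_shared
    return loss


def select_concepts(states, q_values, n_concepts, k):
    """Exhaustively search all size-k concept subsets for the lowest loss."""
    return min(
        combinations(range(n_concepts), k),
        key=lambda kept: abstraction_loss(states, q_values, kept),
    )


# Toy problem: three binary concepts, two actions; only concept 0 matters.
states = list(product([0, 1], repeat=3))
q_values = [[1.0, 0.0] if s[0] == 0 else [0.0, 1.0] for s in states]

best = select_concepts(states, q_values, n_concepts=3, k=1)
print(best, abstraction_loss(states, q_values, best))
```

The search visits every subset of size k, so its cost grows combinatorially with the number of candidate concepts. That growth is the practical face of the NP-hardness result and the reason an approximate formulation is attractive.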

Relationships

Concept-Based Models — use selected concepts as interpretable intermediate representations
State Abstraction — provides theoretical foundation for grouping similar decision states
Interpretable Machine Learning — feature selection enhances model explainability and human understanding
Reinforcement Learning — extends traditional RL with interpretable decision pathways
Human-AI Interaction — selected features enable effective Test-Time Intervention
Concept Bottleneck Models — architectural framework that benefits from principled feature selection
Markov Decision Processes — mathematical foundation for understanding decision-relevant features
Policy Optimization — selected features must preserve optimal decision-making capabilities

Sources

sources/selecting-decision-relevant-concepts-in-reinforcement-learning — introduced DRS algorithm, theoretical foundations, and empirical validation across multiple domains