Concept Selection

Summary: Automated process of choosing an optimal subset of interpretable concepts from a candidate bank to maximize decision-making performance while maintaining interpretability. Originally developed for reinforcement learning through the Decision-Relevant Selection algorithm, which minimizes Abstraction Error to preserve decision structure.

Overview

Concept Selection addresses a fundamental challenge in Interpretable Machine Learning: how to automatically identify the most relevant concepts from a large pool of candidates without requiring manual curation or domain expertise. The problem is NP-hard, but solving it is critical for deploying interpretable models at scale.

The core insight is that concepts should be selected based on their decision-relevance: their ability to distinguish between states or inputs that require different optimal actions or decisions. This contrasts with traditional Feature Selection approaches, which optimize for predictive accuracy rather than decision quality.

In the reinforcement learning context, Concept Selection works by evaluating how well different concept subsets preserve the underlying decision structure of the original state space. The Decision-Relevant Selection (DRS) algorithm accomplishes this by:

  1. Computing Q-Distance between states to measure decision similarity
  2. Selecting concepts that minimize Abstraction Error when states are grouped by concept values
  3. Using Mixed Integer Linear Programming for tractable approximation of the NP-hard optimization
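
The three steps above can be illustrated on a toy problem. The sketch below is not the DRS implementation, and this entry does not give the exact definitions, so it makes three labeled assumptions: Q-Distance is taken to be the largest absolute gap in action values between two states, Abstraction Error is taken to be the worst Q-Distance between any two states that share the same values on the selected concepts, and the Mixed Integer Linear Program is replaced by an exhaustive search over subsets, which is only feasible for tiny concept banks. All function names and data are hypothetical.

```python
from itertools import combinations


def q_distance(q_i, q_j):
    """Assumed Q-Distance: largest absolute gap in action values."""
    return max(abs(a - b) for a, b in zip(q_i, q_j))


def abstraction_error(q, concepts, subset):
    """Assumed Abstraction Error: worst Q-Distance between two states
    that the selected concepts cannot tell apart."""
    worst = 0.0
    for i, j in combinations(range(len(q)), 2):
        same_group = all(concepts[i][c] == concepts[j][c] for c in subset)
        if same_group:
            worst = max(worst, q_distance(q[i], q[j]))
    return worst


def select_concepts(q, concepts, k):
    """Exhaustive search over all size-k subsets (stand-in for the MILP)."""
    n_concepts = len(concepts[0])
    return min(
        combinations(range(n_concepts), k),
        key=lambda subset: abstraction_error(q, concepts, subset),
    )


# Toy data (hypothetical): 4 states, 2 actions, 3 binary candidate concepts.
# q[s][a] is the value of action a in state s.
q = [
    [1.0, 0.0],
    [1.0, 0.1],
    [0.0, 1.0],
    [0.1, 1.0],
]
# concepts[s][c] is the value of candidate concept c in state s.
concepts = [
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]

best = select_concepts(q, concepts, k=1)
print(best, abstraction_error(q, concepts, best))
```

In this toy setup only concept 0 separates the states that prefer action 0 from those that prefer action 1, so grouping by it merges only states with nearly identical action values. The other two concepts each merge states with opposite preferences and incur a much larger error under the assumed definition.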

Key Details

  • Performance Guarantees: Provides theoretical bounds showing that policies using decision-relevant concepts achieve near-optimal performance compared to policies with full state information
  • Empirical Results: Demonstrates 40-87% improvement in Test-Time Intervention effectiveness across CartPole, MiniGrid, Pong, Boxing, and glucose management tasks
  • Concept Recovery: Can automatically recover manually curated concept sets while matching or exceeding their performance on the CUB dataset
  • Computational Complexity: The general concept selection problem is NP-hard, but DRS provides polynomial-time approximation algorithms
  • Scalability: Works with concept banks containing hundreds of candidate concepts, automatically selecting 5-15 most decision-relevant ones
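
To give a feel for how a small subset might be picked from a larger bank in polynomial time, the sketch below uses greedy forward selection. This is a generic heuristic swapped in for illustration, not the DRS approximation algorithm, and it carries no optimality guarantee in general. It assumes that Abstraction Error is the worst gap in action values between two states sharing the same selected concept values; that definition, the function names, and the data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def abstraction_error(q, concepts, subset):
    """Assumed Abstraction Error: worst action-value gap between two
    states that share the same values on the selected concepts."""
    # Group states by their tuple of selected concept values.
    groups = defaultdict(list)
    for state, values in enumerate(concepts):
        groups[tuple(values[c] for c in subset)].append(state)
    worst = 0.0
    for members in groups.values():
        for i, j in combinations(members, 2):
            gap = max(abs(a - b) for a, b in zip(q[i], q[j]))
            worst = max(worst, gap)
    return worst


def greedy_select(q, concepts, k):
    """Add one concept at a time, always taking the candidate that
    lowers the assumed error the most (ties go to the lowest index)."""
    chosen = []
    remaining = set(range(len(concepts[0])))
    for _ in range(k):
        best = min(
            sorted(remaining),
            key=lambda c: abstraction_error(q, concepts, chosen + [c]),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy data (hypothetical): 4 states, 2 actions, 3 candidate concepts.
q = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
    [1.0, 0.0],
]
concepts = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]

selected = greedy_select(q, concepts, k=2)
print(selected, abstraction_error(q, concepts, selected))
```

Each greedy round evaluates every remaining candidate once, so the number of error evaluations grows as the number of candidates times the subset size, rather than with the number of possible subsets. In the toy data the constant third concept is never chosen because it cannot separate any states.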

The approach changes how interpretable models are built, moving from manual concept engineering to automated, principled selection with performance guarantees.

Relationships

Sources