Mixed Integer Linear Programming

Summary: Mathematical optimization technique that extends linear programming by requiring some variables to take integer values, which allows many combinatorial optimization problems to be solved exactly. It provides a principled framework for discrete decision-making with optimality guarantees and is well suited to Decision-Relevant Concepts selection in reinforcement learning, where binary choices must be optimized alongside continuous objectives.

Overview

Mixed Integer Linear Programming (MILP) combines the continuous optimization capabilities of linear programming with discrete decision-making through integer constraints. This hybrid approach suits problems that require binary or discrete choices while optimizing a continuous objective, a common structure in machine learning and AI applications.

The technique expresses combinatorial problems as linear programs with integrality constraints, a form that solvers can handle with optimality guarantees. Unlike heuristic approaches, a MILP solver run to completion returns a provably optimal solution whenever the problem is feasible and bounded; if stopped early, it reports the best solution found together with a bound on its gap to the optimum. Integer constraints keep variables that represent discrete decisions (such as concept selection) binary, while linear constraints encode domain-specific requirements.
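As a minimal sketch of this structure, the example below solves a toy MILP with one continuous variable and one binary variable. The problem (a production quantity `x` that is only allowed when a binary "open" decision `b` is set) and all numbers are invented for illustration. It assumes SciPy 1.9 or later, which provides `scipy.optimize.milp`.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Toy problem (hypothetical numbers):
#   maximize 5*x - 8*b
#   subject to x <= 4*b   (production only if the facility is open)
#              0 <= x <= 3, b in {0, 1}
# milp minimizes, so the objective is negated.
c = np.array([-5.0, 8.0])            # variable order: [x, b]
integrality = np.array([0, 1])       # 0 = continuous, 1 = integer
bounds = Bounds(lb=[0.0, 0.0], ub=[3.0, 1.0])

# Linking constraint x - 4*b <= 0 ties the continuous and binary variables.
linking = LinearConstraint(np.array([[1.0, -4.0]]), lb=-np.inf, ub=0.0)

result = milp(c, integrality=integrality, bounds=bounds, constraints=linking)
x_opt, b_opt = result.x
profit = -result.fun                 # undo the sign flip
print(f"x={x_opt:.1f}, b={b_opt:.0f}, profit={profit:.1f}")
```

The integrality flag on `b` is what separates this from a plain linear program: relaxing it would let the solver pick a fractional `b`, which has no meaning for an open-or-closed decision.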

In concept selection for interpretable AI, MILP enables principled optimization of the trade-off between model complexity and performance. The Decision-Relevant Selection (DRS) algorithm exemplifies this application: it uses MILP to minimize Abstraction Error while ensuring that the selected concepts distinguish between states requiring different optimal actions. This formulation bounds how far the value of the resulting concept-based policy can fall below that of the optimal policy.
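To make the quantity being minimized concrete, the sketch below computes the abstraction error of a chosen concept subset on a toy problem. The Q-table, the binary concept vectors, and the helper name `abstraction_error` are hypothetical and not taken from the DRS paper. The sketch assumes abstraction error is the largest gap in optimal Q-values between any two states that the selected concepts fail to tell apart.

```python
from itertools import combinations

# Hypothetical optimal Q-values: state -> [Q*(s, a0), Q*(s, a1)]
Q_STAR = {
    "s0": [1.0, 0.0],
    "s1": [0.9, 0.1],
    "s2": [0.0, 1.0],
    "s3": [0.2, 0.8],
}
# Hypothetical binary values of 3 candidate concepts in each state
CONCEPTS = {
    "s0": (0, 0, 1),
    "s1": (0, 1, 1),
    "s2": (1, 0, 1),
    "s3": (1, 1, 1),
}

def abstraction_error(selected, q_star=Q_STAR, concepts=CONCEPTS):
    """Largest |Q*(s,a) - Q*(s',a)| over state pairs that agree on every selected concept."""
    worst = 0.0
    for s, t in combinations(q_star, 2):
        # Two states share an abstract state if no selected concept separates them.
        merged = all(concepts[s][k] == concepts[t][k] for k in selected)
        if merged:
            gap = max(abs(qs - qt) for qs, qt in zip(q_star[s], q_star[t]))
            worst = max(worst, gap)
    return worst

err_c0 = abstraction_error([0])   # merges s0/s1 and s2/s3, which prefer the same action
err_c1 = abstraction_error([1])   # merges s0/s2, which prefer opposite actions
err_c2 = abstraction_error([2])   # concept 2 is constant, so every state is merged
print(err_c0, err_c1, err_c2)
```

In this toy data, concept 0 alone separates the states that need different actions, so its abstraction error is small, while concepts 1 and 2 each merge states with opposite preferences.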

Key Details

Problem Structure: Combines a linear objective and linear constraints with integrality requirements on a subset of the decision variables
DRS Formulation: O(n_d² + K) variables, where n_d is the number of distinct abstract states and K is the number of candidate concepts
Objective Function: Minimizes abstraction error ε(g_c) = max_{s,s': g_c(s)=g_c(s')} max_a |Q*(s,a) - Q*(s',a)|, the largest gap in optimal Q-values between any two states that the concept mapping g_c sends to the same abstract state
Performance Guarantee: Theoretical bound V^π*(s) - V^π_c*(s) ≤ 2ε(g_c)/(1-γ)² relating abstraction error to value function loss
Computational Complexity: NP-hard in general, but reported as tractable in practice because environmental constraints limit the effective state space
DRS-log Extension: Probabilistic variant incorporating uncertainty in concept predictors through modified MILP constraints
Empirical Results: 159% performance improvement over baselines in CartPole, successful scaling to 112 candidate concepts across multiple domains
Solver Integration: Compatible with standard commercial and open-source MILP solvers (Gurobi, CPLEX, CBC)
Separation Constraints: Ensure that states with identical concept representations share the same optimal action, preserving decision structure
Automatic Recovery: Successfully recovers manually curated concept sets while matching or exceeding performance
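The objective, separation idea, and performance bound listed above can be sketched as a small MILP. This is an assumed formulation for illustration only, not the DRS formulation from the source: it uses one binary variable `z_k` per candidate concept, one binary variable `m_p` per state pair (1 if the pair is left merged), and a continuous `eps` that upper-bounds the Q-gap of every merged pair. The toy Q-table, the concept vectors, the concept budget, and the function names are all hypothetical. It assumes SciPy 1.9 or later.

```python
from itertools import combinations
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Hypothetical toy data: optimal Q-values and binary concept values per state.
Q_STAR = {"s0": [1.0, 0.0], "s1": [0.9, 0.1], "s2": [0.0, 1.0], "s3": [0.2, 0.8]}
CONCEPTS = {"s0": (0, 0, 1), "s1": (0, 1, 1), "s2": (1, 0, 1), "s3": (1, 1, 1)}

def select_concepts(q_star, concepts, budget):
    """Pick at most `budget` concepts minimizing the worst Q-gap among merged state pairs."""
    pairs = list(combinations(q_star, 2))
    K = len(next(iter(concepts.values())))
    n = K + len(pairs) + 1                      # layout: [z_0..z_{K-1}, m_0..m_{P-1}, eps]
    rows, lbs, ubs = [], [], []
    for p, (s, t) in enumerate(pairs):
        gap = max(abs(a - b) for a, b in zip(q_star[s], q_star[t]))
        # Pair is merged unless a selected concept separates it:
        #   m_p + sum_{k separates s,t} z_k >= 1
        row = np.zeros(n)
        row[K + p] = 1.0
        for k in range(K):
            if concepts[s][k] != concepts[t][k]:
                row[k] = 1.0
        rows.append(row); lbs.append(1.0); ubs.append(np.inf)
        # eps must cover the gap of every merged pair: eps - gap * m_p >= 0
        row = np.zeros(n)
        row[-1] = 1.0
        row[K + p] = -gap
        rows.append(row); lbs.append(0.0); ubs.append(np.inf)
    # Concept budget: sum_k z_k <= budget
    row = np.zeros(n)
    row[:K] = 1.0
    rows.append(row); lbs.append(0.0); ubs.append(float(budget))

    c = np.zeros(n)
    c[-1] = 1.0                                  # minimize eps
    integrality = np.ones(n, dtype=int)
    integrality[-1] = 0                          # eps is continuous
    upper = np.ones(n)
    upper[-1] = np.inf
    res = milp(c, integrality=integrality, bounds=Bounds(np.zeros(n), upper),
               constraints=LinearConstraint(np.array(rows), lbs, ubs))
    if not res.success:
        raise RuntimeError(res.message)
    chosen = [k for k in range(K) if res.x[k] > 0.5]
    return chosen, float(res.fun)

def value_loss_bound(eps, gamma):
    """Bound 2*eps / (1 - gamma)^2 on the value lost by acting on the abstraction."""
    return 2.0 * eps / (1.0 - gamma) ** 2

chosen, eps = select_concepts(Q_STAR, CONCEPTS, budget=1)
bound = value_loss_bound(eps, gamma=0.5)
print(chosen, round(eps, 3), round(bound, 3))
```

The pair variables grow quadratically with the number of states and the concept variables linearly with the number of candidates, which mirrors the O(n_d² + K) count in spirit. Because the bound scales with 1/(1-γ)², even a small abstraction error gives a loose guarantee when γ is close to 1.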

Relationships

Decision-Relevant Concepts — MILP provides the optimization framework enabling principled, automated concept selection with performance guarantees
State Abstractions — Mathematical theory underlying MILP objective functions that preserve decision-making structure through approximate state abstraction
Abstraction Error — Core quantity minimized through MILP optimization to ensure concept quality and maintain optimal decision structure
Concept-Based Models — MILP-selected concepts improve interpretable model reliability and human-AI interaction effectiveness
Test-Time Intervention — Well-selected concepts via MILP enhance human ability to correct AI predictions during deployment with same effort yielding better performance
Feature Selection — MILP offers principled alternative to heuristic selection methods with theoretical performance bounds
Reinforcement Learning Interpretability — Enables automated concept selection for interpretable RL policies without manual domain expertise requirements
Human-AI Collaboration — MILP-optimized concept sets improve human understanding and correction capabilities in AI systems
Combinatorial Optimization — MILP provides exact solution methods for discrete optimization problems in AI applications

Sources

sources/selecting-decision-relevant-concepts-in-reinforcement-learning — Introduced MILP formulation for DRS algorithm with theoretical guarantees, performance bounds, and empirical validation across CartPole, MiniGrid, Pong, Boxing, and healthcare environments