Learning-to-Optimize Framework for Mixed-Integer Quadratic Programming

Overview

This work proposes a learning-to-optimize (L2O) framework that accelerates the solution of parametric mixed-integer quadratic programming (MIQP) problems by learning structured solution components directly from data. The key idea is to predict high-quality integer decisions with a neural network, and then to recover the continuous variables that are optimal for those integers by solving a differentiable quadratic programming (QP) layer conditioned on the prediction. By explicitly separating discrete and continuous variables, the framework exploits problem structure and improves both feasibility and solution quality.
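The predict-then-solve split described above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the thresholding rule, and the equality-constrained problem form are all hypothetical, and a plain NumPy KKT solve stands in for the differentiable QP layer (inequality constraints are omitted for brevity).

```python
import numpy as np

def round_integer_prediction(scores):
    """Threshold raw network outputs (hypothetical logits) at 0 to get binary decisions."""
    return (np.asarray(scores, dtype=float) > 0.0).astype(float)

def solve_qp_given_integers(H, F, q, A, E, b, z):
    """Solve min_x 0.5 x'Hx + x'(Fz + q)  s.t.  A x = b - E z  for fixed integers z.

    With z fixed, the problem is an equality-constrained QP whose optimum
    satisfies the linear KKT system [[H, A'], [A, 0]] [x; lam] = [-(Fz + q); b - Ez].
    """
    n, m = H.shape[0], A.shape[0]
    kkt = np.block([[H, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-(F @ z + q), b - E @ z])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n:]  # continuous variables, equality multipliers

# Toy data (assumed): two continuous variables, one binary variable.
H = np.eye(2)
F = np.array([[1.0], [0.0]])
q = np.zeros(2)
A = np.array([[1.0, 1.0]])
E = np.array([[1.0]])
b = np.array([2.0])

z = round_integer_prediction([0.7])          # stand-in for a neural network output
x, lam = solve_qp_given_integers(H, F, q, A, E, b, z)
print("integers:", z, "continuous:", x)
```

Because the continuous variables come from an exact solve rather than from the network, they are optimal for whatever integers were predicted; only the integer prediction can introduce suboptimality.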

To train the model, we introduce a hybrid loss function that combines:

a supervised loss, encouraging predicted integer solutions to match globally optimal ones when labels are available, and
a self-supervised loss, derived directly from the MIQP objective and constraints, promoting feasibility and consistency even without labeled solutions.
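One way to combine the two terms is sketched below. The specific choices here are assumptions made for illustration and are not taken from the paper: binary cross-entropy for the supervised term, a linear penalty on violated inequality constraints for the self-supervised term, relaxed (probabilistic) integers inside the objective, and a fixed convex weighting between the two.

```python
import numpy as np

def supervised_loss(probs, z_star, eps=1e-9):
    """Binary cross-entropy between predicted probabilities and labeled optimal integers."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    z_star = np.asarray(z_star, dtype=float)
    return float(-np.mean(z_star * np.log(probs) + (1.0 - z_star) * np.log(1.0 - probs)))

def self_supervised_loss(x, z, Q, c, G, h, penalty=10.0):
    """MIQP objective 0.5 v'Qv + c'v plus a penalty on violations of G v <= h, with v = [x; z]."""
    v = np.concatenate([np.asarray(x, dtype=float), np.asarray(z, dtype=float)])
    objective = 0.5 * v @ Q @ v + c @ v
    violation = np.maximum(G @ v - h, 0.0)   # zero wherever the constraint holds
    return float(objective + penalty * np.sum(violation))

def hybrid_loss(probs, z_star, x, Q, c, G, h, weight=0.5, penalty=10.0):
    """Convex combination of the supervised and self-supervised terms (weight is assumed)."""
    sup = supervised_loss(probs, z_star)
    ssl = self_supervised_loss(x, probs, Q, c, G, h, penalty)
    return weight * sup + (1.0 - weight) * ssl
```

When no label is available for a sample, the supervised term can simply be dropped (equivalent to `weight=0` in this sketch), which is what lets the self-supervised term train on unlabeled problem instances.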

This hybrid learning strategy bridges the gap between purely supervised and purely self-supervised approaches. The effectiveness of the proposed method is demonstrated on benchmark mixed-integer model predictive control (MI-MPC) problems, where it achieves speedups of roughly 9× to 12× over a state-of-the-art MIQP solver while maintaining strong feasibility and near-optimal control performance.

Hybrid learning-to-optimize framework

Results

We evaluate the proposed hybrid L2O framework on two representative MI-MPC benchmarks:

Robot navigation with collision avoidance, where binary variables encode logical collision-avoidance constraints and interact strongly with continuous states and inputs.
Thermal energy tank control, where integer variables directly appear in the objective function and govern operational modes.
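To illustrate how binary variables can encode a logical collision-avoidance condition, the sketch below uses a big-M formulation for a single axis-aligned rectangular obstacle. This formulation is a common textbook choice and is assumed here for illustration; the source does not state which encoding the benchmark uses, and the function names, obstacle shape, and value of `M` are hypothetical.

```python
import numpy as np

def big_m_residuals(p, d, box, M=100.0):
    """Residuals g (feasible when g <= 0) for 'position p is outside the box on some side'.

    d holds four binaries, one per side. Setting d[i] = 1 activates that side's
    constraint; d[i] = 0 relaxes it by M. The last row requires at least one active side.
    """
    xmin, xmax, ymin, ymax = box
    d = np.asarray(d, dtype=float)
    return np.array([
        p[0] - xmin - M * (1.0 - d[0]),   # d[0] = 1: p_x <= xmin (left of obstacle)
        xmax - p[0] - M * (1.0 - d[1]),   # d[1] = 1: p_x >= xmax (right of obstacle)
        p[1] - ymin - M * (1.0 - d[2]),   # d[2] = 1: p_y <= ymin (below obstacle)
        ymax - p[1] - M * (1.0 - d[3]),   # d[3] = 1: p_y >= ymax (above obstacle)
        1.0 - np.sum(d),                  # logical OR: at least one side must be active
    ])

def is_collision_free(p, d, box, M=100.0, tol=1e-9):
    """True if the position and binary assignment jointly satisfy the big-M constraints."""
    return bool(np.all(big_m_residuals(p, d, box, M) <= tol))
```

The same position can be feasible under one binary assignment and infeasible under another, which is why a poor integer prediction can make the downstream continuous problem infeasible even when a collision-free trajectory exists.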

We compare three approaches:

Hybrid L2O (H-L2O) — proposed method
Supervised learning (SL) — trained only using labeled optimal integer solutions
Self-supervised learning (SSL) — trained using objective- and constraint-based losses only

Performance is evaluated using constraint violation rates and optimality gaps relative to the globally optimal solution.
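The two metrics can be computed along the following lines. The exact definitions used in the paper are not given here, so this sketch assumes a violation rate defined as the fraction of test instances with at least one violated inequality (beyond a small tolerance) and a relative optimality gap with respect to the globally optimal cost; the function names and tolerance are hypothetical.

```python
import numpy as np

def constraint_violation_rate(G, h, solutions, tol=1e-6):
    """Fraction of solutions v that violate at least one row of G v <= h by more than tol."""
    violated = [bool(np.any(G @ np.asarray(v, dtype=float) - h > tol)) for v in solutions]
    return float(np.mean(violated))

def optimality_gap(predicted_cost, optimal_cost):
    """Relative gap (predicted - optimal) / |optimal|, guarded against division by zero."""
    return (predicted_cost - optimal_cost) / max(abs(optimal_cost), 1e-12)
```

For example, a predicted cost of 11 against a globally optimal cost of 10 corresponds to a gap of 0.1, i.e. 10%.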

Discussion: Overall, the results demonstrate that the proposed hybrid learning-to-optimize (L2O) framework achieves a strong balance between feasibility and optimality across both benchmarks. These results highlight that combining supervised and self-supervised objectives is crucial for producing solutions that are both feasible and near-optimal in mixed-integer control problems.

Robot navigation comparison

An example of robot navigation under mixed-integer MPC using the L2O method.

Animation of robot navigation with obstacle avoidance.

Thermal Energy Tank Results

Thermal energy tank comparison

Computational Efficiency

Solving time comparison

Across both benchmarks, the L2O-based approaches are significantly faster than a state-of-the-art MIQP solver. In particular, the proposed framework achieves approximately:

9× speedup for the robot navigation problem, and
12× speedup for the thermal energy tank problem.

These results demonstrate the suitability of the proposed hybrid L2O framework for real-time mixed-integer MPC applications.

Publications

A Hybrid Learning-to-Optimize Framework for Mixed-Integer Quadratic Programming
Viet-Anh Le, Mu Xie, Rahul Mangharam
8th Annual Learning for Dynamics & Control Conference (L4DC), 2026
arXiv preprint

Contributors

Viet-Anh Le, Mu Xie, Rahul Mangharam

Code: GitHub Repository

Citation


@inproceedings{le2026hybrid,
  title={A Hybrid Learning-to-Optimize Framework for Mixed-Integer Quadratic Programming}, 
  author={Le, Viet-Anh and Xie, Mu and Mangharam, Rahul},
  year={2026},
  booktitle={8th Annual Learning for Dynamics \& Control Conference},
}