Joint Surrogate Learning of Objectives, Constraints, and Sensitivities for Efficient Multi-objective Optimization of Neural Dynamical Systems

cs.LG Frithjof Gressmann, Ivan Georgiev Raikov, Seung Hyun Kim, Mattia Gazzola, Lawrence Rauchwerger, Ivan Soltesz · Mar 22, 2026
Plain-language introduction

Multi-objective optimization of expensive biophysical neural simulations is hindered by high-dimensional parameter spaces and binary constraints that partition the search space without gradient signals. This paper introduces dmosopt, a framework that jointly learns objectives, constraints, and parameter sensitivities in a single differentiable surrogate model $f: \mathbb{R}^n \rightarrow \mathbb{R}^{q+k}$. By computing a unified gradient $\mathbf{g}_{\text{sopt}}$ that simultaneously steers toward improved objective values and greater constraint satisfaction, the method navigates feasibility manifolds that defeat standard approaches, achieving substantial speedups on problems ranging from single-cell models to a hippocampal network of 836,970 neurons.
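
The text above does not spell out how $\mathbf{g}_{\text{sopt}}$ is assembled, so the following is only a minimal sketch of the general idea, under stated assumptions: a toy surrogate with one objective (minimized) and one constraint score (positive means feasible), combined with a fixed, made-up weight. The functions `objective`, `constraint`, `unified_direction`, and `search` are hypothetical names, not the dmosopt API, and finite differences stand in for the automatic differentiation a neural surrogate would provide.

```python
def objective(x):
    # Toy surrogate objective (assumption): squared distance to (2, 2).
    return (x[0] - 2.0) ** 2 + (x[1] - 2.0) ** 2

def constraint(x):
    # Toy surrogate constraint score (assumption): > 0 is treated as feasible.
    return 3.0 - (x[0] + x[1])

def grad(f, x, h=1e-6):
    """Central finite-difference gradient; stands in for autodiff."""
    g = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g

def unified_direction(x, weight=2.0):
    # Descend the objective and ascend the constraint score in one step.
    # The additive form and the weight are illustrative assumptions.
    go = grad(objective, x)
    gc = grad(constraint, x)
    return [-a + weight * b for a, b in zip(go, gc)]

def search(x0, steps=100, lr=0.1):
    x = list(x0)
    for _ in range(steps):
        d = unified_direction(x)
        x = [xi + lr * di for xi, di in zip(x, d)]
    return x

# Start from an infeasible point (constraint score is -7 at (5, 5)).
x_final = search([5.0, 5.0])
```

In this toy setting the unconstrained optimum at (2, 2) is infeasible, so the combined direction settles at a compromise inside the feasible region rather than at the objective's minimum.
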

Critical review
Verdict
Bottom line

The joint surrogate approach is empirically effective and well-motivated for expensive, constrained optimization problems. The framework delivers convincing speedups over Gaussian process baselines on neuroscience benchmarks, particularly in navigating highly constrained regions where random sampling finds zero feasible solutions. However, the evidence is weakened by the lack of replication for the largest-scale experiment and by the absence of calibrated uncertainty quantification in the neural surrogate, which forces reliance on evolutionary operators for exploration.

What holds up

The core mechanism of joint gradient-based feasibility solving is convincingly demonstrated. In the motoneuron benchmark with widened parameter bounds, three constraints yield "exactly 0% feasibility" under Monte Carlo sampling, yet the constraint surrogate gradient $\nabla_{\mathbf{x}}f_{\mathbf{c}}$ guides the optimizer into feasible regions, achieving rapid hypervolume convergence where standard methods fail (Figure 4A). The scaling demonstration on an 836,970-neuron hippocampal network shows practical utility, reducing computational cost by $2\times$ to $5\times$ relative to GP baselines while reaching comparable or better solution quality (Figure 5B, C). The free sensitivity estimates from partial derivatives $\partial f / \partial x_j$ match established methods (FAST, DGSM) without additional simulation budget (Figure 3C).

“Monte Carlo sampling... three constraints - first inter-spike interval (ISI), ISI adaptation, and monotonic frequency-current (F-I) relationship - yield 0% feasibility, preventing random sampling from discovering any valid solution.”
paper · Figure 4A caption
“Due to the extreme computational cost (~60–300 CPU-days per run), the network optimization was run only once per surrogate method.”
paper · Section 4.8
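
To make the "free sensitivity estimates" point concrete, here is a rough sketch of a derivative-based global sensitivity measure (DGSM), which averages the squared partial derivative $(\partial f / \partial x_j)^2$ over sampled inputs. It runs on a toy stand-in function rather than a trained surrogate; `surrogate`, `partial`, and `dgsm` are hypothetical names, the uniform sampling on $[0, 1]^n$ is an assumption, and the paper may aggregate or normalize its sensitivities differently.

```python
import random

def surrogate(x):
    # Toy stand-in for one surrogate output (assumption); x[2] is unused on purpose.
    return 3.0 * x[0] + 0.5 * x[1] ** 2

def partial(f, x, j, h=1e-6):
    """Central finite-difference estimate of df/dx_j; stands in for autodiff."""
    xp, xm = list(x), list(x)
    xp[j] += h
    xm[j] -= h
    return (f(xp) - f(xm)) / (2.0 * h)

def dgsm(f, dim, n_samples=2000, seed=0):
    """Mean squared partial derivative per parameter over uniform samples."""
    rng = random.Random(seed)
    totals = [0.0] * dim
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]
        for j in range(dim):
            totals[j] += partial(f, x, j) ** 2
    return [t / n_samples for t in totals]

nu = dgsm(surrogate, dim=3)
# Parameters ordered from most to least influential.
ranking = sorted(range(3), key=lambda j: nu[j], reverse=True)
```

For this toy function the analytic values are 9 for the first parameter, 1/3 for the second, and 0 for the third, so the ranking comes out in index order. No extra simulator calls are needed because only the surrogate is queried.
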
Main concerns

The comparison to Gaussian process baselines is methodologically uneven: GP surrogates are trained "exclusively on feasible samples" whereas neural networks "benefit from training on all available data, including infeasible samples" (Section 4.6). Any advantage may therefore partly reflect the larger training set rather than joint learning itself, which confounds the assessment of joint learning benefits. The large-scale network results lack statistical replication (n=1 per method), so robustness claims remain speculative despite the impressive speedup. The neural network surrogate also lacks calibrated uncertainty estimates, which creates a risk of overfitting to early training data and directing the search toward "regions where the model is confidently wrong". The paper mitigates this with cross-validated epoch selection, periodic retraining, and elite preservation, but notes that it remains a limitation.

“GP surrogates... are trained exclusively on feasible samples... The neural network surrogates, by contrast, learn objectives and constraints jointly and therefore benefit from training on all available data, including infeasible samples.”
paper · Section 4.6
“The surrogate may overfit to early training data, directing the search toward spurious optima... we mitigate this via cross-validated epoch selection, periodic retraining, and elite preservation... but more principled trust-region strategies could further improve robustness.”
paper · Section 3
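
The training-data asymmetry described above can be illustrated with a small sketch. The records below are made-up values, the function names are hypothetical, and the sketch assumes that objective values are recorded even for infeasible samples, which the quoted passages do not confirm.

```python
# Hypothetical evaluation records; all numbers are invented for illustration.
records = [
    {"x": [0.1, 0.9], "objectives": [1.2, 0.4], "feasible": True},
    {"x": [0.5, 0.5], "objectives": [0.8, 0.9], "feasible": False},
    {"x": [0.7, 0.2], "objectives": [0.6, 1.1], "feasible": False},
    {"x": [0.3, 0.8], "objectives": [1.0, 0.5], "feasible": True},
    {"x": [0.9, 0.1], "objectives": [0.4, 1.5], "feasible": False},
]

def feasible_only_dataset(records):
    # GP-style policy as described in the review: drop infeasible samples.
    return [(r["x"], r["objectives"]) for r in records if r["feasible"]]

def joint_dataset(records):
    # Joint policy: keep every sample and append a constraint label,
    # giving targets of size q + k (here q = 2 objectives, k = 1 constraint).
    return [
        (r["x"], r["objectives"] + [1.0 if r["feasible"] else 0.0])
        for r in records
    ]

gp_data = feasible_only_dataset(records)
nn_data = joint_dataset(records)
```

With these records the feasible-only set holds 2 samples while the joint set holds all 5, which is why a difference in final quality cannot be attributed to joint learning alone.
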
Evidence and comparison

The evidence supports the claim that joint learning improves optimization quality across the single-cell benchmarks (9 CA1 interneuron types). The joint c+o-FT-Transformer achieves the best mean rank for final solution quality (IGD), demonstrating that "shared learning of objectives and constraints produces better Pareto fronts" compared to objective-only architectures and GP baselines (Figure 3B). The comparison reveals a trade-off: objective-only models converge faster early (higher HV-AUC), while joint models ultimately yield superior feasibility and solution quality. The sensitivity-informed sampling ablation supports the utility of surrogate-derived gradients, though the improvement over standard NSGA-II is modest.

“For final solution quality (IGD), the joint c+o-FT-Transformer achieved the best mean rank across populations, demonstrating that shared learning of objectives and constraints produces better Pareto fronts.”
paper · Section 2.2
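
For readers unfamiliar with the two quality indicators used in this comparison, the sketch below gives their standard textbook definitions for a two-objective minimization problem: inverted generational distance (IGD) and hypervolume (HV). The fronts and reference point are made-up examples, and the paper's exact reference fronts, normalization, and HV-AUC computation are not reproduced here.

```python
import math

def igd(reference_front, approx_front):
    """Mean distance from each reference point to its nearest approximation point."""
    total = sum(
        min(math.dist(r, a) for a in approx_front) for r in reference_front
    )
    return total / len(reference_front)

def hypervolume_2d(front, ref):
    """Area dominated by `front` and bounded by `ref` (both objectives minimized)."""
    # Keep only points that are strictly better than the reference point.
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    hv = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:  # skip points dominated by an earlier one in the sweep
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv

# Invented example fronts.
reference = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
approximation = [(1.0, 3.0), (3.0, 1.0)]

igd_value = igd(reference, approximation)
hv_full = hypervolume_2d(reference, ref=(4.0, 4.0))
hv_approx = hypervolume_2d(approximation, ref=(4.0, 4.0))
```

Lower IGD and higher HV are better. Here the approximation misses the middle point of the reference front, so its IGD is nonzero and its hypervolume is smaller than that of the full front.
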
Reproducibility

The framework is open-source (github.com/dmosopt/dmosopt) with comprehensive documentation, and all benchmark specifications are detailed in the Supplementary Materials (Section S.16). Hyperparameters for the FT-Transformer and ResNet surrogates are tabulated (Table 2), including adaptive scaling rules for high-dimensional targets. However, reproduction of the primary network result requires supercomputing resources (~60–300 CPU-days per method, Section 4.8), creating a significant barrier to independent verification. The code supports MPI distribution and lists exact software versions (TensorFlow 2.16.2, NEURON 8.2.6), though some implementation details (e.g., automatic epoch selection timeouts) rely on heuristics that may affect reproducibility across hardware.

“All code to reproduce the presented experiments is publicly available from GitHub repositories... dmosopt can be installed via pip... The notebooks to reproduce the experiments and figures in this paper can be found at github.com/GazzolaLab/MultiObjectiveSurrogateOptimization”
paper · Section 4.10
“The total computational budget across all experiments... was approximately 1,500 CPU-days.”
paper · Section 4.8
Abstract

Biophysical neural system simulations are among the most computationally demanding scientific applications, and their optimization requires navigating high-dimensional parameter spaces under numerous constraints that impose a binary feasible/infeasible partition with no gradient signal to guide the search. Here, we introduce DMOSOPT, a scalable optimization framework that leverages a unified, jointly learned surrogate model to capture the interplay between objectives, constraints, and parameter sensitivities. By learning a smooth approximation of both the objective landscape and the feasibility boundary, the joint surrogate provides a unified gradient that simultaneously steers the search toward improved objective values and greater constraint satisfaction, while its partial derivatives yield per-parameter sensitivity estimates that enable more targeted exploration. We validate the framework from single-cell dynamics to population-level network activity, spanning incremental stages of a neural circuit modeling workflow, and demonstrate efficient, effective optimization of highly constrained problems at supercomputing scale with substantially fewer problem evaluations. While motivated by and demonstrated in the context of computational neuroscience, the framework is general and applicable to constrained multi-objective optimization problems across scientific and engineering domains.
