MAGPI: Multifidelity-Augmented Gaussian Process Inputs for Surrogate Modeling from Scarce Data
Multifidelity surrogate modeling aims to leverage cheap low-fidelity simulations to improve predictions of expensive high-fidelity models when training data is scarce. This paper proposes MAGPI, a Gaussian process regression method that augments the high-fidelity input space with features derived from recursively trained low-fidelity surrogate models. The approach unifies desirable properties from cokriging and autoregressive estimators while allowing non-GP models for low-fidelity levels, and the authors report higher accuracy and lower computational cost than existing multifidelity GP methods in their experiments.
The paper presents a well-motivated and technically sound approach to multifidelity Gaussian process regression. The core innovation, augmenting the high-fidelity input space with predictions from lower-fidelity surrogate models, effectively addresses the Markovian limitation of autoregressive methods while avoiding the cost of cokriging, which is cubic in the total number of training points across all fidelity levels. The theoretical guarantee of Proposition 1 provides a solid foundation, though it assumes optimal hyperparameter selection. However, the practical utility depends heavily on the quality of the low-fidelity features, and the paper does not thoroughly investigate cases where low-fidelity models are systematically biased or misleading.
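The input-augmentation idea can be illustrated with a minimal sketch. This is not the paper's Algorithm 1 or 2: the toy functions `f_low` and `f_high`, the fixed length scales, the zero-mean GP, and the helper names are all assumptions made for illustration, and no hyperparameter optimization is performed. The sketch fits a K-Nearest Neighbors surrogate to abundant low-fidelity data, appends its prediction to each high-fidelity input, and fits a GP on the augmented inputs.

```python
import numpy as np

def knn_predict(x_train, y_train, x_query, k=3):
    """Average the targets of the k nearest training points (1D inputs)."""
    dists = np.abs(x_query[:, None] - x_train[None, :])
    nearest = np.argsort(dists, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def rbf_kernel(a, b, lengthscales):
    """Squared-exponential kernel with one length scale per input dimension."""
    diff = (a[:, None, :] - b[None, :, :]) / lengthscales
    return np.exp(-0.5 * np.sum(diff**2, axis=-1))

def gp_posterior_mean(x_train, y_train, x_query, lengthscales, noise=1e-4):
    """Posterior mean of a zero-mean GP, computed with a Cholesky solve."""
    k_tt = rbf_kernel(x_train, x_train, lengthscales)
    k_tt = k_tt + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train, lengthscales)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return k_qt @ alpha

# Hypothetical toy fidelities (not taken from the paper).
def f_low(x):
    return np.sin(8.0 * x)

def f_high(x):
    return (x - 0.5) * np.sin(8.0 * x)

x_lo = np.linspace(0.0, 1.0, 200)   # abundant low-fidelity data
y_lo = f_low(x_lo)
x_hi = np.linspace(0.0, 1.0, 8)     # scarce high-fidelity data
y_hi = f_high(x_hi)
x_test = np.linspace(0.0, 1.0, 101)

def augment(x):
    """Stack the original input with the low-fidelity surrogate prediction."""
    return np.column_stack([x, knn_predict(x_lo, y_lo, x)])

# GP on augmented inputs [x, low-fidelity prediction at x].
pred_aug = gp_posterior_mean(
    augment(x_hi), y_hi, augment(x_test), lengthscales=np.array([0.3, 1.0])
)
# Single-fidelity GP on x alone, for comparison.
pred_single = gp_posterior_mean(
    x_hi[:, None], y_hi, x_test[:, None], lengthscales=np.array([0.3])
)

truth = f_high(x_test)
print("RMSE augmented:      ", np.sqrt(np.mean((pred_aug - truth) ** 2)))
print("RMSE single-fidelity:", np.sqrt(np.mean((pred_single - truth) ** 2)))
```

With more than two fidelity levels, the same `augment` step would presumably be applied recursively, so that each level's surrogate sees the predictions of the levels below it; how the two errors compare depends on the toy functions and length scales chosen here.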
The method's flexibility to use arbitrary regression models for low-fidelity data (not just GPs) is a significant practical advantage, demonstrated convincingly in the CFD example where K-Nearest Neighbors replace expensive GP training on tens of thousands of points. The sequential training procedure reduces the prohibitive $\mathcal{O}([\sum_{l=1}^K N_l]^3)$ cost of cokriging to $\mathcal{O}(N_1^3 + \sum_{l=2}^K \tau_{\text{train}}^{(l)})$, making it scalable to many fidelity levels with abundant low-fidelity data. The extrapolation experiment on laminar flame speed demonstrates the method's ability to generalize outside the high-fidelity training domain by leveraging the informative mean function structure that combines inputs and low-fidelity predictions.
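To give a sense of scale for the two complexity expressions, the following back-of-the-envelope sketch plugs in the sizes quoted in this review for the CFD example (45 high-fidelity points, up to 58,000 low-fidelity points). It assumes a two-level setting, ignores constant factors, and leaves out the low-fidelity training cost $\tau_{\text{train}}^{(l)}$, which depends on the chosen regression model; the variable names are illustrative.

```python
# Sizes quoted in the review for the CFD example (two levels assumed).
n_high = 45        # N_1: high-fidelity training points
n_low = 58_000     # N_2: low-fidelity training points (upper figure)

# Cokriging: cubic in the total number of points across all levels.
cokriging_ops = (n_high + n_low) ** 3

# Sequential training: only the high-fidelity GP pays a cubic cost;
# the low-fidelity training cost tau_train is model-dependent and omitted.
sequential_gp_ops = n_high ** 3

ratio = cokriging_ops / sequential_gp_ops
print(f"cokriging cubic term:  {cokriging_ops:.3e}")
print(f"sequential cubic term: {sequential_gp_ops:.3e}")
print(f"ratio:                 {ratio:.3e}")
```

Under these assumptions the cubic terms differ by roughly nine orders of magnitude, which is why the omitted $\tau_{\text{train}}^{(l)}$ term, rather than the GP solve, would be expected to dominate the sequential cost.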
While Proposition 1 guarantees the existence of hyperparameters achieving at least the marginal likelihood of the baseline, it does not guarantee that optimization will find them, nor does it account for the increased risk of overfitting when using flexible mean functions or many low-fidelity features. Because the features are constructed recursively, the result still depends on the order in which the fidelities are trained, despite the claim that $y_2, \dots, y_K$ may be arbitrarily ordered; errors from a systematically biased or poorly calibrated lower-fidelity model could therefore propagate to every later level. Additionally, the paper's conclusion notes "undesired oscillations in its predictive posterior", suggesting potential instability in the kernel specification that is not fully resolved and could undermine reliability in safety-critical applications.
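One standard way an existence guarantee of this kind can arise is through nested kernels; whether Proposition 1 is proved this way is an assumption, not something stated in the review. With an ARD squared-exponential kernel parameterized by inverse squared length scales, setting the weight on an augmented feature to zero makes the kernel ignore that feature, so the single-fidelity kernel is recovered as a special case. The sketch below checks this numerically; `ard_kernel` and the random stand-in feature are illustrative.

```python
import numpy as np

def ard_kernel(a, b, inv_sq_lengthscales):
    """ARD squared-exponential kernel, weights are inverse squared length scales."""
    diff_sq = (a[:, None, :] - b[None, :, :]) ** 2
    return np.exp(-0.5 * np.sum(diff_sq * inv_sq_lengthscales, axis=-1))

rng = np.random.default_rng(1)
x = rng.uniform(size=(6, 1))          # original inputs
feature = rng.normal(size=(6, 1))     # stand-in for low-fidelity predictions
x_aug = np.hstack([x, feature])       # augmented inputs

k_base = ard_kernel(x, x, np.array([4.0]))
# Zero weight on the augmented dimension: the feature is ignored.
k_aug_off = ard_kernel(x_aug, x_aug, np.array([4.0, 0.0]))
# Nonzero weight: the feature changes the kernel matrix.
k_aug_on = ard_kernel(x_aug, x_aug, np.array([4.0, 1.0]))

print("matches baseline with zero weight:   ", np.allclose(k_base, k_aug_off))
print("matches baseline with nonzero weight:", np.allclose(k_base, k_aug_on))
```

This only shows that the baseline lies inside the augmented model's hyperparameter space. As the paragraph above points out, it says nothing about whether a gradient-based optimizer reaches a point at least that good.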
The theoretical analysis also assumes that the low-fidelity surrogate models provide useful information, but does not characterize how the method behaves when low-fidelity models are negatively correlated or orthogonal to the high-fidelity target, potentially adding noise rather than signal to the augmented inputs.
The empirical evaluation covers diverse scenarios: a synthetic 1D problem with nonlinear relationships (where the high-fidelity function is a product of the medium- and low-fidelity functions), a chemical kinetics extrapolation task requiring generalization outside the training temperature range, and a sparse interpolation CFD problem with substantial low-fidelity data (up to 58,000 points). The comparisons against the Kennedy-O'Hagan approach, NARGP, and single-fidelity kriging consistently favor MAGPI across RMSE, $R^2$, and log marginal likelihood metrics. However, the CFD example uses KNN approximations for the autoregressive baselines because training full GPs on the large low-fidelity datasets is computationally infeasible. This deviates from the canonical implementations, though it fairly illustrates the practical constraints that MAGPI is designed to overcome. The comparison would be strengthened by including modern deep multifidelity methods and sparse GP approximations as baselines.
The paper provides detailed pseudocode (Algorithms 1 and 2) and specifies gradient-based hyperparameter optimization with the Adam optimizer and ARD kernels, enabling algorithmic reproduction. However, no code repository, software versions, or specific random seeds are provided in the text, which would impede independent reproduction. The high-fidelity CFD data consists of only 45 training points selected from a specific spatial region, raising questions about sensitivity to training set selection and spatial distribution. The reliance on specialized chemical kinetics and CFD solvers (USC-II mechanism, LES/RANS simulations) without public datasets or standardized benchmarks further limits reproducibility, though this is typical for the application domain. The complexity analysis is thorough, but actual wall-clock timing comparisons are absent, making it difficult to assess the practical computational savings claimed.
Supervised machine learning describes the practice of fitting a parameterized model to labeled input-output data. Supervised machine learning methods have demonstrated promise in learning efficient surrogate models that can (partially) replace expensive high-fidelity models, making many-query analyses, such as optimization, uncertainty quantification, and inference, tractable. However, when training data must be obtained through the evaluation of an expensive model or experiment, the amount of training data that can be obtained is often limited, which can make learned surrogate models unreliable. In many engineering and scientific settings, cheaper \emph{low-fidelity} models may be available, for example arising from simplified physics modeling or coarse grids. These models may be used to generate additional low-fidelity training data. The goal of \emph{multifidelity} machine learning is to use both high- and low-fidelity training data to learn a surrogate model which is cheaper to evaluate than the high-fidelity model, but more accurate than any available low-fidelity model. This work proposes a new multifidelity training approach for Gaussian process regression (GPR) which uses low-fidelity data to define additional features that augment the input space of the learned model. The approach unites desirable properties from two separate classes of existing multifidelity GPR approaches, cokriging and autoregressive estimators. Numerical experiments on several test problems demonstrate both increased predictive accuracy and reduced computational cost relative to the state of the art.