Multinoulli Extension: A Lossless Continuous Relaxation for Partition-Constrained Subset Selection
The paper tackles partition-constrained subset selection for 'close-to-submodular' objectives—specifically α-weakly DR-submodular and (γ,β)-weakly submodular functions—where existing distorted local-search methods suffer from prohibitive query complexity (˜O(1/ϵ^6)) and require prior knowledge of structural parameters. The authors propose the Multinoulli Extension (ME), a continuous relaxation that learns multinoulli priors for each partition block, enabling lossless rounding without submodularity assumptions. They develop offline (Multinoulli-SCG) and online (Multinoulli-OSCG/OSGA) algorithms achieving tight approximation guarantees with O(1/ϵ^2) query complexity and O(√T) regret, respectively.
The paper presents a novel and theoretically rigorous continuous relaxation framework. The Multinoulli Extension is well-motivated with solid proofs of differentiability and inheritance of weak submodularity properties. The parameter-free algorithms achieve significant improvements in query complexity over prior distorted local-search methods, and the online extensions represent technical advances. However, practical impact is tempered by reliance on high-variance stochastic gradient estimators and the need for non-oblivious auxiliary functions to achieve optimal approximation ratios, adding implementation complexity.
The Multinoulli Extension is rigorously developed: Theorem 1 establishes preservation of differentiability, monotonicity, and weak DR-submodularity (α-weakly), while Theorem 2 provides the crucial link between continuous and discrete objectives. The 'lossless rounding' claim is substantiated—Algorithm 1 ensures |S ∩ Vk| ≡ Bk and E[f(S)] ≥ F(x), avoiding submodularity requirements of multi-linear extension schemes (Section IV-C, Theorem 5). The query complexity improvement to O(r^3 n^2/ϵ^2) (Table I) is significant and supported by path-integrated differential estimators.
First, while query complexity is reduced, stochastic estimation of gradients (Remark 3) and Hessians (Remark 8) introduces variance that may affect practical convergence; experiments lack variance reporting. Second, achieving tight (1-e^{-α}) approximation requires non-oblivious auxiliary functions (Theorem 4), adding overhead versus standard gradient ascent. Third, Multinoulli-OSCG requires maintaining Q=O(√T) online linear oracles (Algorithm 3), creating memory overhead, though OSGA mitigates this. Finally, the 'parameter-free' claim is weakened for Multinoulli-OSGA with auxiliary estimation, which requires knowing α, β, γ to set weight function w(z) (Table II footnote †), undermining the advantage in online settings.
Tables I and II provide comprehensive comparisons positioning Multinoulli-SCG favorably against distorted local-search in query complexity. However, the paper lacks comparison with other recent continuous relaxations for non-submodular functions. Experiments on video summarization, coverage, and Bayesian A-optimal design (Section VII) demonstrate competitive utility with fewer queries, but offer limited statistical analysis (no confidence intervals) and omit hyperparameter sensitivity analysis (batch size L, step size η). The tightness claim for stationary points (Remark 7) is supported by a constructed example in Appendix I showing 1/2-approximation is tight for submodular cases.
Detailed algorithmic descriptions and proofs are provided in appendices, aiding reproducibility. However, no code repository is mentioned. Stochastic gradient and Hessian estimation require careful sampling implementation (Remarks 3, 8), and projections use CVX (Appendix A) which may not scale. The offline algorithms require T=O(√(nr)/ϵ) iterations with batch size L=O(T), implying substantial computational cost per iteration despite reduced total queries. Critical hyperparameters (step sizes, batch sizes) are specified in Appendix A, but sensitivity to these choices is not analyzed in the main experiments.
Identifying the most representative subset for a close-to-submodular objective while satisfying the predefined partition constraint is a fundamental task with numerous applications in machine learning. However, the existing distorted local-search methods are often hindered by their prohibitive query complexities and the rigid requirement for prior knowledge of difficult-to-obtain structural parameters. To overcome these limitations, we introduce a novel algorithm titled Multinoulli-SCG, which not only is parameter-free, but also can achieve the same approximation guarantees as the distorted local-search methods with significantly fewer function evaluations. More specifically, when the objective function is monotone $\alpha$-weakly DR-submodular or $(\gamma,\beta)$-weakly submodular, our Multinoulli-SCG algorithm can attain a value of $(1-e^{-\alpha})\text{OPT}-\epsilon$ or $(\frac{\gamma^{2}(1-e^{-(\beta(1-\gamma)+\gamma^2)})}{\beta(1-\gamma)+\gamma^2})\text{OPT}-\epsilon$ with only $O(1/\epsilon^{2})$ function evaluations, where OPT denotes the optimal value. The cornerstone of our Multinoulli-SCG algorithm is an innovative continuous-relaxation framework named Multinoulli Extension(ME), which can effectively convert the discrete subset selection problem subject to partition constraints into a solvable continuous maximization focused on learning the optimal multinoulli priors across the concerned partition. In sharp contrast with the well-established multi-linear extension for submodular subset selection, a notable advantage of our proposed ME is its intrinsic capacity to provide a lossless rounding scheme for any set function. Furthermore, based on our proposed ME, we also present two novel online algorithms, namely, Multinoulli-OSCG and Multinoulli-OSGA, for the unexplored online subset selection problems over partition constraints.
Pick a starting point or write your own. Challenges run in the background, so you can keep reading while the AI investigates.
No challenges yet. Disagree with the review? Ask the AI to revisit a specific claim.