Document Type: Research Article
Authors
1 Department of Industrial Mathematics, Admiralty University of Nigeria, Ibusa, Delta State, Nigeria.
2 Department of Mathematics, Delta State University, Abraka, Delta State, Nigeria.
Abstract
Hyperparameter optimization (HPO) is essential for maximizing the performance of deep learning models. Traditional approaches, such as grid search and Bayesian Optimization (BO), are widely used but can be computationally expensive. We present Interpolation-Based Optimization (IBO), a novel framework that employs piecewise polynomial interpolation to estimate optimal hyperparameters efficiently from sparse evaluations. IBO achieves substantial computational savings by constructing deterministic interpolants with a per-iteration complexity of O(n·d³) (where n is the number of evaluations and d the number of hyperparameters), linear in n, in contrast to the cubic O(n³) cost associated with BO. Empirical studies on the MNIST dataset show that IBO attains 98.0% accuracy with a 39% reduction in runtime (12 vs. 18 tuning iterations) and no statistically significant difference from BO (p = 0.12). In higher-dimensional settings, such as ResNet-18 on CIFAR-10, performance degrades under sparse sampling, highlighting a trade-off between dimensionality and efficiency. More generally, IBO is well suited to resource-constrained settings owing to its simplicity, determinism, and computational efficiency. Future work will explore hybrid methods to address scalability limitations and extend IBO to more complex model architectures, such as transformers.
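To make the core loop concrete, here is a minimal one-dimensional sketch of interpolation-based tuning. It is an illustration under stated assumptions, not the paper's IBO implementation: it fits a piecewise cubic interpolant (SciPy's `PchipInterpolator`) through a few sparse evaluations and proposes the interpolant's minimizer as the next candidate. The objective `validation_loss`, the search interval, and the evaluation budget are hypothetical placeholders.

```python
# Illustrative 1-D sketch of interpolation-based hyperparameter tuning.
# Not the paper's IBO implementation; the objective and search range are placeholders.
import numpy as np
from scipy.interpolate import PchipInterpolator

def validation_loss(log10_lr: float) -> float:
    """Hypothetical stand-in for training a model and measuring validation loss."""
    return (log10_lr + 2.5) ** 2 + 0.05 * np.sin(8.0 * log10_lr)

lo, hi = -5.0, -1.0                      # search over log10(learning rate)
xs = list(np.linspace(lo, hi, 4))        # sparse initial evaluations
ys = [validation_loss(x) for x in xs]

for _ in range(8):                       # small, fixed evaluation budget
    order = np.argsort(xs)               # PCHIP needs strictly increasing abscissae
    surrogate = PchipInterpolator(np.asarray(xs)[order], np.asarray(ys)[order])
    grid = np.linspace(lo, hi, 1001)     # cheap dense minimization of the surrogate
    candidate = float(grid[np.argmin(surrogate(grid))])
    if min(abs(candidate - x) for x in xs) < 1e-3:
        break                            # proposal repeats a known point: stop early
    xs.append(candidate)
    ys.append(validation_loss(candidate))

best = xs[int(np.argmin(ys))]
print(f"best log10(lr) ~ {best:.3f}, best observed loss {min(ys):.4f}")
```

In the paper's setting the surrogate spans d hyperparameters and adds smoothing and adaptive sampling; the sketch only conveys the evaluate, interpolate, and minimize cycle.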
Highlights
- Introduces Interpolation-Based Optimization (IBO), a novel hyperparameter tuning framework that uses piecewise polynomial interpolation to construct deterministic surrogates from sparse evaluations, eliminating the need for stochastic sampling or probabilistic modeling.
- Provides formal convergence guarantees and surrogate error analysis. Under Lipschitz continuity assumptions, IBO offers a bounded approximation error for the interpolated surrogate and demonstrates favorable convergence properties with deterministic computation (a generic bound of this kind is sketched, for illustration, after this list).
- Delivers substantial computational savings. IBO's per-iteration cost scales as O(n·d³), i.e., linearly in the number of evaluations n (with d hyperparameters), in contrast to the cubic O(n³) cost often associated with Bayesian Optimization (BO), enabling faster practical tuning.
- Demonstrates practical performance gains on standard benchmarks. On MNIST, IBO attains 98.0% accuracy with a 39% reduction in runtime (12 vs. 18 tuning iterations) while showing no statistically significant difference from BO (p = 0.12).
- Shows scalable behavior that depends on dimensionality. While effective in low- to moderate-dimensional settings, IBO exhibits performance degradation in higher-dimensional spaces (e.g., d ≥ 10) under sparse sampling, highlighting a trade-off between dimensionality and efficiency.
- Integrates robust smoothing and adaptive sampling strategies. The method incorporates smoothing splines, outlier suppression, and adaptive sampling thresholds to improve resilience to noise and discontinuities in the objective landscape.
- Validates across a spectrum of architectures. IBO delivers competitive validation and test accuracies on both simple (CNN on MNIST) and more complex (ResNet-18 on CIFAR-10) models, achieving 92.3% test accuracy on CIFAR-10, comparable to BO (92.5%), with reduced tuning cost.
- Suitable for resource-constrained environments. With its simplicity, determinism, and low overhead, IBO is well-suited for edge computing, embedded systems, and other settings where computational resources are limited.
- Lays groundwork for future enhancements. The results motivate hybridization with established methods (e.g., combining IBO with BOHB) and exploration of adaptive sampling, random embeddings, and integration with more complex models (e.g., Transformers) to extend scalability and robustness to noisy or multi-objective optimization tasks.
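As background for the error-analysis highlight above, the bounds below are standard piecewise-linear interpolation error estimates of the kind invoked under Lipschitz (or bounded-curvature) assumptions; they are quoted for illustration and are not the paper's stated theorem. Here f is the objective along one hyperparameter, s the piecewise linear interpolant, and h the largest gap between consecutive evaluated points.

```latex
% Standard piecewise-linear interpolation error bounds (illustrative only).
% f: objective, s: interpolant, h: largest spacing between sample points.
\[
  \max_{x \in [a,b]} \lvert f(x) - s(x) \rvert \;\le\; \frac{L\,h}{2}
  \qquad \text{if } f \text{ is } L\text{-Lipschitz on } [a,b],
\]
\[
  \max_{x \in [a,b]} \lvert f(x) - s(x) \rvert \;\le\; \frac{M\,h^{2}}{8}
  \qquad \text{if } \lvert f''(x) \rvert \le M \text{ on } [a,b].
\]
```

Both bounds shrink with the mesh size h, which is why adaptive sampling that densifies evaluations near promising regions tightens the surrogate.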
Keywords
- Deep learning
- Hyperparameter optimization
- Interpolation-based optimization
- Polynomial interpolation
- Efficient model tuning
Main Subjects