
Document Type: Research Article

Authors

1 Department of Industrial Mathematics, Admiralty University of Nigeria, Ibusa, Delta State, Nigeria.

2 Department of Mathematics, Delta State University, Abraka, Delta State, Nigeria.

DOI: 10.30473/coam.2025.74381.1304

Abstract

Hyperparameter optimization (HPO) is essential for maximizing the performance of deep learning models. Traditional approaches, such as grid search and Bayesian Optimization (BO), are widely used but can be computationally expensive. We present Interpolation-Based Optimization (IBO), a novel framework that employs piecewise polynomial interpolation to efficiently estimate optimal hyperparameters from sparse evaluations. IBO achieves substantial computational savings by constructing deterministic interpolants whose per-iteration cost, O(n·d³), grows only linearly in the number of evaluations n, in contrast to the cubic O(n³) cost associated with BO. Empirical studies on the MNIST dataset show that IBO attains 98.0% accuracy with a 39% reduction in runtime (12 iterations vs. 18) and no statistically significant difference from BO (p = 0.12). In higher-dimensional settings, such as ResNet-18 on CIFAR-10, performance degrades, highlighting a trade-off between dimensionality and efficiency. More generally, IBO is well suited to resource-constrained settings because of its simplicity, determinism, and computational efficiency. Future work will explore hybrid methods to address scalability limitations and extend IBO to more complex model architectures, such as Transformers.
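
To make the core loop concrete, the following is a minimal one-dimensional sketch of interpolation-based hyperparameter search, here over the learning rate on a log scale. It assumes a cubic-spline surrogate that is refit after each evaluation and a simple "minimize the surrogate on a dense grid" proposal rule; the toy objective, search range, and stopping criterion are illustrative placeholders, and the paper's exact interpolation scheme and sampling strategy may differ.

    # Minimal 1-D sketch of an interpolation-based tuning loop (illustrative only).
    # Assumptions not taken from the paper: a cubic-spline surrogate over log10(lr),
    # a toy validation_loss stand-in, and a "refit, then minimize on a dense grid" rule.
    import numpy as np
    from scipy.interpolate import CubicSpline

    def validation_loss(log_lr):
        # Stand-in for training a model and returning its validation loss;
        # hypothetical smooth objective with an optimum near lr = 1e-3.
        return (log_lr + 3.0) ** 2 + 0.05 * np.sin(5.0 * log_lr)

    # Sparse initial design over log10(learning rate) in [-5, -1].
    xs = np.linspace(-5.0, -1.0, 4)
    ys = np.array([validation_loss(x) for x in xs])

    for _ in range(8):                                # small, fixed tuning budget
        surrogate = CubicSpline(xs, ys)               # deterministic piecewise-cubic interpolant
        grid = np.linspace(xs.min(), xs.max(), 501)
        candidate = grid[np.argmin(surrogate(grid))]  # minimize the cheap surrogate, not the model
        if np.min(np.abs(xs - candidate)) < 1e-3:     # stop when no genuinely new point is proposed
            break
        xs = np.append(xs, candidate)
        ys = np.append(ys, validation_loss(candidate))
        order = np.argsort(xs)                        # CubicSpline needs sorted abscissae
        xs, ys = xs[order], ys[order]

    best_log_lr = xs[np.argmin(ys)]
    print("suggested learning rate ~ 10 ** %.2f" % best_log_lr)

Because the surrogate is a deterministic interpolant, rerunning the loop on the same evaluations reproduces the same proposals, which is the determinism the abstract emphasizes for resource-constrained settings.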

Highlights

  • Introduces Interpolation-Based Optimization (IBO), a novel hyperparameter tuning framework that uses piecewise polynomial interpolation to construct deterministic surrogates from sparse evaluations, eliminating the need for stochastic sampling or probabilistic modeling.
  • Provides formal convergence guarantees and surrogate error analysis. Specifically, under Lipschitz continuity assumptions, IBO offers a bounded approximation error for the interpolated surrogate and demonstrates favorable convergence properties with deterministic computation (see the note after this list).
  • Delivers substantial computational savings. IBO's per-iteration complexity is on the order of O(n·d³), linear in the number of evaluations n (with d the number of hyperparameters), in contrast to the cubic O(n³) cost often associated with Bayesian Optimization (BO), enabling faster practical tuning.
  • Demonstrates practical performance gains on standard benchmarks. On MNIST, IBO attains 98.0% accuracy with a 39% reduction in runtime (12 tuning iterations vs. 18) while showing no statistically significant difference from BO (p = 0.12).
  • Shows scaling behavior that depends on dimensionality. While effective in low- to moderate-dimensional settings, IBO exhibits performance degradation in higher-dimensional spaces (e.g., d ≥ 10) under sparse sampling, highlighting a trade-off between dimensionality and efficiency.
  • Integrates robust smoothing and adaptive sampling strategies. The method incorporates smoothing splines, outlier suppression, and adaptive sampling thresholds to improve resilience to noise and discontinuities in the objective landscape (see the sketch after this list).
  • Validates across a spectrum of architectures. IBO delivers competitive validation and test accuracies on both simple (CNN on MNIST) and more complex (ResNet-18 on CIFAR-10) models, achieving 92.3% test accuracy on CIFAR-10, comparable to BO's 92.5%, with reduced tuning cost.
  • Suitable for resource-constrained environments. With its simplicity, determinism, and low overhead, IBO is well-suited for edge computing, embedded systems, and other settings where computational resources are limited.
  • Lays groundwork for future enhancements. The results motivate hybridization with established methods (e.g., combining IBO with BOHB) and exploration of adaptive sampling, random embeddings, and integration with more complex models (e.g., Transformers) to extend scalability and robustness to noisy or multi-objective optimization tasks.
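
Two of the bullets above lend themselves to brief illustration. For the bounded approximation error under Lipschitz continuity, a standard result of this general type (not necessarily the paper's exact statement) is that a piecewise-linear interpolant s of an L-Lipschitz objective f sampled at grid spacing h satisfies |f(x) − s(x)| ≤ L·h/2 for all x, so the surrogate error shrinks as evaluations are added. For the smoothing and outlier-suppression point, the sketch below shows one way a noise-robust surrogate could look, assuming a SciPy smoothing spline, a crude median-based rejection rule, and a hypothetical dropout-rate axis; none of these specific choices are taken from the paper.

    # Sketch of a noise-robust surrogate: smoothing spline plus simple outlier suppression.
    # The injected outlier, the median-based threshold, and the smoothing factor are
    # illustrative assumptions, not the paper's settings.
    import numpy as np
    from scipy.interpolate import UnivariateSpline

    rng = np.random.default_rng(0)

    # Noisy validation-accuracy measurements over a hypothetical dropout-rate axis.
    x = np.linspace(0.0, 0.8, 12)
    y = 0.98 - 0.3 * (x - 0.3) ** 2 + rng.normal(0.0, 0.01, x.size)
    y[7] -= 0.2                                   # one corrupted evaluation

    # Outlier suppression: drop points that sit far from the median response.
    resid = np.abs(y - np.median(y))
    keep = resid < 4.0 * np.median(resid)         # crude robust threshold (assumption)

    # Smoothing spline rather than exact interpolation, so noise is not chased.
    spline = UnivariateSpline(x[keep], y[keep], k=3, s=keep.sum() * 1e-4)

    grid = np.linspace(0.0, 0.8, 401)
    proposal = grid[np.argmax(spline(grid))]      # next dropout rate to evaluate
    print("proposed dropout rate ~ %.2f" % proposal)

In a multi-hyperparameter setting this would be combined with the paper's piecewise-polynomial surrogate construction; the point of the sketch is only that smoothing plus a rejection rule keeps a surrogate stable when individual training runs are noisy.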

