Submitted: 04 August 2024
Posted: 05 August 2024
Abstract
Keywords:
MSC: 41A25
1. Introduction
2. Some Properties of the Translation Network on the Unit Sphere
2.1. Density
2.2. Reproducing Property
3. Convergence Rate of the Kernel Regularized Regression
3.1. Learning Framework
3.2. Error Decomposition
3.3. The Convergence Rate
3.4. Conclusion and Discussion
4. Some Lemmas
5. Proof for Theorems and Propositions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Communications of the ACM 2017, 60, 84–90.
- Wu, Y.; Schuster, M.; Chen, Z.; Le, Q.V.; Norouzi, M.; Macherey, W.; Cao, Y.; Gao, Q. Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv 2016, arXiv:1609.08144v2 [cs.CL].
- Alipanahi, B.; Delong, A.; Weirauch, M.T.; Frey, B.J. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology 2015, 33, 831–838.
- Chui, C.K.; Lin, S.B.; Zhou, D.X. Construction of neural networks for realization of localized deep learning. arXiv 2018, arXiv:1803.03503 [cs.LG].
- Chui, C.K.; Lin, S.B.; Zhou, D.X. Deep neural networks for rotation-invariance approximation and learning. Anal. Appl. 2019, 17, 737–772.
- Fang, Z.Y.; Feng, H.; Huang, S.; Zhou, D.X. Theory of deep convolutional neural networks II: spherical analysis. Neural Networks 2020, 131, 154–162.
- Feng, H.; Huang, S.; Zhou, D.X. Generalization analysis of CNNs for classification on spheres. IEEE Transactions on Neural Networks and Learning Systems.
- Zhou, D.X. Deep distributed convolutional neural networks: universality. Anal. Appl. 2018, 16, 895–919.
- Zhou, D.X. Universality of deep convolutional neural networks. Appl. Comput. Harmon. Anal. 2020, 48, 787–794.
- Cucker, F.; Smale, S. On the mathematical foundations of learning. Bull. Amer. Math. Soc. 2001, 39, 1–49.
- An, C.P.; Chen, X.J.; Sloan, I.H.; Womersley, R.S. Regularized least squares approximations on the sphere using spherical designs. SIAM J. Numer. Anal. 2012, 50, 1513–1534.
- An, C.P.; Wu, H.N. Lasso hyperinterpolation over general regions. SIAM J. Sci. Comput. 2021, 43, A3967–A3991.
- An, C.P.; Ran, J.S. Hard thresholding hyperinterpolation over general regions. arXiv 2023, arXiv:2209.14634v2 [math.NA].
- De Mol, C.; De Vito, E.; Rosasco, L. Elastic-net regularization in learning theory. J. Complexity 2009, 25, 201–230.
- Fischer, S.; Steinwart, I. Sobolev norm learning rates for regularized least-squares algorithms. J. Mach. Learn. Res. 2020, 21, 8464–8501.
- Lai, J.F.; Li, Z.F.; Huang, D.G.; Lin, Q. The optimality of kernel classifiers in Sobolev space. arXiv 2024, arXiv:2402.01148v1 [math.ST].
- Sun, H.W.; Wu, Q. Least square regression with indefinite kernels and coefficient regularization. Appl. Comput. Harmon. Anal. 2011, 30, 96–109.
- Wu, Q.; Zhou, D.X. Learning with sample dependent hypothesis spaces. Comput. Math. Appl. 2008, 56, 2896–2907.
- Chen, H.; Wu, J.T.; Chen, D.R. Semi-supervised learning for regression based on the diffusion matrix (in Chinese). Sci. Sin. Math. 2014, 44, 399–408.
- Sun, X.J.; Sheng, B.H. The learning rate of kernel regularized regression associated with a correntropy-induced loss. Adv. in Math. (Beijing) 2024, 53, 633–652.
- Wu, Q.; Zhou, D.X. Analysis of support vector machine classification. J. Comput. Anal. Appl. 2006, 8, 99–119.
- Cucker, F.; Zhou, D.X. Learning Theory: An Approximation Theory Viewpoint. Cambridge University Press, New York, 2007.
- Sheng, B.H. Reproducing property of bounded linear operators and kernel regularized least square regressions. Int. J. Wavelets Multiresolution Inf. Process. 2024, 2450013.
- Steinwart, I.; Christmann, A. Support Vector Machines. Springer-Verlag, New York, 2008.
- Lin, S.B.; Wang, D.; Zhou, D.X. Sketching with spherical designs for noisy data fitting on spheres. SIAM J. Sci. Comput. 2024, 46, A313–A337.
- Lin, S.B.; Zeng, J.S.; Zhang, X.Q. Constructive neural network learning. IEEE Trans. on Cybernetics 2019, 49, 221–232.
- Mhaskar, H.N.; Micchelli, C.A. Degree of approximation by neural and translation networks with single hidden layer. Adv. Appl. Math. 1995, 16, 151–183.
- Sheng, B.H.; Zhou, S.P.; Li, H.T. On approximation by translation networks in Lp(Rk) spaces. Adv. in Math. (Beijing) 2007, 36, 29–38.
- Mhaskar, H.N.; Narcowich, F.J.; Ward, J.D. Approximation properties of zonal function networks using scattered data on the sphere. Adv. Comput. Math. 1999, 11, 121–137.
- Sheng, B.H. On approximation by reproducing kernel spaces in weighted Lp-spaces. J. Syst. Sci. Complex. 2007, 20(4), 623–638.
- Narcowich, F.J.; Ward, J.D.; Wendland, H. Sobolev error estimates and a Bernstein inequality for scattered data interpolation via radial basis functions. Constr. Approx. 2006, 24, 175–186.
- Narcowich, F.J.; Ward, J.D. Scattered data interpolation on spheres: error estimates and locally supported basis functions. SIAM J. Math. Anal. 2002, 33, 1393–1410.
- Narcowich, F.J.; Sun, X.P.; Ward, J.D.; Wendland, H. Direct and inverse Sobolev error estimates for scattered data interpolation via spherical basis functions. Found. Comput. Math. 2007, 7, 369–390.
- Gröchenig, K. Sampling, Marcinkiewicz-Zygmund inequalities, approximation and quadrature rules. J. Approx. Theory 2020, 257, 105455.
- Mhaskar, H.N.; Narcowich, F.J.; Sivakumar, N.; Ward, J.D. Approximation with interpolatory constraints. Proc. Amer. Math. Soc. 2001, 130, 1355–1364.
- Marzo, J. Marcinkiewicz-Zygmund inequalities and interpolation by spherical harmonics. J. Funct. Anal. 2007, 250, 559–587.
- Marzo, J.; Pridhnani, B. Sufficient conditions for sampling and interpolation on the sphere. Constr. Approx. 2014, 40, 241–257.
- Wang, H.P. Marcinkiewicz-Zygmund inequalities and interpolation by spherical polynomials with respect to doubling weights. J. Math. Anal. Appl. 2015, 423, 1630–1649.
- Gia, T.L.; Sloan, I.H. The uniform norm of hyperinterpolation on the unit sphere in an arbitrary number of dimensions. Constr. Approx. 2001, 17, 249–265.
- Sloan, I.H. Polynomial interpolation and hyperinterpolation over general regions. J. Approx. Theory 1995, 83, 238–254.
- Sloan, I.H.; Womersley, R.S. Constructive polynomial approximation on the sphere. J. Approx. Theory 2000, 103, 91–118.
- Wang, H.P. Optimal lower estimates for the worst case cubature error and the approximation by hyperinterpolation operators in the Sobolev space setting on the sphere. Int. J. Wavelets Multiresolution Inf. Process. 2009, 7, 813–823.
- Wang, H.P.; Wang, K.; Wang, X.L. On the norm of the hyperinterpolation operator on the d-dimensional cube. Comput. Appl. 2014, 68, 632–638.
- Sloan, I.H.; Womersley, R.S. Filtered hyperinterpolation: a constructive polynomial approximation on the sphere. Int. J. Geomath. 2012, 3, 95–117.
- Bondarenko, A.; Radchenko, D.; Viazovska, M. Well-separated spherical designs. Constr. Approx. 2015, 41, 93–112.
- Hesse, K.; Womersley, R.S. Numerical integration with polynomial exactness over a spherical cap. Adv. Comput. Math. 2012, 36, 451–483.
- Delsarte, P.; Goethals, J.M.; Seidel, J.J. Spherical codes and designs. Geom. Dedicata 1977, 6, 363–388.
- An, C.P.; Chen, X.J.; Sloan, I.H.; Womersley, R.S. Well conditioned spherical designs for integration and interpolation on the two-sphere. SIAM J. Numer. Anal. 2010, 48, 2135–2157.
- Chen, X.; Frommer, A.; Lang, B. Computational existence proof for spherical t-designs. Numer. Math. 2010, 117, 289–305.
- An, C.P.; Wu, H.N. Bypassing the quadrature exactness assumption of hyperinterpolation on the sphere. J. Complexity 2024, 80, 101789.
- An, C.P.; Wu, H.N. On the quadrature exactness in hyperinterpolation. BIT Numer. Math. 2022, 62, 1899–1919.
- Sheng, B.H. The covering numbers for some periodic reproducing kernel spaces. Acta Math. Scientia 2009, 29A(6), 1590–1600.
- Sheng, B.H. Estimate the norm of the Mercer kernel matrices with discrete orthogonal transforms. Acta Math. Hungar. 2009, 122, 339–355.
- Sheng, B.H.; Wang, J.L.; Li, P. On the covering number of some Mercer kernel Hilbert spaces. J. Complexity 2008, 24, 241–258.
- Zhang, C.P.; Sheng, B.H.; Chen, Z.X. Applications of Bernstein-Durrmeyer operators in estimating the norm of Mercer kernel matrices. Anal. Theory Appl. 2008, 24, 74–86.
- Sun, X.J.; Sheng, B.H.; Liu, L.; Pan, X.L. On the density of translation networks defined on the unit ball. Math. Found. Comput. 2023.
- Parhi, R.; Nowak, R.D. Banach space representer theorems for neural networks and ridge splines. J. Mach. Learn. Res. 2021, 22, 1–40.
- Oono, K.; Suzuki, Y.J. Approximation and non-parametric estimate of ResNet-type convolutional neural networks. arXiv 2023, arXiv:1903.10047v4 [stat.ML].
- Shen, G.H.; Jiao, Y.L.; Lin, Y.Y.; Huang, J. Non-asymptotic excess risk bounds for classification with deep convolutional neural networks. arXiv 2021, arXiv:2105.00292 [cs.LG].
- Mallat, S. Understanding deep convolutional networks. Phil. Trans. R. Soc. 2016, 374A, 20150203.
- Mhaskar, H.N.; Narcowich, F.J.; Ward, J.D. Spherical Marcinkiewicz-Zygmund inequalities and positive quadrature. Math. Comp. 2001, 70, 1113–1130; Corrigendum: Math. Comp. 2001, 71, 453–454.
- Wang, H.P.; Wang, K. Optimal recovery of Besov classes of generalized smoothness and Sobolev class on the sphere. J. Complexity 2016, 32, 40–52.
- Dai, F.; Xu, Y. Approximation Theory and Harmonic Analysis on Spheres and Balls. Springer, New York, 2013.
- Müller, C. Spherical Harmonics. Springer-Verlag, Berlin, 1966.
- Cheney, W.; Light, W. A Course in Approximation Theory. China Machine Press, Beijing, 2004.
- Brown, G.; Dai, F. Approximation of smooth functions on compact two-point homogeneous spaces. J. Funct. Anal. 2005, 220, 401–422.
- Dai, F.; Wang, H.P. Positive cubature formulas and Marcinkiewicz-Zygmund inequalities on spherical caps. Constr. Approx. 2010, 31, 1–36.
- Dai, F. On generalized hyperinterpolation on the sphere. Proc. Amer. Math. Soc. 2006, 134, 2931–2941.
- Sheng, B.H.; Wang, J.L. Moduli of smoothness, K-functionals and Jackson-type inequalities associated with kernel function approximation in learning theory. Anal. Appl. 2024.
- Bauschke, H.H.; Combettes, P.L. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, New York, 2010.
- Sheng, B.H.; Xiang, D.H. The convergence rate for a K-functional in learning theory. J. Inequal. Appl. 2010, 249507.
- Lin, S.B.; Wang, Y.G.; Zhou, D.X. Distributed filtered hyperinterpolation for noisy data on the sphere. SIAM J. Numer. Anal. 2021, 59, 634–659.
- Montúfar, G.; Wang, Y.G. Distributed learning via filtered hyperinterpolation on manifolds. Found. Comput. Math. 2022, 22, 1219–1271.
- Smale, S.; Zhou, D.X. Learning theory estimates via integral operators and their applications. Constr. Approx. 2007, 26, 153–172.
- Aronszajn, N. Theory of reproducing kernels. Trans. Amer. Math. Soc. 1950, 68, 337–404.
- Kyriazis, G.; Petrushev, P.; Xu, Y. Jacobi decomposition of weighted Triebel-Lizorkin and Besov spaces. arXiv 2006, arXiv:math/0610624 [math.CA].
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).