Publications and Preprints
Why Self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries
Chao Ma, Lexing Ying, arXiv:2210.06741, PDF

The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli, arXiv:2210.03820, PDF

Correcting Convexity Bias in Function and Functional Estimate
Chao Ma, Lexing Ying, arXiv:2208.07996, PDF

Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks
Mingze Wang, Chao Ma, arXiv:2206.02139, PDF

Generalization Error Bounds for Deep Neural Networks Trained by SGD
Mingze Wang, Chao Ma, arXiv:2206.03299, PDF

Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscape
Chao Ma, Daniel Kunin, Lei Wu, Lexing Ying, Journal of Machine Learning, PDF

Provably Convergent Quasistatic Dynamics for Mean-Field Two-Player Zero-Sum Games
Chao Ma, Lexing Ying, ICLR 2022, PDF

A Riemannian Mean Field Formulation for Two-layer Neural Networks with Batch Normalization
Chao Ma, Lexing Ying, Research in the Mathematical Sciences, volume 9, article 47 (2022), PDF

On Linear Stability of SGD and Input-Smoothness of Neural Networks
Chao Ma, Lexing Ying, NeurIPS 2021, PDF

Nonlinear Weighted Directed Acyclic Graph and A Priori Estimates for Neural Networks
Yuqing Li, Tao Luo, Chao Ma, SIAM Journal on Mathematics of Data Science 4 (2), 694-720, PDF

Achieving Adversarial Robustness Requires An Active Teacher
Chao Ma, Lexing Ying, Journal of Computational Mathematics, Vol. 39, No. 6, 2021, 880-896, PDF

Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning
Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, NeurIPS 2020, PDF

Towards a Mathematical Understanding of Neural Network-Based Machine Learning: What We Know and What We Don't
Weinan E, Chao Ma, Stephan Wojtowytsch, Lei Wu, CSIAM Trans. Appl. Math. Vol. 1, No. 4, pp. 561-615, PDF

Complexity Measures for Neural Networks with General Activation Functions Using Path-based Norms
Zhong Li, Chao Ma, Lei Wu, arXiv:2009.06132, PDF

A Qualitative Study of the Dynamic Behavior of Adaptive Gradient Algorithms
Chao Ma, Lei Wu, Weinan E, Mathematical and Scientific Machine Learning 2021, 1-22, PDF

The Slow Deterioration of the Generalization Error of the Random Feature Model
Chao Ma, Lei Wu, Weinan E, Mathematical and Scientific Machine Learning 2020, 373-389, PDF

The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models
Chao Ma, Lei Wu, Weinan E, arXiv:2006.14450, PDF

A Priori Estimates of the Generalization Error for Autoencoders
Zehao Dou, Weinan E, Chao Ma, ICASSP 2020, 3327-3331, Link

A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization via Overparameterization from Depth
Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying, ICML 2020, PDF

On the Generalization Properties of Minimum-norm Solutions for Over-parameterized Neural Network Models
Weinan E, Chao Ma, Lei Wu, arXiv:1912.06987, PDF

Modeling Subgrid-scale Force and Divergence of Heat Flux of Compressible Isotropic Turbulence by Artificial Neural Network
Chenyue Xie, Ke Li, Chao Ma, Jianchun Wang, Physical Review Fluids 4 (10), 104605, Link

Heterogeneous Multireference Alignment for Images with Application to 2D Classification in Single Particle Reconstruction
Chao Ma, Tamir Bendory, Nicolas Boumal, Fred Sigworth, Amit Singer, IEEE Transactions on Image Processing 29, 1699-1710, PDF

Uniformly Accurate Machine Learning Based Hydrodynamic Models for Kinetic Equations
Jiequn Han, Chao Ma, Zheng Ma, Weinan E, Proceedings of the National Academy of Sciences (2019): 201909854, PDF

Barron Spaces and the Flow-induced Function Spaces for Neural Network Models
Weinan E, Chao Ma, Lei Wu, Constructive Approximation, PDF

Artificial Neural Network Approach to Large-eddy Simulation of Compressible Isotropic Turbulence
Chenyue Xie, Jianchun Wang, Ke Li, Chao Ma, Phys. Rev. E 99, 053113, Link

A Priori Estimates of the Population Risk for Residual Networks
Weinan E, Chao Ma, Qingcan Wang, Communications in Mathematical Sciences, PDF

Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
Weinan E, Chao Ma, Qingcan Wang, Lei Wu, arXiv:1904.05263, PDF

A Comparative Analysis of the Optimization and Generalization Properties of Two-layer Neural Network and Random Feature Models under Gradient Descent Dynamics
Weinan E, Chao Ma, Lei Wu, Science China Mathematics 63 (7), 1235-1258, PDF

Machine Learning from a Continuous Viewpoint, I
Weinan E, Chao Ma, Lei Wu, Science China Mathematics (2020): 1-34, PDF

A Priori Estimates of the Population Risk for Two-layer Neural Networks
Weinan E, Chao Ma, Lei Wu, Communications in Mathematical Sciences 17 (5), 1407-1425, PDF

Global Convergence of Gradient Descent for Deep Linear Residual Networks
Lei Wu, Qingcan Wang, Chao Ma, NeurIPS 2019, PDF

Globally Convergent Levenberg-Marquardt Method for Phase Retrieval
Chao Ma, Xin Liu, Zaiwen Wen, IEEE Transactions on Information Theory 65 (4), 2343-2359, Link

Model Reduction with Memory and the Machine Learning of Dynamical Systems
Chao Ma, Jianchun Wang, Weinan E, Commun. Comput. Phys., 25 (2019), pp. 947-962, PDF

How SGD Selects the Global Minima in Over-parameterized Learning: A Stability Perspective
Lei Wu, Chao Ma, Weinan E, NeurIPS 2018, PDF

Bispectrum Inversion with Application to Multireference Alignment
Tamir Bendory, Nicolas Boumal, Chao Ma, Zhizhen Zhao, Amit Singer, IEEE Transactions on Signal Processing 66 (4): 1037-1050, PDF