Publications and Preprints
Why Self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries
Chao Ma, Lexing Ying, arXiv:2210.06741, PDF

The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli, arXiv:2210.03820, PDF

Correcting Convexity Bias in Function and Functional Estimate
Chao Ma, Lexing Ying, arXiv:2208.07996, PDF

Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks
Mingze Wang, Chao Ma, arXiv:2206.02139, PDF

Generalization Error Bounds for Deep Neural Networks Trained by SGD
Mingze Wang, Chao Ma, arXiv:2206.03299, PDF

Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscape
Chao Ma, Daniel Kunin, Lei Wu, Lexing Ying, Journal of Machine Learning, PDF

Provably Convergent Quasistatic Dynamics for Mean-Field Two-Player Zero-Sum Games
Chao Ma, Lexing Ying, ICLR 2022, PDF

A Riemannian Mean Field Formulation for Two-layer Neural Networks with Batch Normalization
Chao Ma, Lexing Ying, Research in the Mathematical Sciences, volume 9, article 47 (2022), PDF

On Linear Stability of SGD and Input-Smoothness of Neural Networks
Chao Ma, Lexing Ying, NeurIPS 2021, PDF

Nonlinear Weighted Directed Acyclic Graph and A Priori Estimates for Neural Networks
Yuqing Li, Tao Luo, Chao Ma, SIAM Journal on Mathematics of Data Science 4 (2), 694-720, PDF

Achieving Adversarial Robustness Requires An Active Teacher
Chao Ma, Lexing Ying, Journal of Computational Mathematics, Vol. 39, No. 6, 2021, 880-896, PDF

Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning
Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, NeurIPS 2020, PDF

Towards a Mathematical Understanding of Neural Network-Based Machine Learning: What We Know and What We Don't
Weinan E, Chao Ma, Stephan Wojtowytsch, Lei Wu, CSIAM Trans. Appl. Math. Vol. 1, No. 4, pp. 561-615, PDF

Complexity Measures for Neural Networks with General Activation Functions Using Path-based Norms
Zhong Li, Chao Ma, Lei Wu, arXiv:2009.06132, PDF

A Qualitative Study of the Dynamic Behavior of Adaptive Gradient Algorithms
Chao Ma, Lei Wu, Weinan E, Mathematical and Scientific Machine Learning 2021, 1-22, PDF

The Slow Deterioration of the Generalization Error of the Random Feature Model
Chao Ma, Lei Wu, Weinan E, Mathematical and Scientific Machine Learning 2020, 373-389, PDF

The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models
Chao Ma, Lei Wu, Weinan E, arXiv:2006.14450, PDF

A Priori Estimates of the Generalization Error for Autoencoders
Zehao Dou, Weinan E, Chao Ma, ICASSP 2020, 3327-3331, Link

A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization via Overparameterization from Depth
Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying, ICML 2020, PDF

On the Generalization Properties of Minimum-norm Solutions for Over-parameterized Neural Network Models
Weinan E, Chao Ma, Lei Wu, arXiv:1912.06987, PDF

Modeling Subgrid-scale Force and Divergence of Heat Flux of Compressible Isotropic Turbulence by Artificial Neural Network
Chenyue Xie, Ke Li, Chao Ma, Jianchun Wang, Physical Review Fluids 4 (10), 104605, Link

Heterogeneous Multireference Alignment for Images with Application to 2D Classification in Single Particle Reconstruction
Chao Ma, Tamir Bendory, Nicolas Boumal, Fred Sigworth, Amit Singer, IEEE Transactions on Image Processing 29, 1699-1710, PDF

Uniformly Accurate Machine Learning Based Hydrodynamic Models for Kinetic Equations
Jiequn Han, Chao Ma, Zheng Ma, Weinan E, Proceedings of the National Academy of Sciences (2019): 201909854, PDF

Barron Spaces and the Flow-induced Function Spaces for Neural Network Models
Weinan E, Chao Ma, Lei Wu, Constructive Approximation, PDF

Artificial Neural Network Approach to Large-eddy Simulation of Compressible Isotropic Turbulence
Chenyue Xie, Jianchun Wang, Ke Li, Chao Ma, Phys. Rev. E 99, 053113, Link

A Priori Estimates of the Population Risk for Residual Networks
Weinan E, Chao Ma, Qingcan Wang, Communications in Mathematical Sciences, PDF

Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
Weinan E, Chao Ma, Qingcan Wang, Lei Wu, arXiv:1904.05263, PDF

A Comparative Analysis of the Optimization and Generalization Properties of Two-layer Neural Network and Random Feature Models under Gradient Descent Dynamics
Weinan E, Chao Ma, Lei Wu, Science China Mathematics 63 (7), 1235-1258, PDF

Machine Learning from a Continuous Viewpoint, I
Weinan E, Chao Ma, Lei Wu, Science China Mathematics (2020): 1-34, PDF

A Priori Estimates of the Population Risk for Two-layer Neural Networks
Weinan E, Chao Ma, Lei Wu, Communications in Mathematical Sciences 17 (5), 1407-1425, PDF

Global Convergence of Gradient Descent for Deep Linear Residual Networks
Lei Wu, Qingcan Wang, Chao Ma, NeurIPS 2019, PDF

Globally Convergent Levenberg-Marquardt Method for Phase Retrieval
Chao Ma, Xin Liu, Zaiwen Wen, IEEE Transactions on Information Theory 65 (4), 2343-2359, Link

Model Reduction with Memory and the Machine Learning of Dynamical Systems
Chao Ma, Jianchun Wang, Weinan E, Commun. Comput. Phys., 25 (2019), pp. 947-962, PDF

How SGD Selects the Global Minima in Over-parameterized Learning: A Stability Perspective
Lei Wu, Chao Ma, Weinan E, NeurIPS 2018, PDF

Bispectrum Inversion with Application to Multireference Alignment
Tamir Bendory, Nicolas Boumal, Chao Ma, Zhizhen Zhao, Amit Singer, IEEE Transactions on Signal Processing 66 (4): 1037-1050, PDF