Publications and Preprints

  1. Why Self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries
    Chao Ma, Lexing Ying, arXiv:2210.06741, PDF

  2. The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
    Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli, arXiv:2210.03820, PDF

  3. Correcting Convexity Bias in Function and Functional Estimate
    Chao Ma, Lexing Ying, arXiv:2208.07996, PDF

  4. Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks
    Mingze Wang, Chao Ma, arXiv:2206.02139, PDF

  5. Generalization Error Bounds for Deep Neural Networks Trained by SGD
    Mingze Wang, Chao Ma, arXiv:2206.03299, PDF

  6. Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscape
    Chao Ma, Daniel Kunin, Lei Wu, Lexing Ying, Journal of Machine Learning, PDF

  7. Provably Convergent Quasistatic Dynamics for Mean-Field Two-Player Zero-Sum Games
    Chao Ma, Lexing Ying, ICLR 2022, PDF

  8. A Riemannian Mean Field Formulation for Two-layer Neural Networks with Batch Normalization
    Chao Ma, Lexing Ying, Research in the Mathematical Sciences 9, Article 47 (2022), PDF

  9. On Linear Stability of SGD and Input-Smoothness of Neural Networks
    Chao Ma, Lexing Ying, NeurIPS 2021, PDF

  10. Nonlinear Weighted Directed Acyclic Graph and A Priori Estimates for Neural Networks
    Yuqing Li, Tao Luo, Chao Ma, SIAM Journal on Mathematics of Data Science 4 (2), 694-720, PDF

  11. Achieving Adversarial Robustness Requires An Active Teacher
    Chao Ma, Lexing Ying, Journal of Computational Mathematics, Vol. 39, No. 6, 2021, 880-896, PDF

  12. Towards Theoretically Understanding Why SGD Generalizes Better Than Adam in Deep Learning
    Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, NeurIPS 2020, PDF

  13. Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't
    Weinan E, Chao Ma, Stephan Wojtowytsch, Lei Wu, CSIAM Trans. Appl. Math. Vol. 1, No. 4, pp. 561-615, PDF

  14. Complexity Measures for Neural Networks with General Activation Functions Using Path-based Norms
    Zhong Li, Chao Ma, Lei Wu, arXiv:2009.06132, PDF

  15. A Qualitative Study of the Dynamic Behavior of Adaptive Gradient Algorithms
    Chao Ma, Lei Wu, Weinan E, Mathematical and Scientific Machine Learning 2021, 1-22, PDF

  16. The Slow Deterioration of the Generalization Error of the Random Feature Model
    Chao Ma, Lei Wu, Weinan E, Mathematical and Scientific Machine Learning 2020, 373-389, PDF

  17. The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models
    Chao Ma, Lei Wu, Weinan E, arXiv:2006.14450, PDF

  18. A Priori Estimates of the Generalization Error for Autoencoders
    Zehao Dou, Weinan E, Chao Ma, ICASSP 2020, 3327-3331, Link

  19. A mean-field analysis of deep ResNet and beyond: Towards provable optimization via overparameterization from depth
    Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying, ICML 2020, PDF

  20. On the generalization properties of minimum-norm solutions for over-parameterized neural network models
    Weinan E, Chao Ma, Lei Wu, arXiv:1912.06987, PDF

  21. Modeling subgrid-scale force and divergence of heat flux of compressible isotropic turbulence by artificial neural network
    Chenyue Xie, Ke Li, Chao Ma, Jianchun Wang, Physical Review Fluids 4 (10), 104605, Link

  22. Heterogeneous Multireference Alignment for Images With Application to 2D Classification in Single Particle Reconstruction
    Chao Ma, Tamir Bendory, Nicolas Boumal, Fred Sigworth, Amit Singer, IEEE Transactions on Image Processing 29, 1699-1710, PDF

  23. Uniformly Accurate Machine Learning Based Hydrodynamic Models for Kinetic Equations
    Jiequn Han, Chao Ma, Zheng Ma, Weinan E, Proceedings of the National Academy of Sciences (2019): 201909854, PDF

  24. Barron Spaces and the Flow-induced Function Spaces for Neural Network Models
    Weinan E, Chao Ma, Lei Wu, Constructive Approximation, PDF

  25. Artificial neural network approach to large-eddy simulation of compressible isotropic turbulence
    Chenyue Xie, Jianchun Wang, Ke Li, Chao Ma, Phys. Rev. E 99, 053113, Link

  26. A priori estimates of the population risk for residual networks
    Weinan E, Chao Ma, Qingcan Wang, Communications in Mathematical Sciences, PDF

  27. Analysis of the gradient descent algorithm for a deep neural network model with skip-connections
    Weinan E, Chao Ma, Qingcan Wang, Lei Wu, arXiv:1904.05263, PDF

  28. A comparative analysis of the optimization and generalization property of two-layer neural network and random feature models under gradient descent dynamics
    Weinan E, Chao Ma, Lei Wu, Science China Mathematics 63 (7): 1235-1258, PDF

  29. Machine learning from a continuous viewpoint, I
    Weinan E, Chao Ma, Lei Wu, Science China Mathematics (2020): 1-34, PDF

  30. A priori estimates of the population risk for two-layer neural networks
    Weinan E, Chao Ma, Lei Wu, Communications in Mathematical Sciences 17 (5), 1407-1425, PDF

  31. Global convergence of gradient descent for deep linear residual networks
    Lei Wu, Qingcan Wang, Chao Ma, NeurIPS 2019, PDF

  32. Globally Convergent Levenberg-Marquardt Method For Phase Retrieval
    Chao Ma, Xin Liu, Zaiwen Wen, IEEE Transactions on Information Theory 65 (4), 2343-2359, Link

  33. Model Reduction with Memory and the Machine Learning of Dynamical Systems
    Chao Ma, Jianchun Wang, Weinan E, Commun. Comput. Phys., 25 (2019), pp. 947-962, PDF

  34. How SGD Selects the Global Minima in Over-parameterized Learning: A Stability Perspective
    Lei Wu, Chao Ma, Weinan E, NeurIPS 2018, PDF

  35. Bispectrum Inversion with Application to Multireference Alignment
    Tamir Bendory, Nicolas Boumal, Chao Ma, Zhizhen Zhao, Amit Singer, IEEE Transactions on Signal Processing 66 (4), 1037-1050, PDF