Chao Ma

I am currently a Szegö Assistant Professor in the Department of Mathematics at Stanford University. My research interests lie in the theory and application of machine learning. I am especially interested in theoretically understanding the optimization behavior of deep neural networks, e.g., the implicit bias of optimization algorithms and its connection with generalization. My mentor at Stanford is Professor Lexing Ying.

Before joining Stanford, I obtained my PhD from the Program in Applied and Computational Mathematics at Princeton University, under the supervision of Professor Weinan E. I received my bachelor's degree from the School of Mathematical Sciences at Peking University.

Here are my CV, Google Scholar, and LinkedIn.

Contact me at: chaoma [at] stanford [dot] edu

News

  • I am on the job market for 2023! I am looking for tenure-track positions.

  • Recent work: Why Self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries. We showed that a certain symmetry of the embedding space induces structures like self-attention. PDF

  • Recent work: The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks. We identified an implicit bias for a wide class of neural networks. PDF, Twitter

Recent & upcoming talks

  1. A Quasistatic Derivation of Optimization Algorithms' Exploration on the Manifold of Minima
    AMS 2022 Fall Western Sectional Meeting, upcoming on 10/22/2022

  2. Implicit bias of optimization algorithms for neural networks: static and dynamic perspectives
    Math Machine Learning seminar MPI MIS + UCLA, 10/2022

  3. Implicit biases of optimization algorithms for neural networks and their effects on generalization
    University of California, Berkeley, 10/2022

  4. Implicit biases of optimization algorithms for neural networks and their effects on generalization
    Shanghai Jiao Tong University, 10/2022

  5. Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
    Princeton University, 07/2022, Slides

  6. Nonlocal Behavior of Neural Network Loss Landscape
    ML Foundations Seminar, Microsoft Research, 05/2022

  7. On the linear stability of SGD and input smoothness of neural networks
    Theoretically Inclined Machine Learning (TML) Seminar, University of Ottawa, 02/2022, Slides

  8. Provably convergent quasistatic dynamics for mean-field two-player zero-sum games
    Optimal Transport and Mean Field Games Seminar, University of South Carolina, 01/2022, Slides

  9. Provably convergent quasistatic dynamics for mean-field two-player zero-sum games
    Stanford Applied Math Seminar, 01/2022

All talks

Recent & selected publications

  1. Why Self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries
    Chao Ma, Lexing Ying, arXiv: 2210.06741, PDF

  2. The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
    Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli, arXiv: 2210.03820, PDF

  3. Correcting Convexity Bias in Function and Functional Estimate
    Chao Ma, Lexing Ying, arXiv: 2208.07996, PDF

  4. Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks
    Mingze Wang, Chao Ma, arXiv: 2206.02139, PDF

  5. Generalization Error Bounds for Deep Neural Networks Trained by SGD
    Mingze Wang, Chao Ma, arXiv: 2206.03299, PDF

  6. Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscape
    Chao Ma, Daniel Kunin, Lei Wu, Lexing Ying, Journal of Machine Learning, PDF

  7. Provably Convergent Quasistatic Dynamics for Mean-Field Two-Player Zero-Sum Games
    Chao Ma, Lexing Ying, ICLR 2022, PDF

  8. A Riemannian Mean Field Formulation for Two-layer Neural Networks with Batch Normalization
    Chao Ma, Lexing Ying, Research in the Mathematical Sciences, volume 9, article 47 (2022), PDF

  9. On Linear Stability of SGD and Input-Smoothness of Neural Networks
    Chao Ma, Lexing Ying, NeurIPS 2021, PDF

  10. A mean-field analysis of deep ResNet and beyond: Towards provable optimization via over-parameterization from depth
    Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying, ICML 2020, PDF

  11. Uniformly Accurate Machine Learning Based Hydrodynamic Models for Kinetic Equations
    Jiequn Han, Chao Ma, Zheng Ma, Weinan E, Proceedings of the National Academy of Sciences (2019): 201909854, PDF

  12. Barron Spaces and the Flow-induced Function Spaces for Neural Network Models
    Weinan E, Chao Ma, Lei Wu, Constructive Approximation, PDF

  13. A comparative analysis of the optimization and generalization property of two-layer neural network and random feature models under gradient descent dynamics
    Weinan E, Chao Ma, Lei Wu, Science China Mathematics 63 (7): 1235–1258, PDF

  14. A priori estimates of the population risk for two-layer neural networks
    Weinan E, Chao Ma, Lei Wu, Communications in Mathematical Sciences 17 (5): 1407–1425, PDF

  15. How SGD Selects the Global Minima in Over-parameterized Learning: A Stability Perspective
    Lei Wu, Chao Ma, Weinan E, NeurIPS 2018, PDF

All publications