Chao Ma
I am currently a Szegö Assistant Professor in the Department of Mathematics at Stanford University. My research interests lie in the theory and application of machine learning. I am especially interested in theoretically understanding the optimization behavior of deep neural networks, e.g., the implicit bias of optimization algorithms and its connection with generalization. My mentor at Stanford is Professor Lexing Ying.
Before joining Stanford, I obtained my PhD from the Program in Applied and Computational Mathematics at Princeton University, under the supervision of Professor Weinan E. I received my bachelor's degree from the School of Mathematical Sciences at Peking University.
Here are my CV, Google Scholar, and LinkedIn.
Contact me at: chaoma [at] stanford [dot] edu
News
I am on the job market for 2023! I am looking for tenure-track positions.
Recent work: Why Self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries. We showed that a certain symmetry of the embedding space induces structures like self-attention. PDF
Recent work: The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks. We identified an implicit bias for a wide class of neural networks. PDF, Twitter
Recent & upcoming talks
A Quasistatic Derivation of Optimization Algorithms' Exploration on the Manifold of Minima
AMS 2022 Fall Western Sectional Meeting, upcoming on 10/22/2022
Implicit bias of optimization algorithms for neural networks: static and dynamic perspectives
Math Machine Learning seminar, MPI MIS + UCLA, 10/2022
Implicit biases of optimization algorithms for neural networks and their effects on generalization
University of California, Berkeley, 10/2022
Implicit biases of optimization algorithms for neural networks and their effects on generalization
Shanghai Jiao Tong University, 10/2022
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Princeton University, 07/2022, Slides
Nonlocal Behavior of Neural Network Loss Landscape
ML Foundations Seminar, Microsoft Research, 05/2022
On the linear stability of SGD and input smoothness of neural networks
Theoretically Inclined Machine Learning (TML) Seminar, University of Ottawa, 02/2022, Slides
Provably convergent quasistatic dynamics for mean-field two-player zero-sum games
Optimal Transport and Mean Field Games Seminar, University of South Carolina, 01/2022, Slides
Provably convergent quasistatic dynamics for mean-field two-player zero-sum games
Stanford Applied Math Seminar, 01/2022
Recent & selected publications
Why Self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries
Chao Ma, Lexing Ying, arXiv: 2210.06741, PDF
The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli, arXiv: 2210.03820, PDF
Correcting Convexity Bias in Function and Functional Estimate
Chao Ma, Lexing Ying, arXiv: 2208.07996, PDF
Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks
Mingze Wang, Chao Ma, arXiv: 2206.02139, PDF
Generalization Error Bounds for Deep Neural Networks Trained by SGD
Mingze Wang, Chao Ma, arXiv: 2206.03299, PDF
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Chao Ma, Daniel Kunin, Lei Wu, Lexing Ying, Journal of Machine Learning, PDF
Provably Convergent Quasistatic Dynamics for Mean-Field Two-Player Zero-Sum Games
Chao Ma, Lexing Ying, ICLR 2022, PDF
A Riemannian Mean Field Formulation for Two-layer Neural Networks with Batch Normalization
Chao Ma, Lexing Ying, Research in the Mathematical Sciences volume 9, Article number: 47 (2022), PDF
On Linear Stability of SGD and Input-Smoothness of Neural Networks
Chao Ma, Lexing Ying, NeurIPS 2021, PDF
A mean-field analysis of deep ResNet and beyond: Towards provable optimization via over-parameterization from depth
Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying, ICML 2020, PDF
Uniformly Accurate Machine Learning Based Hydrodynamic Models for Kinetic Equations
Jiequn Han, Chao Ma, Zheng Ma, Weinan E, Proceedings of the National Academy of Sciences (2019): 201909854. PDF
Barron Spaces and the Flow-induced Function Spaces for Neural Network Models
Weinan E, Chao Ma, Lei Wu, Constructive Approximation. PDF
A comparative analysis of the optimization and generalization properties of two-layer neural network and random feature models under gradient descent dynamics
Weinan E, Chao Ma, Lei Wu, Science China Mathematics 63 (7): 1235–1258, PDF
A priori estimates of the population risk for two-layer neural networks
Weinan E, Chao Ma, Lei Wu, Communications in Mathematical Sciences 17 (5), 1407-1425, PDF
How SGD Selects the Global Minima in Over-parameterized Learning: A Stability Perspective
Lei Wu, Chao Ma, Weinan E, NeurIPS 2018, PDF