Chao Ma
I am currently a Szegö Assistant Professor in the Department of Mathematics at Stanford University. My research interests lie in the theory and application of machine learning. I am especially interested in theoretically understanding the optimization behavior of deep neural networks, e.g., the implicit bias of optimization algorithms and its connection with generalization. My mentor at Stanford is Professor Lexing Ying.
Before joining Stanford, I obtained my PhD from the Program in Applied and Computational Mathematics at Princeton University, under the supervision of Professor Weinan E. I received my bachelor's degree from the School of Mathematical Sciences at Peking University.
Here are my CV, Google Scholar, and LinkedIn.
Contact me at: chaoma [at] stanford [dot] edu
News
I am on the job market for 2023! I am looking for tenure-track positions.
Recent work: Why Self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries. We showed that a certain symmetry of the embedding space induces structures like self-attention. PDF
Recent work: The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks. We identified an implicit bias for a wide class of neural networks. PDF, Twitter
Recent & upcoming talks
A Quasistatic Derivation of Optimization Algorithms' Exploration on the Manifold of Minima
AMS 2022 Fall Western Sectional Meeting, upcoming on 10/22/2022
Implicit bias of optimization algorithms for neural networks: static and dynamic perspectives
Math Machine Learning seminar, MPI MIS + UCLA, 10/2022
Implicit biases of optimization algorithms for neural networks and their effects on generalization
University of California, Berkeley, 10/2022
Implicit biases of optimization algorithms for neural networks and their effects on generalization
Shanghai Jiao Tong University, 10/2022
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Princeton University, 07/2022, Slides
Nonlocal Behavior of Neural Network Loss Landscape
ML Foundations Seminar, Microsoft Research, 05/2022
On the linear stability of SGD and input smoothness of neural networks
Theoretically Inclined Machine Learning (TML) Seminar, University of Ottawa, 02/2022, Slides
Provably convergent quasistatic dynamics for mean-field two-player zero-sum games
Optimal Transport and Mean Field Games Seminar, University of South Carolina, 01/2022, Slides
Provably convergent quasistatic dynamics for mean-field two-player zero-sum games
Stanford Applied Math Seminar, 01/2022
Recent & selected publications
Why Self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries
Chao Ma, Lexing Ying, arXiv: 2210.06741, PDF
The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli, arXiv: 2210.03820, PDF
Correcting Convexity Bias in Function and Functional Estimate
Chao Ma, Lexing Ying, arXiv: 2208.07996, PDF
Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks
Mingze Wang, Chao Ma, arXiv: 2206.02139, PDF
Generalization Error Bounds for Deep Neural Networks Trained by SGD
Mingze Wang, Chao Ma, arXiv: 2206.03299, PDF
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Chao Ma, Daniel Kunin, Lei Wu, Lexing Ying, Journal of Machine Learning, PDF
Provably Convergent Quasistatic Dynamics for Mean-Field Two-Player Zero-Sum Games
Chao Ma, Lexing Ying, ICLR 2022, PDF
A Riemannian Mean Field Formulation for Two-layer Neural Networks with Batch Normalization
Chao Ma, Lexing Ying, Research in the Mathematical Sciences volume 9, Article number: 47 (2022), PDF
On Linear Stability of SGD and Input-Smoothness of Neural Networks
Chao Ma, Lexing Ying, NeurIPS 2021, PDF
A mean-field analysis of deep ResNet and beyond: Towards provable optimization via over-parameterization from depth
Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying, ICML 2020, PDF
Uniformly Accurate Machine Learning Based Hydrodynamic Models for Kinetic Equations
Jiequn Han, Chao Ma, Zheng Ma, Weinan E, Proceedings of the National Academy of Sciences (2019): 201909854. PDF
Barron Spaces and the Flow-induced Function Spaces for Neural Network Models
Weinan E, Chao Ma, Lei Wu, Constructive Approximation. PDF
A comparative analysis of the optimization and generalization properties of two-layer neural network and random feature models under gradient descent dynamics
Weinan E, Chao Ma, Lei Wu, Science China Mathematics 63 (7): 1235–1258, PDF
A priori estimates of the population risk for two-layer neural networks
Weinan E, Chao Ma, Lei Wu, Communications in Mathematical Sciences 17 (5), 1407-1425, PDF
How SGD Selects the Global Minima in Over-parameterized Learning: A Stability Perspective
Lei Wu, Chao Ma, Weinan E, NeurIPS 2018, PDF