# Chao Ma

I am currently a Szegö Assistant Professor in the Department of Mathematics at Stanford University. My research interests lie in the theory and application of machine learning. I am especially interested in theoretically understanding the optimization behavior of deep neural networks, e.g. the implicit bias of optimization algorithms, as well as their connection with generalization. My mentor at Stanford is Professor Lexing Ying.

Before joining Stanford, I obtained my PhD from the Program in Applied and Computational Mathematics at Princeton University, under the supervision of Professor Weinan E. I received my bachelor's degree from the School of Mathematical Sciences at Peking University.

Here are my CV, Google Scholar, and LinkedIn.

Contact me at: chaoma [at] Stanford [dot] edu

**News**

**I am on the job market for 2023! Looking for tenure-track positions.**

Recent work: **Why Self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries**. We showed that certain symmetry in the embedding space induces structures like self-attention. PDF

Recent work: **The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks**. We identified an implicit bias for a wide class of neural networks. PDF, Twitter

## Recent & upcoming talks

**A Quasistatic Derivation of Optimization Algorithms' Exploration on the Manifold of Minima**

*AMS 2022 Fall Western Sectional Meeting: upcoming on 10/22/2022*

**Implicit bias of optimization algorithms for neural networks: static and dynamic perspectives**

*Math Machine Learning seminar MPI MIS + UCLA, 10/2022*

**Implicit biases of optimization algorithms for neural networks and their effects on generalization**

*University of California, Berkeley, 10/2022*

**Implicit biases of optimization algorithms for neural networks and their effects on generalization**

*Shanghai Jiao Tong University, 10/2022*

**Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes**

*Princeton University, 07/2022*, Slides

**Nonlocal Behavior of Neural Network Loss Landscape**

*ML Foundations seminar, Microsoft Research, 05/2022*

**On the linear stability of SGD and input smoothness of neural networks**

*Theoretically Inclined Machine Learning (TML) Seminar, University of Ottawa, 02/2022*, Slides

**Provably convergent quasistatic dynamics for mean-field two-player zero-sum games**

*Optimal transport and Mean field games Seminar, University of South Carolina, 01/2022*, Slides

**Provably convergent quasistatic dynamics for mean-field two-player zero-sum games**

*Stanford Applied Math Seminar, 01/2022*

## Recent & selected publications

**Why Self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries**

Chao Ma, Lexing Ying, *arXiv: 2210.06741*, PDF

**The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks**

Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli, *arXiv: 2210.03820*, PDF

**Correcting Convexity Bias in Function and Functional Estimate**

Chao Ma, Lexing Ying, *arXiv: 2208.07996*, PDF

**Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks**

Mingze Wang, Chao Ma, *arXiv: 2206.02139*, PDF

**Generalization Error Bounds for Deep Neural Networks Trained by SGD**

Mingze Wang, Chao Ma, *arXiv: 2206.03299*, PDF

**Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscape**

Chao Ma, Daniel Kunin, Lei Wu, Lexing Ying, *Journal of Machine Learning*, PDF

**Provably Convergent Quasistatic Dynamics for Mean-Field Two-Player Zero-Sum Games**

Chao Ma, Lexing Ying, *ICLR 2022*, PDF

**A Riemannian Mean Field Formulation for Two-layer Neural Networks with Batch Normalization**

Chao Ma, Lexing Ying, *Research in the Mathematical Sciences volume 9, Article number: 47 (2022)*, PDF

**On Linear Stability of SGD and Input-Smoothness of Neural Networks**

Chao Ma, Lexing Ying, *NeurIPS 2021*, PDF

**A mean-field analysis of deep resnet and beyond: Towards provable optimization via over-parameterization from depth**

Yuping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying, *ICML 2020*, PDF

**Uniformly Accurate Machine Learning Based Hydrodynamic Models for Kinetic Equations**

Jiequn Han, Chao Ma, Zheng Ma, Weinan E, *Proceedings of the National Academy of Sciences (2019): 201909854*

**Barron Spaces and the Flow-induced Function Spaces for Neural Network Models**

Weinan E, Chao Ma, Lei Wu, *Constructive Approximation*

**A comparative analysis of the optimization and generalization property of two-layer neural network and random feature models under gradient descent dynamics**

Weinan E, Chao Ma, Lei Wu, *Science China Mathematics 63 (7): 1235–1258*

**A priori estimates of the population risk for two-layer neural networks**

Weinan E, Chao Ma, Lei Wu, *Communications in Mathematical Sciences 17 (5), 1407-1425*

**How SGD Selects the Global Minima in Over-parameterized Learning: A Stability Perspective**

Lei Wu, Chao Ma, Weinan E, *NeurIPS 2018*, PDF