# Chao Ma

I joined the D.E. Shaw Group as a quantitative analyst in July 2023.

Previously, I was a Szegö Assistant Professor in the Department of Mathematics at Stanford University. My research interests lie in the theory and application of machine learning. I am especially interested in theoretically understanding the optimization behavior of deep neural networks, e.g., the implicit bias of optimization algorithms and its connection with generalization. My mentor at Stanford was Professor Lexing Ying.

Before joining Stanford, I obtained my PhD from the Program in Applied and Computational Mathematics at Princeton University, under the supervision of Professor Weinan E. I received my bachelor's degree from the School of Mathematical Sciences at Peking University.

Here are my CV, Google Scholar, and LinkedIn.

Contact me at: chaom123x [at] gmail [dot] com

## Recent & upcoming talks

A Quasistatic Derivation of Optimization Algorithms' Exploration on the Manifold of Minima

AMS 2022 Fall Western Sectional Meeting: upcoming on 10/22/2022

Implicit bias of optimization algorithms for neural networks: static and dynamic perspectives

Math Machine Learning seminar MPI MIS + UCLA, 10/2022

Implicit biases of optimization algorithms for neural networks and their effects on generalization

University of California, Berkeley, 10/2022

Implicit biases of optimization algorithms for neural networks and their effects on generalization

Shanghai Jiao Tong University, 10/2022

Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes

Princeton University, 07/2022, Slides

Nonlocal Behavior of Neural Network Loss Landscape

ML Foundations seminar, Microsoft Research, 05/2022

On the linear stability of SGD and input smoothness of neural networks

Theoretically Inclined Machine Learning (TML) Seminar, University of Ottawa, 02/2022, Slides

Provably convergent quasistatic dynamics for mean-field two-player zero-sum games

Optimal transport and Mean field games Seminar, University of South Carolina, 01/2022, Slides

Provably convergent quasistatic dynamics for mean-field two-player zero-sum games

Stanford Applied Math Seminar, 01/2022

## Recent & selected publications

Why Self-attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries

Chao Ma, Lexing Ying, arXiv: 2210.06741, PDF

The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks

Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli, arXiv: 2210.03820, PDF

Correcting Convexity Bias in Function and Functional Estimate

Chao Ma, Lexing Ying, arXiv: 2208.07996, PDF

Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks

Mingze Wang, Chao Ma, arXiv: 2206.02139, PDF

Generalization Error Bounds for Deep Neural Networks Trained by SGD

Mingze Wang, Chao Ma, arXiv: 2206.03299, PDF

Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscape

Chao Ma, Daniel Kunin, Lei Wu, Lexing Ying, Journal of Machine Learning, PDF

Provably Convergent Quasistatic Dynamics for Mean-Field Two-Player Zero-Sum Games

Chao Ma, Lexing Ying, ICLR 2022, PDF

A Riemannian Mean Field Formulation for Two-layer Neural Networks with Batch Normalization

Chao Ma, Lexing Ying, Research in the Mathematical Sciences volume 9, Article number: 47 (2022), PDF

On Linear Stability of SGD and Input-Smoothness of Neural Networks

Chao Ma, Lexing Ying, NeurIPS 2021, PDF

A mean-field analysis of deep resnet and beyond: Towards provable optimization via over-parameterization from depth

Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying, ICML 2020, PDF

Uniformly Accurate Machine Learning Based Hydrodynamic Models for Kinetic Equations

Jiequn Han, Chao Ma, Zheng Ma, Weinan E, Proceedings of the National Academy of Sciences (2019): 201909854, PDF

Barron Spaces and the Flow-induced Function Spaces for Neural Network Models

Weinan E, Chao Ma, Lei Wu, Constructive Approximation, PDF

A comparative analysis of the optimization and generalization property of two-layer neural network and random feature models under gradient descent dynamics

Weinan E, Chao Ma, Lei Wu, Science China Mathematics 63 (7): 1235–1258, PDF

A priori estimates of the population risk for two-layer neural networks

Weinan E, Chao Ma, Lei Wu, Communications in Mathematical Sciences 17 (5): 1407–1425, PDF

How SGD Selects the Global Minima in Over-parameterized Learning: A Stability Perspective

Lei Wu, Chao Ma, Weinan E, NeurIPS 2018, PDF