I look for the right way to understand things.
Then the solution is usually simple.

Shuxin Zheng 郑书新
Vice President, Zhongguancun Academy of AI (中关村人工智能研究院副院长)
Previously Principal Researcher at Microsoft Research

I start from mathematics, not trends. When you see the structure clearly, ideas transfer across domains that seem unrelated — from distributed optimization to molecular science to language models.

I believe in less structure, more intelligence. Strip away the scaffolding and let the model learn what matters. If a method needs too many tricks to work, it's probably wrong.

I pick problems half a step ahead — early enough to shape the direction, not so early that nobody cares. If the timing is wrong, I'd rather not do it.

The domains change. The taste doesn't.

BioEmu / DiG Science Cover 2025

Compressed the simulation of protein conformational dynamics from years to hours, making the Boltzmann distribution of biomolecules computationally accessible for the first time.

Graphormer NeurIPS 2021

A Transformer that natively understands graph structure. Won KDD Cup 2021 and the Open Catalyst Challenge — proving that general architectures can beat domain-specific ones.
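The idea is lighter than it sounds: graph structure enters as a bias on the attention logits rather than as message passing. A rough sketch of the spatial-encoding term (the names and distance cutoff here are illustrative, not the released Graphormer code):

```python
import torch
import torch.nn as nn

class GraphAttentionBias(nn.Module):
    """Sketch of a Graphormer-style structural bias: a learnable scalar per
    shortest-path distance (and per head) is added to the attention logits,
    so a standard Transformer can 'see' graph topology."""

    def __init__(self, max_dist: int = 20, n_heads: int = 8):
        super().__init__()
        # one bias per (distance, head); the last index covers "farther / unreachable"
        self.spatial_bias = nn.Embedding(max_dist + 1, n_heads)

    def forward(self, shortest_path_dist: torch.Tensor) -> torch.Tensor:
        # shortest_path_dist: (n_nodes, n_nodes) integer hop counts
        d = shortest_path_dist.long().clamp(max=self.spatial_bias.num_embeddings - 1)
        # returns (n_heads, n_nodes, n_nodes), added to Q·K^T before the softmax
        return self.spatial_bias(d).permute(2, 0, 1)
```

In the full model this bias, together with degree-based centrality embeddings and edge-feature terms, is simply added to the attention scores of an otherwise standard Transformer.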

Pre-Layer Normalization Adopted by GPT

A simple reordering of normalization layers that stabilized deep Transformer training. Quietly became a default in OpenAI's GPT series and most large language models since.
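The change is easiest to see in code. A minimal PyTorch sketch of one Pre-LN block, with illustrative dimensions rather than any specific GPT configuration:

```python
import torch
import torch.nn as nn

class PreLNBlock(nn.Module):
    """One Transformer block with Pre-Layer Normalization.

    LayerNorm is applied *before* attention and the MLP, so the residual
    path stays an identity map and gradients flow cleanly through deep
    stacks. Post-LN (the original 2017 layout) instead computes
    LayerNorm(x + Sublayer(x)) after each residual add.
    """

    def __init__(self, d_model: int = 512, n_heads: int = 8, d_ff: int = 2048):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pre-LN: x + Sublayer(LayerNorm(x))
        h = self.ln1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        x = x + self.mlp(self.ln2(x))
        return x
```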

DC-ASGD ICML 2017

The first mathematically rigorous delay compensation for asynchronous SGD — solving a problem that had been patched with heuristics for years.
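The compensated update itself fits in a few lines. An illustrative NumPy sketch of the rule, with the learning rate and the variance-control coefficient lam as placeholder values:

```python
import numpy as np

def dc_asgd_update(w_current, w_stale, grad_stale, lr=0.1, lam=0.04):
    """Delay-compensated ASGD step (sketch).

    A worker sends grad_stale = g(w_stale), computed on an old copy of the
    parameters; by the time it arrives, the server holds w_current.
    The fresh gradient g(w_current) is approximated by a first-order Taylor
    expansion whose Hessian is replaced by a cheap diagonal
    outer-product-of-gradients estimate:

        g(w_current) ≈ g(w_stale)
                       + lam * g(w_stale) * g(w_stale) * (w_current - w_stale)
    """
    compensated = grad_stale + lam * grad_stale * grad_stale * (w_current - w_stale)
    return w_current - lr * compensated

# toy usage on f(w) = 0.5 * ||w||^2, where the true gradient is w itself
w_stale = np.array([1.0, -2.0])
w_current = np.array([0.8, -1.5])   # the server moved on while the worker computed
w_next = dc_asgd_update(w_current, w_stale, grad_stale=w_stale)
```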

Now building at the intersection of research and industry: leading AI research at Zhongguancun Academy of AI (中关村人工智能研究院), teaching large-model foundations at 中关村学院, and incubating the next wave of AI-native companies.