Search Results for author: Weinan E

Found 73 papers, 18 papers with code

Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling

no code implementations • 1 Feb 2024 • Mingze Wang, Weinan E

We conduct a systematic study of the approximation properties of Transformer for sequence modeling with long, sparse and complicated memory.

Anchor function: a type of benchmark functions for studying language models

no code implementations • 16 Jan 2024 • Zhongwang Zhang, Zhiwei Wang, Junjie Yao, Zhangchen Zhou, Xiaolong Li, Weinan E, Zhi-Qin John Xu

However, language model research faces significant challenges, especially for academic research groups with constrained resources.

Language Modelling

Machine-Learned Invertible Coarse Graining for Multiscale Molecular Modeling

no code implementations • 2 May 2023 • Jun Zhang, Xiaohan Lin, Weinan E, Yi Qin Gao

Multiscale molecular modeling is widely applied in scientific research of molecular properties over large time and length scales.

MAC: A unified framework boosting low resource automatic speech recognition

no code implementations • 5 Feb 2023 • Zeping Min, Qian Ge, Zhong Li, Weinan E

Furthermore, in the ASR task, MAC beats wav2vec2 (with fine-tuning) on common voice datasets of Cantonese and gets really competitive results on common voice datasets of Taiwanese and Japanese.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

A multi-scale sampling method for accurate and robust deep neural network to predict combustion chemical kinetics

no code implementations • 9 Jan 2022 • Tianhan Zhang, Yuxiao Yi, Yifan Xu, Zhi X. Chen, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu

The current work aims to understand two basic questions regarding the deep neural network (DNN) method: what data the DNN needs and how general the DNN method can be.

A deep learning-based model reduction (DeePMR) method for simplifying chemical kinetics

no code implementations • 6 Jan 2022 • Zhiwei Wang, Yaoyu Zhang, Enhan Zhao, Yiguang Ju, Weinan E, Zhi-Qin John Xu, Tianhan Zhang

The mechanism reduction is modeled as an optimization problem on Boolean space, where a Boolean vector, each entry corresponding to a species, represents a reduced mechanism.

DeePN$^2$: A deep learning-based non-Newtonian hydrodynamic model

no code implementations • 29 Dec 2021 • Lidong Fang, Pei Ge, Lei Zhang, Weinan E, Huan Lei

A long-standing problem in the modeling of non-Newtonian hydrodynamics of polymeric flows is the availability of reliable and interpretable hydrodynamic models that faithfully encode the underlying micro-scale polymer dynamics.

DeepHAM: A Global Solution Method for Heterogeneous Agent Models with Aggregate Shocks

no code implementations • 29 Dec 2021 • Jiequn Han, Yucheng Yang, Weinan E

An efficient, reliable, and interpretable global solution method, the Deep learning-based algorithm for Heterogeneous Agent Models (DeepHAM), is proposed for solving high dimensional heterogeneous agent models with aggregate shocks.

Generalization Error of GAN from the Discriminator's Perspective

no code implementations • 8 Jul 2021 • Hongkang Yang, Weinan E

The generative adversarial network (GAN) is a well-known model for learning high-dimensional distributions, but the mechanism for its generalization ability is not understood.

Generative Adversarial Network Memorization

MOD-Net: A Machine Learning Approach via Model-Operator-Data Network for Solving PDEs

no code implementations • 8 Jul 2021 • Lulu Zhang, Tao Luo, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu, Zheng Ma

In this paper, we propose a machine learning approach via model-operator-data network (MOD-Net) for solving PDEs.

An $L^2$ Analysis of Reinforcement Learning in High Dimensions with Kernel and Neural Network Approximation

no code implementations • 15 Apr 2021 • Jihao Long, Jiequn Han, Weinan E

Reinforcement learning (RL) algorithms based on high-dimensional function approximation have achieved tremendous empirical success in large-scale problems with an enormous number of states.

Reinforcement Learning (RL)

The Phase Diagram of a Deep Potential Water Model

no code implementations • 9 Feb 2021 • Linfeng Zhang, Han Wang, Roberto Car, Weinan E

Using the Deep Potential methodology, we construct a model that reproduces accurately the potential energy surface of the SCAN approximation of density functional theory for water, from low temperature and pressure to about 2400 K and 50 GPa, excluding the vapor stability region.

Chemical Physics

On the emergence of simplex symmetry in the final and penultimate layers of neural network classifiers

no code implementations • 10 Dec 2020 • Weinan E, Stephan Wojtowytsch

A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer.

Some observations on high-dimensional partial differential equations with Barron data

no code implementations • 2 Dec 2020 • Weinan E, Stephan Wojtowytsch

We use explicit representation formulas to show that solutions to certain partial differential equations lie in Barron spaces or multilayer spaces if the PDE data lie in such function spaces.

Vocal Bursts Intensity Prediction

Generalization and Memorization: The Bias Potential Model

no code implementations • 29 Nov 2020 • Hongkang Yang, Weinan E

Models for learning probability distributions such as generative models and density estimators behave quite differently from models for learning functions.

Memorization

A deep learning-based ODE solver for chemical kinetics

no code implementations • 24 Nov 2020 • Tianhan Zhang, Yaoyu Zhang, Weinan E, Yiguang Ju

Besides, the ignition delay time differences are within 1%.

Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning

no code implementations • NeurIPS 2020 • Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, Weinan E

The result shows that (1) the escaping time of both SGD and ADAM depends on the Radon measure of the basin positively and the heaviness of gradient noise negatively; (2) for the same basin, SGD enjoys smaller escaping time than ADAM, mainly because (a) the geometry adaptation in ADAM via adaptively scaling each gradient coordinate well diminishes the anisotropic structure in gradient noise and results in larger Radon measure of a basin; (b) the exponential gradient average in ADAM smooths its gradient and leads to lighter gradient noise tails than SGD.

The Knowledge Graph for Macroeconomic Analysis with Alternative Big Data

no code implementations • 11 Oct 2020 • Yucheng Yang, Yue Pang, Guanhua Huang, Weinan E

The current knowledge system of macroeconomics is built on interactions among a small number of variables, since traditional macroeconomic models can mostly handle a handful of inputs.

Variable Selection

Interpretable Neural Networks for Panel Data Analysis in Economics

no code implementations • 11 Oct 2020 • Yucheng Yang, Zhong Zheng, Weinan E

In this paper, we propose a class of interpretable neural network models that can achieve both high prediction accuracy and interpretability.

Time Series Time Series Analysis

A priori estimates for classification problems using neural networks

no code implementations • 28 Sep 2020 • Weinan E, Stephan Wojtowytsch

We consider binary and multi-class classification problems using hypothesis classes of neural networks.

Classification General Classification +1

Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't

no code implementations • 22 Sep 2020 • Weinan E, Chao Ma, Stephan Wojtowytsch, Lei Wu

The purpose of this article is to review the achievements made in the last few years towards the understanding of the reasons behind the success and subtleties of neural network-based machine learning.

On the Curse of Memory in Recurrent Neural Networks: Approximation and Optimization Analysis

no code implementations • ICLR 2021 • Zhong Li, Jiequn Han, Weinan E, Qianxiao Li

We study the approximation properties and optimization dynamics of recurrent neural networks (RNNs) when applied to learn input-output relationships in temporal data.

A Qualitative Study of the Dynamic Behavior for Adaptive Gradient Algorithms

no code implementations • 14 Sep 2020 • Chao Ma, Lei Wu, Weinan E

The dynamic behavior of RMSprop and Adam algorithms is studied through a combination of careful numerical experiments and theoretical explanations.

OnsagerNet: Learning Stable and Interpretable Dynamics using a Generalized Onsager Principle

1 code implementation • 6 Sep 2020 • Haijun Yu, Xinyuan Tian, Weinan E, Qianxiao Li

We further apply this method to study Rayleigh-Benard convection and learn Lorenz-like low dimensional autonomous reduced order models that capture both qualitative and quantitative properties of the underlying dynamics.

The Slow Deterioration of the Generalization Error of the Random Feature Model

no code implementations • 13 Aug 2020 • Chao Ma, Lei Wu, Weinan E

The random feature model exhibits a kind of resonance behavior when the number of parameters is close to the training sample size.

On the Banach spaces associated with multi-layer ReLU networks: Function representation, approximation theory and gradient descent dynamics

no code implementations • 30 Jul 2020 • Weinan E, Stephan Wojtowytsch

The key to this work is a new way of representing functions in some form of expectations, motivated by multi-layer neural networks.

Coarse-grained spectral projection (CGSP): a deep learning-assisted approach to quantum unitary dynamics

1 code implementation • 19 Jul 2020 • Pinchen Xie, Weinan E

We propose the coarse-grained spectral projection method (CGSP), a deep learning-assisted approach for tackling quantum unitary dynamic problems with an emphasis on quench dynamics.

The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models

1 code implementation • 25 Jun 2020 • Chao Ma, Lei Wu, Weinan E

A numerical and phenomenological study of the gradient descent (GD) algorithm for training two-layer neural network models is carried out for different parameter regimes when the target function can be accurately approximated by a relatively small number of neurons.

Representation formulas and pointwise properties for Barron functions

no code implementations • 10 Jun 2020 • Weinan E, Stephan Wojtowytsch

We study the natural function space for infinitely wide two-layer neural networks with ReLU activation (Barron space) and establish different representation formulae.

Deep Potential generation scheme and simulation protocol for the Li10GeP2S12-type superionic conductors

no code implementations • 5 Jun 2020 • Jianxing Huang, Linfeng Zhang, Han Wang, Jinbao Zhao, Jun Cheng, Weinan E

It has been a challenge to accurately simulate Li-ion diffusion processes in battery materials at room temperature using ab initio molecular dynamics (AIMD) due to its high computational cost.

Computational Physics Materials Science Chemical Physics

Integrating Machine Learning with Physics-Based Modeling

no code implementations • 4 Jun 2020 • Weinan E, Jiequn Han, Linfeng Zhang

Machine learning is poised as a very powerful tool that can drastically improve our ability to carry out scientific research.

BIG-bench Machine Learning

Kolmogorov Width Decay and Poor Approximators in Machine Learning: Shallow Neural Networks, Random Feature Models and Neural Tangent Kernels

no code implementations • 21 May 2020 • Weinan E, Stephan Wojtowytsch

We establish a scale separation of Kolmogorov width type between subspaces of a given Banach space under the condition that a sequence of linear maps converges much faster on one of the subspaces.

Can Shallow Neural Networks Beat the Curse of Dimensionality? A mean field training perspective

no code implementations • 21 May 2020 • Stephan Wojtowytsch, Weinan E

Thus gradient descent training for fitting reasonably smooth, but truly high-dimensional data may be subject to the curse of dimensionality.

Pushing the limit of molecular dynamics with ab initio accuracy to 100 million atoms with machine learning

1 code implementation • 1 May 2020 • Weile Jia, Han Wang, Mohan Chen, Denghui Lu, Lin Lin, Roberto Car, Weinan E, Linfeng Zhang

For 35 years, ab initio molecular dynamics (AIMD) has been the method of choice for modeling complex atomistic phenomena from first principles.

Computational Physics

1,364

Machine learning based non-Newtonian fluid model with molecular fidelity

no code implementations • 7 Mar 2020 • Huan Lei, Lei Wu, Weinan E

We introduce a machine-learning-based framework for constructing a continuum non-Newtonian fluid dynamics model directly from a micro-scale description.

BIG-bench Machine Learning

Machine Learning from a Continuous Viewpoint

no code implementations • 30 Dec 2019 • Weinan E, Chao Ma, Lei Wu

We demonstrate that conventional machine learning models and algorithms, such as the random feature model, the two-layer neural network model and the residual neural network model, can all be recovered (in a scaled form) as particular discretizations of different continuous formulations.

BIG-bench Machine Learning

The Generalization Error of the Minimum-norm Solutions for Over-parameterized Neural Networks

no code implementations • 15 Dec 2019 • Weinan E, Chao Ma, Lei Wu

We study the generalization properties of minimum-norm solutions for three over-parametrized machine learning models including the random feature model, the two-layer neural network model and the residual network model.

BIG-bench Machine Learning

DP-GEN: A concurrent learning platform for the generation of reliable deep learning based potential energy models

1 code implementation • 28 Oct 2019 • Yuzhi Zhang, Haidi Wang, WeiJie Chen, Jinzhe Zeng, Linfeng Zhang, Han Wang, Weinan E

Materials 3, 023804] and is capable of generating uniformly accurate deep learning based PES models in a way that minimizes human intervention and the computational cost for data generation and model training.

Computational Physics

275

A mathematical model for universal semantics

1 code implementation • 29 Jul 2019 • Weinan E, Yajun Zhou

We characterize the meaning of words with language-independent numerical fingerprints, through a mathematical analysis of recurring patterns in texts.

Question Answering Translation +1

Deep neural network for Wannier function centers

1 code implementation • 27 Jun 2019 • Linfeng Zhang, Mohan Chen, Xifan Wu, Han Wang, Weinan E, Roberto Car

We introduce a deep neural network (DNN) model that assigns the position of the centers of the electronic charge in each atomic configuration on a molecular dynamics trajectory.

Computational Physics Materials Science Chemical Physics

The Barron Space and the Flow-induced Function Spaces for Neural Network Models

no code implementations • 18 Jun 2019 • Weinan E, Chao Ma, Lei Wu

We define the Barron space and show that it is the right space for two-layer neural network models in the sense that optimal direct and inverse approximation theorems hold for functions in the Barron space.

BIG-bench Machine Learning

Monge-Ampère Flow for Generative Modeling

no code implementations • ICLR 2019 • Linfeng Zhang, Weinan E, Lei Wang

We present a deep generative model, named Monge-Ampère flow, which builds on continuous-time gradient flow arising from the Monge-Ampère equation in optimal transport theory.

Density Estimation

A Priori Estimates of the Generalization Error for Two-layer Neural Networks

no code implementations • ICLR 2019 • Lei Wu, Chao Ma, Weinan E

These new estimates are a priori in nature in the sense that the bounds depend only on some norms of the underlying functions to be fitted, not the parameters in the model.

Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections

no code implementations • 10 Apr 2019 • Weinan E, Chao Ma, Qingcan Wang, Lei Wu

In addition, it is also shown that the GD path is uniformly close to the functions given by the related random feature model.

A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics

no code implementations • 8 Apr 2019 • Weinan E, Chao Ma, Lei Wu

In the over-parametrized regime, it is shown that gradient descent dynamics can achieve zero training loss exponentially fast regardless of the quality of the labels.

A Priori Estimates of the Population Risk for Residual Networks

no code implementations • 6 Mar 2019 • Weinan E, Chao Ma, Qingcan Wang

An important part of the regularized model is the usage of a new path norm, called the weighted path norm, as the regularization term.

How SGD Selects the Global Minima in Over-parameterized Learning: A Dynamical Stability Perspective

1 code implementation • NeurIPS 2018 • Lei Wu, Chao Ma, Weinan E

The question of which global minima are accessible by a stochastic gradient descent (SGD) algorithm with specific learning rate and batch size is studied from the perspective of dynamical stability.

Stochastic Modified Equations and Dynamics of Stochastic Gradient Algorithms I: Mathematical Foundations

no code implementations • 5 Nov 2018 • Qianxiao Li, Cheng Tai, Weinan E

We develop the mathematical foundations of the stochastic modified equations (SME) framework for analyzing the dynamics of stochastic gradient algorithms, where the latter is approximated by a class of stochastic differential equations with small noise parameters.

Active Learning of Uniformly Accurate Inter-atomic Potentials for Materials Simulation

no code implementations • 28 Oct 2018 • Linfeng Zhang, De-Ye Lin, Han Wang, Roberto Car, Weinan E

An active learning procedure called Deep Potential Generator (DP-GEN) is proposed for the construction of accurate and transferable machine learning-based models of the potential energy surface (PES) for the molecular modeling of materials.

Active Learning BIG-bench Machine Learning

A Priori Estimates of the Population Risk for Two-layer Neural Networks

no code implementations • ICLR 2019 • Weinan E, Chao Ma, Lei Wu

New estimates for the population risk are established for two-layer neural networks.

Monge-Ampère Flow for Generative Modeling

1 code implementation • 26 Sep 2018 • Linfeng Zhang, Weinan E, Lei Wang

We present a deep generative model, named Monge-Ampère flow, which builds on continuous-time gradient flow arising from the Monge-Ampère equation in optimal transport theory.

Density Estimation

Model Reduction with Memory and the Machine Learning of Dynamical Systems

no code implementations • 10 Aug 2018 • Chao Ma, Jianchun Wang, Weinan E

The well-known Mori-Zwanzig theory tells us that model reduction leads to memory effect.

BIG-bench Machine Learning

Solving Many-Electron Schrödinger Equation Using Deep Neural Networks

no code implementations • 18 Jul 2018 • Jiequn Han, Linfeng Zhang, Weinan E

We introduce a new family of trial wave-functions based on deep neural networks to solve the many-electron Schrödinger equation.

Computational Physics Chemical Physics

A Mean-Field Optimal Control Formulation of Deep Learning

no code implementations • 3 Jul 2018 • Weinan E, Jiequn Han, Qianxiao Li

This paper introduces the mathematical formulation of the population risk minimization problem in deep learning as a mean-field optimal control problem.

Exponential Convergence of the Deep Neural Network Approximation for Analytic Functions

no code implementations • 1 Jul 2018 • Weinan E, Qingcan Wang

We prove that for analytic functions in low dimension, the convergence rate of the deep neural network approximation is exponential.

End-to-end Symmetry Preserving Inter-atomic Potential Energy Model for Finite and Extended Systems

1 code implementation • NeurIPS 2018 • Linfeng Zhang, Jiequn Han, Han Wang, Wissam A. Saidi, Roberto Car, Weinan E

Machine learning models are changing the paradigm of molecular modeling, which is a fundamental tool for material science, chemistry, and computational biology.

Computational Physics Materials Science Chemical Physics

1,364

Understanding and Enhancing the Transferability of Adversarial Examples

no code implementations • 27 Feb 2018 • Lei Wu, Zhanxing Zhu, Cheng Tai, Weinan E

State-of-the-art deep neural networks are known to be vulnerable to adversarial examples, formed by applying small but malicious perturbations to the original inputs.

Enhancing the Transferability of Adversarial Examples with Noise Reduced Gradient

no code implementations • ICLR 2018 • Lei Wu, Zhanxing Zhu, Cheng Tai, Weinan E

Deep neural networks provide state-of-the-art performance for many applications of interest.

DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics

2 code implementations • 11 Dec 2017 • Han Wang, Linfeng Zhang, Jiequn Han, Weinan E

Here we describe DeePMD-kit, a package written in Python/C++ that has been designed to minimize the effort required to build deep learning based representation of potential energy and force field and to perform molecular dynamics.

1,364

Reinforced dynamics for enhanced sampling in large atomic and molecular systems

no code implementations • 10 Dec 2017 • Linfeng Zhang, Han Wang, Weinan E

Like metadynamics, it allows for an efficient exploration of the configuration space by adding an adaptively computed biasing potential to the original dynamics.

Efficient Exploration reinforcement-learning +1

Maximum Principle Based Algorithms for Deep Learning

2 code implementations • 26 Oct 2017 • Qianxiao Li, Long Chen, Cheng Tai, Weinan E

The continuous dynamical system approach to deep learning is explored in order to devise alternative frameworks for training algorithms.

The Deep Ritz method: A deep learning-based numerical algorithm for solving variational problems

1 code implementation • 30 Sep 2017 • Weinan E, Bing Yu

We propose a deep learning based method, the Deep Ritz Method, for numerically solving variational problems, particularly the ones that arise from partial differential equations.

Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations

no code implementations • 18 Sep 2017 • Christian Beck, Weinan E, Arnulf Jentzen

The PDEs in such applications are high-dimensional as the dimension corresponds to the number of financial assets in a portfolio.

Portfolio Optimization

Deep Potential Molecular Dynamics: a scalable model with the accuracy of quantum mechanics

5 code implementations • 30 Jul 2017 • Linfeng Zhang, Jiequn Han, Han Wang, Roberto Car, Weinan E

We introduce a scheme for molecular simulations, the Deep Potential Molecular Dynamics (DeePMD) method, based on a many-body potential and interatomic forces generated by a carefully crafted deep neural network trained with ab initio data.

334

Solving high-dimensional partial differential equations using deep learning

6 code implementations • 9 Jul 2017 • Jiequn Han, Arnulf Jentzen, Weinan E

Developing algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the notoriously difficult problem known as the "curse of dimensionality".

Vocal Bursts Intensity Prediction

334

Deep Potential: a general representation of a many-body potential energy surface

1 code implementation • 5 Jul 2017 • Jiequn Han, Linfeng Zhang, Roberto Car, Weinan E

When tested on a wide variety of examples, Deep Potential is able to reproduce the original model, whether empirical or quantum mechanics based, within chemical accuracy.

Computational Physics

1,364

Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes

no code implementations • 30 Jun 2017 • Lei Wu, Zhanxing Zhu, Weinan E

It is widely observed that deep learning models with learned parameters generalize well, even with much more model parameters than the number of training samples.

Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations

5 code implementations • 15 Jun 2017 • Weinan E, Jiequn Han, Arnulf Jentzen

We propose a new algorithm for solving parabolic partial differential equations (PDEs) and backward stochastic differential equations (BSDEs) in high dimension, by making an analogy between the BSDE and reinforcement learning with the gradient of the solution playing the role of the policy function, and the loss function given by the error between the prescribed terminal condition and the solution of the BSDE.

reinforcement-learning Reinforcement Learning (RL)

245

Deep Learning Approximation for Stochastic Control Problems

no code implementations • 2 Nov 2016 • Jiequn Han, Weinan E

Many real world stochastic control problems suffer from the "curse of dimensionality".

Stochastic modified equations and adaptive stochastic gradient algorithms

no code implementations • ICML 2017 • Qianxiao Li, Cheng Tai, Weinan E

We develop the method of stochastic modified equations (SME), in which stochastic gradient algorithms are approximated in the weak sense by continuous-time stochastic differential equations.