Search Results for author: Yang Yuan

Found 39 papers, 17 papers with code

CatCode: A Comprehensive Evaluation Framework for LLMs On the Mixture of Code and Text

no code implementations • 4 Mar 2024 • Zhenru Lin, Yiqun Yao, Yang Yuan

Large language models (LLMs) such as ChatGPT are increasingly proficient in understanding and generating a mixture of code and text.

Code Translation

Autonomous Data Selection with Language Models for Mathematical Texts

2 code implementations • 12 Feb 2024 • Yifan Zhang, Yifan Luo, Yang Yuan, Andrew Chi-Chih Yao

Our method showcases a 2 times increase in pretraining token efficiency compared to state-of-the-art baselines, underscoring the potential of our approach in enhancing models' mathematical reasoning capabilities.

Continual Pretraining • GSM8K • +3

17,459 stars

Meta Prompting for AI Systems

1 code implementation • 20 Nov 2023 • Yifan Zhang, Yang Yuan, Andrew Chi-Chih Yao

In this work, we present a comprehensive study of Meta Prompting (MP), an innovative technique reshaping the utilization of language models (LMs) and AI systems in problem-solving and data interaction.

Data Interaction • GSM8K • +2

MatChat: A Large Language Model and Application Service Platform for Materials Science

no code implementations • 11 Oct 2023 • Ziyi Chen, Fankai Xie, Meng Wan, Yang Yuan, Miao Liu, Zongguo Wang, Sheng Meng, Yangang Wang

The prediction of chemical synthesis pathways plays a pivotal role in materials science research.

Language Modelling • Large Language Model • +2

Information Flow in Self-Supervised Learning

2 code implementations • 29 Sep 2023 • Zhiquan Tan, Jingqin Yang, Weiran Huang, Yang Yuan, Yifan Zhang

In this paper, we provide a comprehensive toolbox for understanding and enhancing self-supervised learning (SSL) methods through the lens of matrix information theory.

Self-Supervised Learning

Cumulative Reasoning with Large Language Models

1 code implementation • 8 Aug 2023 • Yifan Zhang, Jingqin Yang, Yang Yuan, Andrew Chi-Chih Yao

We demonstrate CR's superiority through several complex reasoning tasks: it outperforms existing methods in logical inference tasks with up to a 9.3% improvement, achieving 98.04% accuracy on the curated FOLIO wiki dataset.

Ranked #3 on Math Word Problem Solving on MATH

Decision Making • Logical Reasoning • +2

242 stars

Matrix Information Theory for Self-Supervised Learning

3 code implementations • 27 May 2023 • Yifan Zhang, Zhiquan Tan, Jingqin Yang, Weiran Huang, Yang Yuan

Inspired by this framework, we introduce Matrix-SSL, a novel approach that leverages matrix information theory to interpret the maximum entropy encoding loss as matrix uniformity loss.

Ranked #1 on Contrastive Learning on imagenet-1k

Contrastive Learning • GSM8K • +4

RelationMatch: Matching In-batch Relationships for Semi-supervised Learning

1 code implementation • 17 May 2023 • Yifan Zhang, Jingqin Yang, Zhiquan Tan, Yang Yuan

Semi-supervised learning has achieved notable success by leveraging very few labeled data and exploiting the wealth of information derived from unlabeled data.

Ranked #1 on Semi-Supervised Image Classification on STL-10, 40 Labels

Semi-Supervised Image Classification

On Uni-Modal Feature Learning in Supervised Multi-Modal Learning

1 code implementation • 2 May 2023 • Chenzhuang Du, Jiaye Teng, Tingle Li, Yichen Liu, Tianyuan Yuan, Yue Wang, Yang Yuan, Hang Zhao

We abstract the features (i.e., learned representations) of multi-modal data into 1) uni-modal features, which can be learned from uni-modal training, and 2) paired features, which can only be learned from cross-modal interactions.

198 stars

Contrastive Learning Is Spectral Clustering On Similarity Graph

1 code implementation • 27 Mar 2023 • Zhiquan Tan, Yifan Zhang, Jingqin Yang, Yang Yuan

Contrastive learning is a powerful self-supervised learning method, but we have a limited theoretical understanding of how it works and why it works.

Clustering • Contrastive Learning • +1

A Categorical Framework of General Intelligence

no code implementations • 8 Mar 2023 • Yang Yuan

Can machines think?

Object

Succinct Representations for Concepts

no code implementations • 1 Mar 2023 • Yang Yuan

Foundation models like ChatGPT have demonstrated remarkable performance on various tasks.

Misconceptions

On the Power of Foundation Models

no code implementations • 29 Nov 2022 • Yang Yuan

The second one says fine tuning does not have this limit, as a foundation model with the minimum required power (up to symmetry) can theoretically solve downstream tasks for the category defined by pretext task, with fine tuning and enough resources.

Self-Supervised Learning

Predictive Inference with Feature Conformal Prediction

1 code implementation • 1 Oct 2022 • Jiaye Teng, Chuan Wen, Dinghuai Zhang, Yoshua Bengio, Yang Gao, Yang Yuan

Conformal prediction is a distribution-free technique for establishing valid prediction intervals.

Conformal Prediction • Image Segmentation • +5

Anomaly Detection with Test Time Augmentation and Consistency Evaluation

no code implementations • 6 Jun 2022 • Haowei He, Jiaye Teng, Yang Yuan

Deep neural networks are known to be vulnerable to unseen data: they may wrongly assign high confidence scores to out-of-distribution samples.

Anomaly Detection • Representation Learning

Modality Laziness: Everybody's Business is Nobody's Business

no code implementations • 29 Sep 2021 • Chenzhuang Du, Jiaye Teng, Tingle Li, Yichen Liu, Yue Wang, Yang Yuan, Hang Zhao

We name this problem of multi-modal training Modality Laziness.

Towards Understanding Generalization via Decomposing Excess Risk Dynamics

no code implementations • ICLR 2022 • Jiaye Teng, Jianhao Ma, Yang Yuan

Generalization is one of the fundamental issues in machine learning.

Generalization Bounds

T-SCI: A Two-Stage Conformal Inference Algorithm with Guaranteed Coverage for Cox-MLP

1 code implementation • 8 Mar 2021 • Jiaye Teng, Zeren Tan, Yang Yuan

It is challenging to deal with censored data, where we only have access to the incomplete information of survival time instead of its exact value.

Imbalance Robust Softmax for Deep Embeeding Learning

no code implementations • 23 Nov 2020 • Hao Zhu, Yang Yuan, Guosheng Hu, Xiang Wu, Neil Robertson

IR-Softmax can generalise to any softmax and its variants (which are discriminative for open-set problem) by directly setting the weights as their class centers, naturally solving the data imbalance problem.

Face Recognition • Person Re-Identification

Secure Data Sharing With Flow Model

1 code implementation • 24 Sep 2020 • Chenwei Wu, Chenzhuang Du, Yang Yuan

In the classical multi-party computation setting, multiple parties jointly compute a function without revealing their own input data.

BIG-bench Machine Learning • Image Classification • +1

Inject Machine Learning into Significance Test for Misspecified Linear Models

no code implementations • 4 Jun 2020 • Jiaye Teng, Yang Yuan

First, we apply a machine learning method to fit the ground truth function on the training set and calculate its linear approximation.

BIG-bench Machine Learning • regression

Adversarial Data Encryption

no code implementations • 10 Feb 2020 • Yingdong Hu, Liang Zhang, Wei Shan, Xiaoxiao Qin, Jing Qi, Zhenzhou Wu, Yang Yuan

In the big data era, many organizations face the dilemma of data sharing.

Adversarial Attack • BIG-bench Machine Learning

Learning-Based Low-Rank Approximations

no code implementations • NeurIPS 2019 • Piotr Indyk, Ali Vakilian, Yang Yuan

Our experiments show that, for multiple types of data sets, a learned sketch matrix can substantially reduce the approximation loss compared to a random matrix $S$, sometimes by one order of magnitude.

Generalization Bounds

Neural Embeddings for Nearest Neighbor Search Under Edit Distance

no code implementations • 25 Sep 2019 • Xiyuan Zhang, Yang Yuan, Piotr Indyk

The edit distance between two sequences is an important metric with many applications.
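
For readers unfamiliar with the metric, here is a minimal sketch of edit distance between two sequences. It assumes the standard Levenshtein definition (unit-cost insertions, deletions, and substitutions), which the snippet above does not spell out, and the function name `edit_distance` is hypothetical. It is not the paper's embedding method, only the exact quantity such embeddings aim to approximate.

```python
def edit_distance(a, b):
    """Levenshtein distance between sequences a and b (assumed unit costs)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            deletion = prev[j] + 1        # drop x from a
            insertion = curr[j - 1] + 1   # insert y into a
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]
```

The dynamic program takes time proportional to the product of the two lengths, which is one reason approximate nearest neighbor search under this metric is of interest.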

$\ell_1$ Adversarial Robustness Certificates: a Randomized Smoothing Approach

no code implementations • 25 Sep 2019 • Jiaye Teng, Guang-He Lee, Yang Yuan

Robustness is an important property to guarantee the security of machine learning models.

Adversarial Robustness

Tight Certificates of Adversarial Robustness for Randomly Smoothed Classifiers

1 code implementation • NeurIPS 2019 • Guang-He Lee, Yang Yuan, Shiyu Chang, Tommi S. Jaakkola

Specifically, an $\ell_2$ bounded adversary cannot alter the ensemble prediction generated by an additive isotropic Gaussian noise, where the radius for the adversary depends on both the variance of the distribution as well as the ensemble margin at the point of interest.

Adversarial Robustness

Asymmetric Valleys: Beyond Sharp and Flat Local Minima

1 code implementation • NeurIPS 2019 • Haowei He, Gao Huang, Yang Yuan

Specifically, at a local minimum there exist many asymmetric directions such that the loss increases abruptly along one side, and slowly along the opposite side--we formally define such minima as asymmetric valleys.

Expanding Holographic Embeddings for Knowledge Completion

no code implementations • NeurIPS 2018 • Yexiang Xue, Yang Yuan, Zhitian Xu, Ashish Sabharwal

Neural models operating over structured spaces such as knowledge graphs require a continuous embedding of the discrete elements of this space (such as entities) as well as the relationships between them.

Knowledge Graphs

Deep Multi-Task Learning to Recognise Subtle Facial Expressions of Mental States

no code implementations • ECCV 2018 • Guosheng Hu, Li Liu, Yang Yuan, Zehao Yu, Yang Hua, Zhihong Zhang, Fumin Shen, Ling Shao, Timothy Hospedales, Neil Robertson, Yongxin Yang

To advance subtle expression recognition, we contribute a Large-scale Subtle Emotions and Mental States in the Wild database (LSEMSW).

Deception Detection • Facial Expression Recognition • +4

An empirical study on evaluation metrics of generative adversarial networks

4 code implementations • ICLR 2018 • Qiantong Xu, Gao Huang, Yang Yuan, Chuan Guo, Yu Sun, Felix Wu, Kilian Weinberger

Evaluating generative adversarial networks (GANs) is inherently challenging.

365 stars

An Alternative View: When Does SGD Escape Local Minima?

no code implementations • ICML 2018 • Robert Kleinberg, Yuanzhi Li, Yang Yuan

Stochastic gradient descent (SGD) is widely used in machine learning.

Attribute-Enhanced Face Recognition With Neural Tensor Fusion Networks

no code implementations • ICCV 2017 • Guosheng Hu, Yang Hua, Yang Yuan, Zhihong Zhang, Zheng Lu, Sankha S. Mukherjee, Timothy M. Hospedales, Neil M. Robertson, Yongxin Yang

To solve this problem, we establish a theoretical equivalence between tensor optimisation and a two-stream gated neural network.

Attribute • Face Recognition

Hyperparameter Optimization: A Spectral Approach

1 code implementation • ICLR 2018 • Elad Hazan, Adam Klivans, Yang Yuan

In particular, we obtain the first quasi-polynomial time algorithm for learning noisy decision trees with polynomial sample complexity.

Bayesian Optimization • Hyperparameter Optimization

173 stars

Convergence Analysis of Two-layer Neural Networks with ReLU Activation

no code implementations • NeurIPS 2017 • Yuanzhi Li, Yang Yuan

We also show that the identity mapping is necessary for convergence, as it moves the initial point to a better place for optimization.

Vocal Bursts Valence Prediction

Exploiting the Structure: Stochastic Gradient Methods Using Raw Clusters

no code implementations • NeurIPS 2016 • Zeyuan Allen-Zhu, Yang Yuan, Karthik Sridharan

The amount of data available in the world is growing faster than our ability to deal with it.

BIG-bench Machine Learning • Clustering

Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling

no code implementations • 30 Dec 2015 • Zeyuan Allen-Zhu, Zheng Qu, Peter Richtárik, Yang Yuan

Accelerated coordinate descent is widely used in optimization due to its cheap per-iteration cost and scalability to large-scale problems.

Improved SVRG for Non-Strongly-Convex or Sum-of-Non-Convex Objectives

3 code implementations • 5 Jun 2015 • Zeyuan Allen-Zhu, Yang Yuan

Many classical algorithms are found only years later to outlive the confines in which they were conceived and to remain relevant in unforeseen settings.

regression

Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition

1 code implementation • 6 Mar 2015 • Rong Ge, Furong Huang, Chi Jin, Yang Yuan

To the best of our knowledge this is the first work that gives global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points.

Tensor Decomposition

Combinatorial Multi-Armed Bandit and Its Extension to Probabilistically Triggered Arms

no code implementations • 31 Jul 2014 • Wei Chen, Yajun Wang, Yang Yuan, Qinshi Wang

The objective of an online learning algorithm for CMAB is to minimize $(\alpha,\beta)$-approximation regret, which is the difference between the $\alpha\beta$ fraction of the expected reward when always playing the optimal super arm, and the expected reward of playing super arms according to the algorithm.
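
The verbal definition above can be written as a formula. This is a sketch under assumed notation that does not appear in the snippet: $T$ is the number of rounds, $\mathrm{opt}$ is the expected reward of the optimal super arm, $S_t$ is the super arm the algorithm plays in round $t$, and $r(S_t)$ is its expected reward.

$$
\mathrm{Reg}_{\alpha,\beta}(T) \;=\; T \cdot \alpha\beta \cdot \mathrm{opt} \;-\; \mathbb{E}\!\left[\sum_{t=1}^{T} r(S_t)\right]
$$

With $\alpha = \beta = 1$ this reduces to the usual regret against always playing the optimal super arm.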
