Search Results for author: Dingli Yu

Found 9 papers, 4 papers with code

Skill-Mix: a Flexible and Expandable Family of Evaluations for AI models

no code implementations26 Oct 2023 Dingli Yu, Simran Kaur, Arushi Gupta, Jonah Brown-Cohen, Anirudh Goyal, Sanjeev Arora

The paper develops a methodology for (a) designing and administering such an evaluation, and (b) automatic grading (plus spot-checking by humans) of the results using GPT-4 as well as the open LLaMA-2 70B model.

Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks

no code implementations3 Oct 2023 Greg Yang, Dingli Yu, Chen Zhu, Soufiane Hayou

By classifying infinite-width neural networks and identifying the *optimal* limit, Tensor Programs IV and V demonstrated a universal way, called $\mu$P, for *widthwise hyperparameter transfer*, i.e., predicting optimal hyperparameters of wide neural networks from narrow ones.
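A minimal sketch of the transfer idea: one commonly cited $\mu$P rule (for Adam-style optimizers) scales hidden-layer learning rates like 1/width, so a rate tuned on a narrow proxy model can be rescaled for a wider one. This is a simplification for illustration, not the full $\mu$P parametrization.

```python
# Sketch of widthwise hyperparameter transfer in the spirit of muP.
# Assumption (simplified): for Adam, the hidden-layer learning rate
# scales like 1/width, so an LR tuned at a narrow base width can be
# rescaled for a wider model instead of re-tuned from scratch.

def transfer_lr(base_lr: float, base_width: int, target_width: int) -> float:
    """Rescale a hidden-layer learning rate from base_width to target_width."""
    return base_lr * base_width / target_width

# LR tuned on a narrow proxy model...
lr_narrow = 1e-2
# ...transferred to a 16x wider model (becomes 16x smaller):
lr_wide = transfer_lr(lr_narrow, base_width=256, target_width=4096)
print(lr_wide)
```

The appeal is practical: hyperparameter search runs on the cheap narrow model, and the result carries over to the expensive wide one.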

New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound

1 code implementation5 Nov 2022 Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora

Saliency methods compute heat maps that highlight portions of an input that were most *important* for the label assigned to it by a deep net.
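To make the setting concrete, here is a toy input-times-gradient saliency map, one of the standard methods this line of work evaluates. The linear "net" and all names are illustrative assumptions, not anything from the paper itself.

```python
import numpy as np

# Toy gradient-based saliency map (illustrative only; the paper is about
# *evaluating* such methods, not about this particular one).
# For a linear scorer f(x) = w . x, the gradient w.r.t. the input is w,
# so the heat map |x * grad| highlights coordinates that drove the score.

rng = np.random.default_rng(0)
w = rng.normal(size=8)        # weights of a toy linear "net"
x = rng.normal(size=8)        # one input
grad = w                      # d f / d x for a linear model
saliency = np.abs(x * grad)   # input-times-gradient heat map
print(saliency.argmax())      # index of the most "important" coordinate
```

For a real deep net the gradient comes from backpropagation rather than a closed form, but the resulting heat map is interpreted the same way.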

A Kernel-Based View of Language Model Fine-Tuning

1 code implementation11 Oct 2022 Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora

It has become standard to solve NLP tasks by fine-tuning pre-trained language models (LMs), especially in low-data settings.

Language Modelling

New Definitions and Evaluations for Saliency Methods: Staying Intrinsic and Sound

no code implementations29 Sep 2021 Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora

Saliency methods seek to provide human-interpretable explanations for the output of a machine learning model on a given input.

Enhanced Convolutional Neural Tangent Kernels

no code implementations3 Nov 2019 Zhiyuan Li, Ruosong Wang, Dingli Yu, Simon S. Du, Wei Hu, Ruslan Salakhutdinov, Sanjeev Arora

An exact algorithm to compute the CNTK (Arora et al., 2019) yielded the finding that the classification accuracy of the CNTK on CIFAR-10 is within 6-7% of that of the corresponding CNN architecture (the best figure being around 78%), which is an interesting level of performance for a fixed kernel.

Data Augmentation regression

Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks

4 code implementations ICLR 2020 Sanjeev Arora, Simon S. Du, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, Dingli Yu

On the VOC07 testbed for few-shot image classification tasks on ImageNet with transfer learning (Goyal et al., 2019), replacing the linear SVM currently used with a Convolutional NTK SVM consistently improves performance.
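The swap amounts to replacing a linear classifier with a kernel method built on a precomputed Gram matrix. A hedged sketch of that pattern, with an RBF kernel and kernel ridge regression standing in for the CNTK (computing the actual CNTK is much more involved), on a tiny made-up 2-class problem:

```python
import numpy as np

# Sketch of few-shot classification with a precomputed kernel, in the
# spirit of swapping a linear classifier for an NTK-based kernel method.
# An RBF kernel stands in for the CNTK purely for illustration.

def rbf_kernel(A, B, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_ridge_fit_predict(X_train, y_train, X_test, reg=1e-3):
    """Kernel ridge regression: alpha = (K + reg*I)^-1 y; preds = K_test @ alpha."""
    K = rbf_kernel(X_train, X_train)
    alpha = np.linalg.solve(K + reg * np.eye(len(X_train)), y_train)
    return rbf_kernel(X_test, X_train) @ alpha

# Tiny toy problem: two well-separated clusters, labels +1 / -1.
X_train = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]])
y_train = np.array([1.0, 1.0, -1.0, -1.0])
X_test = np.array([[0.05, 0.0], [0.95, 1.0]])
preds = np.sign(kernel_ridge_fit_predict(X_train, y_train, X_test))
print(preds)  # both test points land with their nearby cluster
```

In the few-shot setting this structure is attractive because the kernel solver has no iterative training and only `len(X_train)` coefficients to fit.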

Few-Shot Image Classification General Classification +3

Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee

no code implementations ICLR 2020 Wei Hu, Zhiyuan Li, Dingli Yu

Over-parameterized deep neural networks trained by simple first-order methods are known to be able to fit any labeling of data.
