Search Results for author: Peiyuan Zhang

Found 9 papers, 4 papers with code

TinyLlama: An Open-Source Small Language Model

2 code implementations • 4 Jan 2024 • Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu

We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs.

Computational Efficiency • Language Modelling
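
For readers who want to try the model, here is a minimal sketch of loading a TinyLlama checkpoint with Hugging Face transformers; the repository id below is an assumption about the published checkpoint name, not something taken from the paper.

```python
# Minimal sketch: loading a TinyLlama checkpoint with Hugging Face transformers.
# The repository id is an assumption about the published checkpoint name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("TinyLlama is a compact 1.1B language model that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```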

OtterHD: A High-Resolution Multi-modality Model

1 code implementation • 7 Nov 2023 • Bo Li, Peiyuan Zhang, Jingkang Yang, Yuanhan Zhang, Fanyi Pu, Ziwei Liu

In this paper, we present OtterHD-8B, an innovative multimodal model evolved from Fuyu-8B, specifically engineered to interpret high-resolution visual inputs with granular precision.

Visual Question Answering
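
OtterHD inherits Fuyu-8B's design of feeding image patches directly to the decoder rather than routing them through a fixed-resolution vision encoder, which is what lets it handle high-resolution inputs. A minimal sketch of that patchification step follows; patch size 30 is, to our knowledge, what Fuyu-8B uses, and everything else is illustrative.

```python
# Minimal sketch of Fuyu-style patchification: the image is cut into
# fixed-size patches and each patch is flattened into one input "token",
# so the sequence length grows with resolution instead of downscaling.
# Patch size 30 is assumed from Fuyu-8B; the rest is illustrative.
import torch

def patchify(image: torch.Tensor, patch: int = 30) -> torch.Tensor:
    """image: (C, H, W) -> (num_patches, C * patch * patch)."""
    c, h, w = image.shape
    patches = image.unfold(1, patch, patch).unfold(2, patch, patch)  # (C, H//p, W//p, p, p)
    return patches.permute(1, 2, 0, 3, 4).reshape(-1, c * patch * patch)

tokens = patchify(torch.rand(3, 1080, 1920))  # high-resolution input
print(tokens.shape)  # (2304, 2700): more pixels -> more tokens, no resizing
```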

One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning

1 code implementation • 28 May 2023 • Guangtao Zeng, Peiyuan Zhang, Wei Lu

Fine-tuning pre-trained language models for multiple tasks tends to be expensive in terms of storage.

Transfer Learning
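
The storage saving comes from sharing one dense parameter block across tasks and keeping only a cheap binary mask per task. A minimal sketch of that idea, with illustrative names and shapes rather than the paper's exact architecture:

```python
# Minimal sketch of the shared-weights / per-task binary-mask idea:
# all tasks reuse one dense parameter tensor, and each task stores only
# a binary mask over it, far cheaper than a full fine-tuned copy.
# Names and shapes are illustrative, not the paper's exact architecture.
import torch

shared = torch.randn(768, 768)                           # one shared parameter block
masks = {t: (torch.rand(768, 768) > 0.5) for t in ["sst2", "mnli"]}

def task_weight(task: str) -> torch.Tensor:
    # Effective weight for a task: shared parameters gated by its mask.
    return shared * masks[task]

# Storage comparison: one bit per parameter vs. a 32-bit float copy per task.
print(f"per-task cost: {shared.numel()} bits vs {shared.numel() * 32} bits for a full copy")
```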

Lower Generalization Bounds for GD and SGD in Smooth Stochastic Convex Optimization

no code implementations • 19 Mar 2023 • Peiyuan Zhang, Jiaye Teng, Jingzhao Zhang

Our paper examines this observation by providing excess risk lower bounds for GD and SGD in two realizable settings: (1) $\eta T = O(n)$, and (2) $\eta T = \Omega(n)$, where $n$ is the size of the dataset.

Generalization Bounds • Learning Theory
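
For readers outside learning theory, the quantity being lower-bounded is the standard excess population risk; a sketch of the setup in standard notation (not the paper's exact statement):

```latex
% Setup sketch in standard notation (not the paper's exact statement):
% \hat{x} is the output of GD/SGD run for T steps with step size \eta
% on n samples drawn from a distribution \mathcal{D}.
\[
  F(x) = \mathbb{E}_{z \sim \mathcal{D}}\big[ f(x; z) \big],
  \qquad
  \operatorname{ExcessRisk}(\hat{x}) = \mathbb{E}\big[ F(\hat{x}) \big] - \min_{x} F(x).
\]
% The two regimes separated by the lower bounds are
% \eta T = O(n) and \eta T = \Omega(n).
```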

Better Few-Shot Relation Extraction with Label Prompt Dropout

1 code implementation • 25 Oct 2022 • Peiyuan Zhang, Wei Lu

Our experiments show that our approach leads to improved class representations, yielding significantly better results on the few-shot relation extraction task.

Few-Shot Learning • Relation Extraction
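
The core trick is simple: the textual label description (the "label prompt") attached to each support instance is randomly dropped during training, so the model cannot lean on it and must learn stronger class representations. A minimal sketch, with the dropout probability and prompt format as illustrative assumptions:

```python
# Minimal sketch of label prompt dropout: during training, the label
# description prepended to a support instance is randomly removed.
# The probability and formatting below are illustrative assumptions,
# not the paper's exact setup.
import random

def build_support_text(label_desc: str, instance: str, p_drop: float = 0.5,
                       training: bool = True) -> str:
    if training and random.random() < p_drop:
        return instance                      # label prompt dropped
    return f"{label_desc} : {instance}"      # label prompt kept

random.seed(0)
print(build_support_text("founder of", "Steve Jobs co-founded Apple."))
```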

Sion's Minimax Theorem in Geodesic Metric Spaces and a Riemannian Extragradient Algorithm

no code implementations • 13 Feb 2022 • Peiyuan Zhang, Jingzhao Zhang, Suvrit Sra

Deciding whether saddle points exist or are approximable for nonconvex-nonconcave problems is usually intractable.
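
The algorithm proposed is a Riemannian extragradient method; as a point of reference, here is a minimal Euclidean extragradient sketch for a smooth minimax problem min_x max_y f(x, y). The Riemannian version replaces these flat-space updates with exponential-map steps along geodesics.

```python
# Minimal Euclidean extragradient sketch for min_x max_y f(x, y),
# shown only to fix the two-step structure of the method; the paper's
# algorithm is the Riemannian analogue on geodesic metric spaces.
import numpy as np

def extragradient(grad_x, grad_y, x, y, eta=0.1, steps=1000):
    for _ in range(steps):
        # Extrapolation (half) step from the current point.
        x_half = x - eta * grad_x(x, y)
        y_half = y + eta * grad_y(x, y)
        # Update step using gradients at the extrapolated point.
        x = x - eta * grad_x(x_half, y_half)
        y = y + eta * grad_y(x_half, y_half)
    return x, y

# Bilinear saddle f(x, y) = x * y with saddle point (0, 0); plain
# gradient descent-ascent spirals outward here, extragradient converges.
x, y = extragradient(lambda x, y: y, lambda x, y: x, 1.0, 1.0)
print(x, y)  # both close to 0
```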

Rethinking the Variational Interpretation of Accelerated Optimization Methods

no code implementations • NeurIPS 2021 • Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand

The continuous-time model of Nesterov's momentum provides a thought-provoking perspective for understanding the nature of the acceleration phenomenon in convex optimization.
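
For reference, the continuous-time model in question is the second-order ODE of Su, Boyd, and Candès, which Nesterov's accelerated method discretizes:

```latex
% Continuous-time model of Nesterov's momentum (Su, Boyd, Candes):
% for a convex objective f, the method discretizes the second-order ODE
\[
  \ddot{X}(t) + \frac{3}{t}\,\dot{X}(t) + \nabla f\big(X(t)\big) = 0,
\]
% whose trajectories achieve the accelerated rate
\[
  f\big(X(t)\big) - f(x^{\star}) = O\!\big(t^{-2}\big).
\]
```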

Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization

no code implementations • 23 Feb 2021 • Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand, Thomas Hofmann, Roy Smith

Viewing optimization methods as numerical integrators for ordinary differential equations (ODEs) provides a thought-provoking modern framework for studying accelerated first-order optimizers.

Numerical Integration
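
The canonical instance of this viewpoint: gradient descent is exactly the explicit Euler discretization of gradient flow.

```latex
% Gradient descent as explicit (forward) Euler integration of gradient
% flow: discretizing with step size \eta recovers the familiar update.
\[
  \dot{X}(t) = -\nabla f\big(X(t)\big)
  \quad\longrightarrow\quad
  x_{k+1} = x_k - \eta\,\nabla f(x_k).
\]
```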

Mixing of Stochastic Accelerated Gradient Descent

no code implementations • 31 Oct 2019 • Peiyuan Zhang, Hadi Daneshmand, Thomas Hofmann

We study the mixing properties for stochastic accelerated gradient descent (SAGD) on least-squares regression.

Stochastic Optimization
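
As a concrete reference, here is a minimal sketch of stochastic accelerated gradient descent on least-squares regression: Nesterov-style momentum driven by single-sample gradients. The step size and momentum constant are illustrative choices, not the schedule the paper analyzes.

```python
# Minimal sketch of stochastic accelerated gradient descent on
# least-squares regression: Nesterov-style momentum with single-sample
# stochastic gradients. Constants are illustrative, not the paper's.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star + 0.1 * rng.normal(size=n)   # noisy linear observations

x_prev = x = np.zeros(d)
eta, beta = 0.001, 0.9                       # illustrative step size / momentum

for _ in range(20000):
    i = rng.integers(n)                      # one random sample per step
    y = x + beta * (x - x_prev)              # momentum lookahead point
    g = (A[i] @ y - b[i]) * A[i]             # stochastic gradient at y
    x_prev, x = x, y - eta * g               # accelerated update

print(np.linalg.norm(x - x_star))            # small residual near the noise floor
```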
