SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives

scikit-learn/scikit-learn NeurIPS 2014

In this work we introduce a new optimisation method called SAGA in the spirit of SAG, SDCA, MISO and SVRG, a set of recently proposed incremental gradient algorithms with fast linear convergence rates.
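A minimal NumPy sketch of the SAGA update for a smooth finite-sum objective (the proximal step for composite objectives is omitted); the function and variable names are illustrative, not taken from the paper or from scikit-learn's implementation.

```python
import numpy as np

def saga(grad_i, x0, n, step, iters, seed=0):
    """Minimal SAGA loop for a smooth finite-sum objective (no proximal term).

    grad_i(i, x) must return the gradient of the i-th component function at x.
    """
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    table = np.array([grad_i(i, x) for i in range(n)])  # stored past gradients
    avg = table.mean(axis=0)                            # running average of the table
    for _ in range(iters):
        j = rng.integers(n)
        g_new = grad_i(j, x)
        # Variance-reduced, unbiased gradient estimate: new gradient - stored gradient + table average
        x = x - step * (g_new - table[j] + avg)
        avg = avg + (g_new - table[j]) / n              # keep the average consistent with the table
        table[j] = g_new
    return x

# Example: least squares, f_i(x) = 0.5 * (a_i^T x - b_i)^2
rng = np.random.default_rng(1)
A = rng.normal(size=(200, 5))
b = A @ np.arange(5.0)
x_hat = saga(lambda i, x: A[i] * (A[i] @ x - b[i]),
             x0=np.zeros(5), n=200, step=0.01, iters=20000)
```

scikit-learn ships this method as the "saga" solver of its linear models (for example LogisticRegression(solver="saga")), which is presumably why the repository is listed alongside the paper.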

Training language models to follow instructions with human feedback

ggerganov/llama.cpp 4 Mar 2022

In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback.
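The fine-tuning recipe involves a reward model trained on human preference comparisons before the reinforcement learning stage. The sketch below shows only the pairwise preference loss such a reward model minimizes, in generic PyTorch; the tensor names and toy values are assumptions for illustration, not the paper's code.

```python
import torch
import torch.nn.functional as F

def pairwise_reward_loss(reward_chosen, reward_rejected):
    """Preference-comparison loss for a reward model: -log sigmoid(r_chosen - r_rejected).

    reward_chosen / reward_rejected are the scalar rewards the model assigns to the
    human-preferred and the dispreferred completion of the same prompt.
    """
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy usage with made-up reward scores for a batch of 4 comparisons
r_chosen = torch.tensor([1.2, 0.3, 0.9, 2.0])
r_rejected = torch.tensor([0.1, 0.5, -0.4, 1.1])
loss = pairwise_reward_loss(r_chosen, r_rejected)  # scalar tensor to backpropagate through
```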

Language Models are Few-Shot Learners

ggerganov/llama.cpp NeurIPS 2020

By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do.

Ranked #1 on Question Answering on CoQA (Overall metric)

Common Sense Reasoning, Coreference Resolution, +12 more
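Few-shot here means conditioning the model on a handful of demonstrations inside the prompt rather than updating any weights. A minimal sketch of assembling such a prompt, using plain Python string handling and no particular model API; the instruction and translation pairs are illustrative.

```python
def few_shot_prompt(examples, query, instruction="Translate English to French."):
    """Build an in-context prompt: an instruction, k demonstrations, then the query."""
    lines = [instruction, ""]
    for src, tgt in examples:
        lines += [f"English: {src}", f"French: {tgt}", ""]
    lines += [f"English: {query}", "French:"]
    return "\n".join(lines)

prompt = few_shot_prompt(
    examples=[("sea otter", "loutre de mer"), ("peppermint", "menthe poivrée")],
    query="cheese",
)
# Whatever the model generates after the final "French:" is taken as its answer;
# no gradient update is involved.
```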

Llama 2: Open Foundation and Fine-Tuned Chat Models

facebookresearch/llama 18 Jul 2023

In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.

Arithmetic Reasoning, +5 more
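A common way to run the released chat checkpoints is through the Hugging Face transformers integration. The sketch below assumes that integration, the accelerate package for device placement, and access to the gated meta-llama/Llama-2-7b-chat-hf weights, none of which is part of the paper itself.

```python
# Assumes `pip install transformers accelerate torch` and access to the gated meta-llama weights.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"   # 13B and 70B variants follow the same pattern
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Explain multi-query attention in one sentence.",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```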

GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints

facebookresearch/llama 22 May 2023

Multi-query attention (MQA), which only uses a single key-value head, drastically speeds up decoder inference.

Decoder, Language Modelling
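Grouped-query attention (GQA) sits between full multi-head attention (one key-value head per query head) and MQA (a single key-value head): query heads are partitioned into groups that share one key-value head, shrinking the KV cache during decoding. A minimal NumPy sketch of that sharing pattern; shapes and names are illustrative, not the paper's reference code.

```python
import numpy as np

def grouped_query_attention(q, k, v, n_groups):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d) with n_kv_heads == n_groups.

    Each contiguous group of n_q_heads // n_groups query heads attends against the
    same key-value head, which is what shrinks the KV cache during decoding.
    """
    n_q_heads, seq, d = q.shape
    group_size = n_q_heads // n_groups
    out = np.empty_like(q)
    for h in range(n_q_heads):
        g = h // group_size                           # shared KV head for this query head
        scores = q[h] @ k[g].T / np.sqrt(d)           # (seq, seq)
        scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)
        out[h] = weights @ v[g]
    return out

# 8 query heads sharing 2 KV heads; n_groups=1 recovers MQA, n_groups=8 recovers full MHA
rng = np.random.default_rng(0)
y = grouped_query_attention(q=rng.normal(size=(8, 16, 64)),
                            k=rng.normal(size=(2, 16, 64)),
                            v=rng.normal(size=(2, 16, 64)),
                            n_groups=2)
```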

Efficient Neural Audio Synthesis

CorentinJ/Real-Time-Voice-Cloning ICML 2018

The small number of weights in a Sparse WaveRNN makes it possible to sample high-fidelity audio on a mobile CPU in real time.

Audio Synthesis, Speech Synthesis, +1 more
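The small weight count comes from aggressively pruning the recurrent weight matrices during training. Below is a toy NumPy illustration of a single magnitude-based pruning step on one matrix; the 96% target sparsity is an assumed stand-in for the paper's pruning schedule, and this is not the WaveRNN sampling kernel itself.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries so ~`sparsity` fraction of them is zero."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return weights * (np.abs(weights) > threshold)

W = np.random.default_rng(0).normal(size=(512, 512))   # stand-in for one recurrent weight matrix
W_sparse = magnitude_prune(W, sparsity=0.96)           # assumed target sparsity
print(1.0 - np.count_nonzero(W_sparse) / W.size)       # achieved sparsity, ~0.96
```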

Generalized End-to-End Loss for Speaker Verification

CorentinJ/Real-Time-Voice-Cloning 28 Oct 2017

In this paper, we propose a new loss function called generalized end-to-end (GE2E) loss, which makes the training of speaker verification models more efficient than our previous tuple-based end-to-end (TE2E) loss function.

Domain Adaptation, Speaker Verification
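GE2E scores every utterance embedding against every speaker centroid and pushes each embedding toward its own speaker's centroid with a softmax over speakers. A simplified PyTorch sketch of that loss: it skips the paper's detail of excluding an utterance from its own centroid, and the scaling constants w and b (learned in the paper) are fixed here for illustration.

```python
import torch
import torch.nn.functional as F

def ge2e_softmax_loss(embeds, w=10.0, b=-5.0):
    """embeds: (n_speakers, n_utterances, dim) of L2-normalized speaker embeddings.

    Scaled cosine similarity of every utterance to every speaker centroid, followed
    by cross-entropy against the utterance's own speaker (softmax variant of GE2E).
    """
    n_spk, n_utt, dim = embeds.shape
    centroids = F.normalize(embeds.mean(dim=1), dim=-1)        # (n_spk, dim)
    flat = embeds.reshape(n_spk * n_utt, dim)                  # speaker-major ordering
    sim = w * flat @ centroids.T + b                           # (n_spk*n_utt, n_spk)
    targets = torch.arange(n_spk).repeat_interleave(n_utt)     # own-speaker labels
    return F.cross_entropy(sim, targets)

# Toy batch: 4 speakers x 5 utterances x 64-dim embeddings
embeds = F.normalize(torch.randn(4, 5, 64), dim=-1)
loss = ge2e_softmax_loss(embeds)
```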