Large Language Model

808 papers with code • 0 benchmarks • 2 datasets

This task has no description! Would you like to contribute one?

Libraries

Use these libraries to find Large Language Model models and implementations

Most implemented papers

Generative Agents: Interactive Simulacra of Human Behavior

joonspk-research/generative_agents 7 Apr 2023

Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools.

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

salesforce/CodeGen 25 Mar 2022

To democratize this, we train and release a family of large language models up to 16. 1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER.

Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data

project-baize/baize-chatbot 3 Apr 2023

Furthermore, we propose a new technique called Self-Distill with Feedback, to further improve the performance of the Baize models with feedback from ChatGPT.

MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models

vision-cair/minigpt-4 20 Apr 2023

Our work, for the first time, uncovers that properly aligning the visual features with an advanced large language model can possess numerous advanced multi-modal abilities demonstrated by GPT-4, such as detailed image description generation and website creation from hand-drawn drafts.

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

lm-sys/fastchat NeurIPS 2023

Evaluating large language model (LLM) based chat assistants is challenging due to their broad capabilities and the inadequacy of existing benchmarks in measuring human preferences.

Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following

ziyuguo99/point-bind_point-llm 1 Sep 2023

We introduce Point-Bind, a 3D multi-modality model aligning point clouds with 2D image, language, audio, and video.

Muse: Text-To-Image Generation via Masked Generative Transformers

lucidrains/muse-pytorch 2 Jan 2023

Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding.

DeepTextMark: A Deep Learning-Driven Text Watermarking Approach for Identifying Large Language Model Generated Text

MindSpore-paper-code-3/code9 9 May 2023

As elaborated within the paper, these attributes are crucial for universal text source detection, with a particular emphasis in this paper on text produced by LLMs.

How is ChatGPT's behavior changing over time?

lchen001/llmdrift 18 Jul 2023

We find that the performance and behavior of both GPT-3. 5 and GPT-4 can vary greatly over time.

Efficient Memory Management for Large Language Model Serving with PagedAttention

vllm-project/vllm 12 Sep 2023

On top of it, we build vLLM, an LLM serving system that achieves (1) near-zero waste in KV cache memory and (2) flexible sharing of KV cache within and across requests to further reduce memory usage.