CodeT5+: Open Code Large Language Models for Code Understanding and Generation

13 May 2023 · Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D. Q. Bui, Junnan Li, Steven C. H. Hoi

Large language models (LLMs) pretrained on vast source code have achieved prominent progress in code intelligence. However, existing code LLMs have two main limitations in terms of architecture and pretraining tasks. First, they often adopt a specific architecture (encoder-only or decoder-only) or rely on a unified encoder-decoder network for all downstream tasks. The former paradigm is limited by inflexibility in applications, while in the latter, the model is treated as a single system for all tasks, leading to suboptimal performance on a subset of them. Second, they often employ a limited set of pretraining objectives which might not be relevant to some downstream tasks and hence result in substantial performance degradation. To address these limitations, we propose "CodeT5+", a family of encoder-decoder LLMs for code in which component modules can be flexibly combined to suit a wide range of downstream code tasks. Such flexibility is enabled by our proposed mixture of pretraining objectives, which mitigates the pretrain-finetune discrepancy. These objectives cover span denoising, contrastive learning, text-code matching, and causal LM pretraining tasks, on both unimodal and bimodal multilingual code corpora. Furthermore, we propose to initialize CodeT5+ with frozen off-the-shelf LLMs rather than training from scratch to efficiently scale up our models, and explore instruction tuning to align with natural language instructions. We extensively evaluate CodeT5+ on over 20 code-related benchmarks in different settings, including zero-shot, finetuning, and instruction tuning. We observe state-of-the-art (SoTA) model performance on various code-related tasks, such as code generation and completion, math programming, and text-to-code retrieval. In particular, our instruction-tuned CodeT5+ 16B achieves new SoTA results on the HumanEval code generation task against other open code LLMs.
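To make one of these objectives concrete, below is a minimal sketch of T5-style span denoising applied to code. It is not the authors' released code: it assumes whitespace-tokenized input and T5-style `<extra_id_*>` sentinel tokens, and the masking hyperparameters are illustrative rather than the paper's settings.

```python
import random

def span_denoise(tokens, mask_rate=0.15, mean_span=3, seed=0):
    """T5-style span corruption: returns (corrupted input, denoising target).

    Illustrative sketch only; mask_rate and mean_span are assumptions,
    not the hyperparameters used for CodeT5+.
    """
    rng = random.Random(seed)
    inp, tgt = [], []
    i, sid = 0, 0
    while i < len(tokens):
        # Start a masked span with probability mask_rate / mean_span,
        # so roughly mask_rate of all tokens end up masked on average.
        if rng.random() < mask_rate / mean_span:
            span = max(1, round(rng.expovariate(1.0 / mean_span)))
            sentinel = f"<extra_id_{sid}>"
            sid += 1
            inp.append(sentinel)           # encoder sees only the sentinel
            tgt.append(sentinel)           # decoder reconstructs the span
            tgt.extend(tokens[i:i + span])
            i += span
        else:
            inp.append(tokens[i])
            i += 1
    tgt.append(f"<extra_id_{sid}>")        # closing sentinel, T5 convention
    return " ".join(inp), " ".join(tgt)

code = "def add ( a , b ) : return a + b".split()
src, tgt = span_denoise(code)
print(src)  # e.g. "def add ( a , <extra_id_0> : return a + b"
print(tgt)  # e.g. "<extra_id_0> b ) <extra_id_1>"
```

The contrastive, text-code matching, and causal LM objectives are trained alongside this one, which is what lets the encoder, decoder, or both be reused downstream.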


Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Code Search | CodeSearchNet | CodeT5+ 770M | Overall | 77.4 | #3 |
| Code Search | CodeSearchNet | CodeT5+ 770M | Go | 92.7 | #3 |
| Code Search | CodeSearchNet | CodeT5+ 770M | Ruby | 78.0 | #3 |
| Code Search | CodeSearchNet | CodeT5+ 770M | Python | 75.8 | #5 |
| Code Search | CodeSearchNet | CodeT5+ 770M | Java | 76.2 | #4 |
| Code Search | CodeSearchNet | CodeT5+ 770M | JS | 71.3 | #4 |
| Code Search | CodeSearchNet | CodeT5+ 770M | PHP | 70.1 | #5 |
| Code Search | CodeSearchNet | CodeT5+ 220M | Overall | 77.1 | #5 |
| Code Search | CodeSearchNet | CodeT5+ 220M | Go | 92.4 | #4 |
| Code Search | CodeSearchNet | CodeT5+ 220M | Ruby | 77.7 | #4 |
| Code Search | CodeSearchNet | CodeT5+ 220M | Python | 75.6 | #6 |
| Code Search | CodeSearchNet | CodeT5+ 220M | Java | 76.1 | #5 |
| Code Search | CodeSearchNet | CodeT5+ 220M | JS | 70.8 | #6 |
| Code Search | CodeSearchNet | CodeT5+ 220M | PHP | 69.8 | #6 |
| Code Search | CodeXGLUE - AdvTest | CodeT5+ 220M | MRR | 43.3 | #2 |
| Code Search | CodeXGLUE - AdvTest | CodeT5+ 770M | MRR | 44.7 | #1 |
| Code Summarization | CodeXGLUE - CodeSearchNet | CodeT5+ 770M | Ruby | 15.63 | #2 |
| Code Summarization | CodeXGLUE - CodeSearchNet | CodeT5+ 770M | Javascript | 17.93 | #1 |
| Code Summarization | CodeXGLUE - CodeSearchNet | CodeT5+ 770M | Python | 20.47 | #2 |
| Code Summarization | CodeXGLUE - CodeSearchNet | CodeT5+ 770M | Java | 20.83 | #2 |
| Code Summarization | CodeXGLUE - CodeSearchNet | CodeT5+ 770M | PHP | 26.39 | #3 |
| Code Summarization | CodeXGLUE - CodeSearchNet | CodeT5+ 220M | Ruby | 15.51 | #3 |
| Code Summarization | CodeXGLUE - CodeSearchNet | CodeT5+ 220M | Javascript | 16.27 | #2 |
| Code Summarization | CodeXGLUE - CodeSearchNet | CodeT5+ 220M | Go | 19.60 | #2 |
| Code Summarization | CodeXGLUE - CodeSearchNet | CodeT5+ 220M | Python | 20.16 | #4 |
| Code Summarization | CodeXGLUE - CodeSearchNet | CodeT5+ 220M | Java | 20.53 | #3 |
| Code Summarization | CodeXGLUE - CodeSearchNet | CodeT5+ 220M | PHP | 26.78 | #1 |
| Code Completion | CodeXGLUE - Github Java Corpus | CodeT5+ 220M | EM (line-level) | 35.17 | #2 |
| Code Completion | CodeXGLUE - Github Java Corpus | CodeT5+ 220M | Edit Sim (line-level) | 69.48 | #2 |
| Code Completion | CodeXGLUE - Github Java Corpus | CodeT5+ 770M | EM (line-level) | 37.90 | #1 |
| Code Completion | CodeXGLUE - Github Java Corpus | CodeT5+ 770M | Edit Sim (line-level) | 72.25 | #1 |
| Code Completion | CodeXGLUE - PY150 | CodeT5+ 770M | EM (line-level) | 44.86 | #1 |
| Code Completion | CodeXGLUE - PY150 | CodeT5+ 770M | Edit Sim (line-level) | 74.22 | #1 |
| Code Completion | CodeXGLUE - PY150 | CodeT5+ 220M | EM (line-level) | 43.42 | #2 |
| Code Completion | CodeXGLUE - PY150 | CodeT5+ 220M | Edit Sim (line-level) | 73.69 | #2 |
| Arithmetic Reasoning | GSM8K | CodeT5+ | Accuracy | 73.8 | #84 |
| Arithmetic Reasoning | GSM8K | CodeT5+ | Parameters (Billion) | 0.77 | #6 |
| Code Generation | HumanEval | CodeT5+ 220M (zero-shot) | Pass@1 | 12.0 | #118 |
| Code Generation | HumanEval | CodeT5+ 770M (zero-shot) | Pass@1 | 15.5 | #110 |
| Code Generation | HumanEval | CodeT5+ 2B (zero-shot) | Pass@1 | 24.2 | #90 |
| Code Generation | HumanEval | CodeT5+ 6B (zero-shot) | Pass@1 | 28.0 | #86 |
| Code Generation | HumanEval | CodeT5+ 16B (zero-shot) | Pass@1 | 30.9 | #76 |
| Code Generation | HumanEval | CodeT5+ 16B (CodeT) | Pass@1 | 38.5 | #63 |
| Code Generation | HumanEval | InstructCodeT5+ 16B (CodeT) | Pass@1 | 42.9 | #57 |
| Code Generation | HumanEval | InstructCodeT5+ 16B (zero-shot) | Pass@1 | 35.0 | #70 |
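Pass@k on HumanEval is conventionally reported with the unbiased estimator introduced in the Codex paper (Chen et al., 2021): generate n samples per problem, count the c that pass the unit tests, and compute the probability that at least one of k draws is correct. A minimal sketch follows; the sample counts in the usage example are illustrative, not taken from this paper.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: 1 - C(n - c, k) / C(n, k).

    n: total samples generated for a problem
    c: samples that pass the unit tests
    k: number of draws considered
    """
    if n - c < k:
        return 1.0  # every size-k draw contains a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative numbers: 200 samples per problem, 62 passing.
print(pass_at_k(200, 62, 1))   # 0.31, i.e. Pass@1 of 31.0
print(pass_at_k(200, 62, 10))  # probability a batch of 10 contains a pass
```

For k = 1 this reduces to c / n, the fraction of passing samples, but the combinatorial form gives a lower-variance estimate for larger k than naively resampling.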
