TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Code Generation	CoNaLa	RoBERTaMarian	BLEU	35.74	# 2
Code Generation	CoNaLa	RoBERTaMarian	Exact Match Accuracy	13.8	# 1
Code Generation	CoNaLa	LUKEMarian	BLEU	29.83	# 12
Code Generation	CoNaLa	LUKEMarian	Exact Match Accuracy	7.6	# 5
Code Generation	CoNaLa	ELECTRAMarian	BLEU	30.18	# 10
Code Generation	CoNaLa	ELECTRAMarian	Exact Match Accuracy	10.0	# 4
Code Generation	CoNaLa	BERTMarian	BLEU	32.46	# 6
Code Generation	CoNaLa	BERTMarian	Exact Match Accuracy	12.40	# 2
Code Generation	Django	LUKEMarian	Accuracy	78.50	# 5
Code Generation	Django	LUKEMarian	BLEU Score	89.34	# 2
Code Generation	Django	RoBERTaMarian	Accuracy	77.95	# 6
Code Generation	Django	RoBERTaMarian	BLEU Score	88.91	# 3
Code Generation	Django	ELECTRAMarian	Accuracy	65.32	# 9
Code Generation	Django	ELECTRAMarian	BLEU Score	53.02	# 7
Code Generation	Django	BERTMarian	Accuracy	76.68	# 7
Code Generation	Django	BERTMarian	BLEU Score	56.55	# 6

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/leveraging-pre-trained-language-models-for-3/code-generation-on-conala)](https://paperswithcode.com/sota/code-generation-on-conala?p=leveraging-pre-trained-language-models-for-3)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/leveraging-pre-trained-language-models-for-3/code-generation-on-django)](https://paperswithcode.com/sota/code-generation-on-django?p=leveraging-pre-trained-language-models-for-3)`

Leveraging pre-trained language models for code generation

Complex & Intelligent Systems 2024 · Ahmed Soliman, Samir Shaheen, Mayada Hadhoud

Code assistance refers to the use of various tools, techniques, and models to help developers in the process of software development. As coding tasks become increasingly complex, code assistance plays a pivotal role in enhancing developer productivity, reducing errors, and facilitating a more efficient coding workflow. This assistance can take various forms, including code autocompletion, error detection and correction, code generation, documentation support, and context-aware suggestions. Language models have emerged as integral components of code assistance, offering developers the capability to receive intelligent suggestions, generate code snippets, and enhance overall coding proficiency. In this paper, we propose new hybrid models for code generation by leveraging the pre-trained language models BERT, RoBERTa, ELECTRA, and LUKE with the Marian Causal Language Model. We selected these models based on their strong performance in various natural language processing tasks. We evaluate the performance of these models on two datasets, CoNaLa and DJANGO, and compare them to existing state-of-the-art models. We aim to investigate the potential of pre-trained transformer language models to revolutionize code generation, offering improved precision and efficiency in navigating complex coding scenarios. Additionally, we conduct error analysis and refine the generated code. Our results show that these models, when combined with the Marian Decoder, significantly improve code generation accuracy and efficiency. Notably, the RoBERTaMarian model achieved a maximum BLEU score of 35.74 and an exact match accuracy of 13.8% on CoNaLa, while LUKEMarian attained a BLEU score of 89.34 and an exact match accuracy of 78.50% on DJANGO. Implementation of this work is available at https://github.com/AhmedSSoliman/Leveraging-Pretrained-Language-Models-for-Code-Generation.
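
The abstract reports two metrics, BLEU and exact match accuracy. As a rough illustration of what these numbers measure, the sketch below computes both over lists of predicted and reference snippets. It is not the authors' evaluation script: the whitespace tokenizer, the single-reference setup, the absence of smoothing, and the function names `exact_match_accuracy` and `corpus_bleu` are all assumptions made for this example, so scores from it would not necessarily match the values in the table.

```python
import math
from collections import Counter


def tokenize(code):
    # Assumption: plain whitespace tokenization; the paper's tokenizer may differ.
    return code.split()


def exact_match_accuracy(predictions, references):
    """Percentage of predictions equal to their reference after whitespace normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(tokenize(p) == tokenize(r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(predictions)


def ngrams(tokens, n):
    # Count every contiguous n-gram in the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(predictions, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale, one reference per example, no smoothing."""
    matched = [0] * max_n
    total = [0] * max_n
    pred_len = ref_len = 0
    for pred, ref in zip(predictions, references):
        p, r = tokenize(pred), tokenize(ref)
        pred_len += len(p)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            p_counts, r_counts = ngrams(p, n), ngrams(r, n)
            # Counter intersection clips each n-gram count by its reference count.
            matched[n - 1] += sum((p_counts & r_counts).values())
            total[n - 1] += sum(p_counts.values())
    # Without smoothing, any n-gram order with zero matches gives a score of zero.
    if pred_len == 0 or min(matched) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Brevity penalty applies only when the predictions are shorter than the references.
    brevity = 1.0 if pred_len >= ref_len else math.exp(1 - ref_len / pred_len)
    return 100.0 * brevity * math.exp(log_precision)
```

Exact match is the stricter of the two: a prediction that differs from its reference by a single token counts as a miss, while BLEU still credits the overlapping n-grams. That difference is one plausible reason the CoNaLa exact match figures sit well below the BLEU figures.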

Code

AhmedSSoliman/Leveraging-Pretrained…

Tasks

Code Generation

Language Modelling

Datasets

CoNaLa

Django

Results from the Paper

Ranked #2 on Code Generation on CoNaLa

Methods

Adam • Attention Dropout • BERT • Dense Connections • Dropout • ELECTRA • GELU • Layer Normalization • Linear Layer • Linear Warmup With Linear Decay • Multi-Head Attention • Residual Connection • RoBERTa • Scaled Dot-Product Attention • Softmax • Weight Decay • WordPiece
