TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Program Repair	DeepFix	DrRepair + BIFI	Average Success Rate	71.7	#1
Program Repair	GitHub-Python	Transformer	Accuracy (%)	62.0	#2
Program Repair	GitHub-Python	Transformer + BIFI	Accuracy (%)	90.5	#1

Break-It-Fix-It: Unsupervised Learning for Program Repair

11 Jun 2021 · Michihiro Yasunaga, Percy Liang

We consider repair tasks: given a critic (e.g., compiler) that assesses the quality of an input, the goal is to train a fixer that converts a bad example (e.g., code with syntax errors) into a good one (e.g., code with no syntax errors). Existing works create training data consisting of (bad, good) pairs by corrupting good examples using heuristics (e.g., dropping tokens). However, fixers trained on this synthetically-generated data do not extrapolate well to the real distribution of bad inputs. To bridge this gap, we propose a new training approach, Break-It-Fix-It (BIFI), which has two key ideas: (i) we use the critic to check a fixer's output on real bad inputs and add good (fixed) outputs to the training data, and (ii) we train a breaker to generate realistic bad code from good code. Based on these ideas, we iteratively update the breaker and the fixer while using them in conjunction to generate more paired data. We evaluate BIFI on two code repair datasets: GitHub-Python, a new dataset we introduce where the goal is to repair Python code with AST parse errors; and DeepFix, where the goal is to repair C code with compiler errors. BIFI outperforms existing methods, obtaining 90.5% repair accuracy on GitHub-Python (+28.5%) and 71.7% on DeepFix (+5.6%). Notably, BIFI does not require any labeled data; we hope it will be a strong starting point for unsupervised learning of various repair tasks.
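The two data-generation ideas in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the critic is approximated with Python's `ast.parse` (in line with the GitHub-Python setting), while `toy_fixer`, `toy_breaker`, `collect_fixer_pairs`, and `collect_breaker_pairs` are hypothetical names introduced here. The learned fixer and breaker are replaced by hand-written rules, and the training steps between rounds are omitted. Keeping only breaker outputs that the critic rejects is also an assumption of this sketch.

```python
import ast
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (bad_code, good_code)


def critic(code: str) -> bool:
    """Return True if the code parses as Python (stand-in for the critic)."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


def collect_fixer_pairs(fixer: Callable[[str], str], real_bad: List[str]) -> List[Pair]:
    """Idea (i): run the fixer on real bad inputs, keep outputs the critic accepts."""
    pairs = []
    for bad in real_bad:
        candidate = fixer(bad)
        if critic(candidate):
            pairs.append((bad, candidate))
    return pairs


def collect_breaker_pairs(breaker: Callable[[str], str], real_good: List[str]) -> List[Pair]:
    """Idea (ii): run the breaker on good code, keep outputs the critic rejects."""
    pairs = []
    for good in real_good:
        candidate = breaker(good)
        if not critic(candidate):
            pairs.append((candidate, good))
    return pairs


# Hypothetical rule-based stand-ins for the learned models.
def toy_fixer(code: str) -> str:
    # Append any missing closing parentheses.
    missing = code.count("(") - code.count(")")
    return code + ")" * max(missing, 0)


def toy_breaker(code: str) -> str:
    # Drop the last closing parenthesis, if there is one.
    idx = code.rfind(")")
    return code if idx == -1 else code[:idx] + code[idx + 1:]


real_bad = ["print(len('abc')", "x = = 1"]
real_good = ["print(max(1, 2))", "y = 3"]

fixer_pairs = collect_fixer_pairs(toy_fixer, real_bad)
breaker_pairs = collect_breaker_pairs(toy_breaker, real_good)
```

In a full round, the verified pairs from both steps would be added to the training data and the fixer and breaker retrained before repeating. Here `"x = = 1"` yields no pair because the toy fixer cannot repair it, and `"y = 3"` yields none because the toy breaker leaves it valid.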

Code

michiyasunaga/bifi official

Tasks

C++ code

Code Repair

Data Augmentation

Domain Adaptation

Program Repair

Style Transfer

Unsupervised Machine Translation

Datasets

Introduced in the Paper:

GitHub-Python

Used in the Paper:

DeepFix

Results from the Paper

Ranked #1 on Program Repair on DeepFix

Methods

Absolute Position Encodings • Adam • BPE • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Repair • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
