no code implementations • 6 Oct 2023 • Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens
In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.
no code implementations • 3 Oct 2023 • Benjamin Steenhoek, Michele Tufano, Neel Sundaresan, Alexey Svyatkovskiy
Software testing is a crucial aspect of software development, and the creation of high-quality tests that adhere to best practices is essential for effective maintenance.
1 code implementation • 8 May 2023 • Chenxiao Liu, Shuai Lu, Weizhu Chen, Daxin Jiang, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan, Nan Duan
Code execution is a fundamental aspect of programming language semantics that reflects the exact behavior of the code.
no code implementations • 29 Aug 2022 • Andrei Zlotchevski, Dawn Drain, Alexey Svyatkovskiy, Colin Clement, Neel Sundaresan, Michele Tufano
Large Transformer models have achieved state-of-the-art status for Natural Language Understanding tasks and are increasingly becoming the baseline model architecture for modeling source code.
no code implementations • 23 May 2022 • Xiaoyu Liu, Jinu Jang, Neel Sundaresan, Miltiadis Allamanis, Alexey Svyatkovskiy
This scenario motivates the code adaptation task -- a variant of program repair which aims to adapt variable identifiers in a pasted snippet of code to the surrounding, preexisting source code.
2 code implementations • 17 Mar 2022 • Zhiyu Li, Shuai Lu, Daya Guo, Nan Duan, Shailesh Jannu, Grant Jenks, Deep Majumder, Jared Green, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan
In this research, we focus on utilizing pre-training techniques for the tasks in the code review scenario.
1 code implementation • ACL 2022 • Shuai Lu, Nan Duan, Hojae Han, Daya Guo, Seung-won Hwang, Alexey Svyatkovskiy
Code completion, which aims to predict the following code token(s) according to the code context, can improve the productivity of software development.
no code implementations • EMNLP 2021 • Colin B. Clement, Shuai Lu, Xiaoyu Liu, Michele Tufano, Dawn Drain, Nan Duan, Neel Sundaresan, Alexey Svyatkovskiy
While there are many efforts to extend the context window, we introduce an architecture-independent approach that leverages the syntactic hierarchies of source code to incorporate entire file-level context into a fixed-length window.
1 code implementation • 31 Aug 2021 • Alexey Svyatkovskiy, Sarah Fakhoury, Negar Ghorbani, Todd Mytkowicz, Elizabeth Dinella, Christian Bird, Jinu Jang, Neel Sundaresan, Shuvendu Lahiri
Our model achieves 63-68% accuracy for merge resolution synthesis, yielding nearly a 3x performance improvement over existing semi-structured merge tools and a 2x improvement over neural program merge tools.
no code implementations • ICLR 2022 • Daya Guo, Alexey Svyatkovskiy, Jian Yin, Nan Duan, Marc Brockschmidt, Miltiadis Allamanis
To evaluate models, we consider both ROUGE and a new metric, RegexAcc, which measures success at generating completions that match long outputs with as few holes as possible.
no code implementations • 16 Apr 2021 • Dawn Drain, Chen Wu, Alexey Svyatkovskiy, Neel Sundaresan
In this work we introduce DeepDebug: a data-driven program repair approach which learns to detect and fix bugs in Java methods mined from real-world GitHub repositories.
4 code implementations • 9 Feb 2021 • Shuai Lu, Daya Guo, Shuo Ren, JunJie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Duyu Tang, Ge Li, Lidong Zhou, Linjun Shou, Long Zhou, Michele Tufano, Ming Gong, Ming Zhou, Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shujie Liu
Benchmark datasets have a significant impact on accelerating research in programming language tasks.
Ranked #1 on Cloze Test on CodeXGLUE - CT-maxmin
no code implementations • EMNLP 2020 • Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan
Simultaneously modeling source code and natural language has many exciting applications in automated software development and understanding.
1 code implementation • ICLR 2021 • Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, Ming Zhou
Instead of using a syntactic-level structure of code such as the abstract syntax tree (AST), we use data flow in the pre-training stage, a semantic-level structure of code that encodes the "where-the-value-comes-from" relation between variables.
Ranked #3 on Type prediction on ManyTypes4TypeScript
no code implementations • 11 Sep 2020 • Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, Neel Sundaresan
In this paper we present an approach to support developers in writing unit test cases by generating accurate and useful assert statements.
1 code implementation • 11 Sep 2020 • Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, Shao Kun Deng, Neel Sundaresan
We execute the test cases, collect test coverage information, and compare them with test cases generated by EvoSuite and GPT-3, finding that our approach outperforms GPT-3 and has comparable coverage w.r.t.
no code implementations • 16 May 2020 • Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
In software development through integrated development environments (IDEs), code completion is one of the most widely used features.
no code implementations • 28 Apr 2020 • Alexey Svyatkovskiy, Sebastian Lee, Anna Hadjitofi, Maik Riechert, Juliana Franco, Miltiadis Allamanis
Code completion is one of the most widely used features of modern integrated development environments (IDEs).
no code implementations • 2 Dec 2019 • Alexey Svyatkovskiy, Kosuke Imai, Mary Kroeger, Yuki Shiraito
In this paper, we evaluate Apache Spark for a data-intensive machine learning problem.
no code implementations • 30 Nov 2019 • Alexey Svyatkovskiy, Julian Kates-Harbeck, William Tang
In this paper, we evaluate training of deep recurrent neural networks with half-precision floats.
1 code implementation • 29 Nov 2019 • Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan
In this paper, we propose a novel end-to-end approach for AI-assisted code completion called Pythia.
1 code implementation • 31 Oct 2017 • Oliver Gutsche, Luca Canali, Illia Cremer, Matteo Cremonesi, Peter Elmer, Ian Fisk, Maria Girone, Bo Jayatilaka, Jim Kowalkowski, Viktor Khristenko, Evangelos Motesnitsalis, Jim Pivarski, Saba Sehrish, Kacper Surdy, Alexey Svyatkovskiy
We present the progress of this 2-year project, with first results from scaling up Spark-based HEP analysis.
Distributed, Parallel, and Cluster Computing