1 code implementation • 17 Sep 2022 • Surya Prakash Sahu, Madhurima Mandal, Shikhar Bharadwaj, Aditya Kanade, Petros Maniatis, Shirish Shevade
Compared to the existing datasets, in CodeQueries, the queries are about code semantics, the context is file level and the answers are code spans.
1 code implementation • 15 Aug 2022 • David Bieber, Kensen Shi, Petros Maniatis, Charles Sutton, Vincent Hellendoorn, Daniel Johnson, Daniel Tarlow
Graph representations of programs are commonly a central element of machine learning for code research.
1 code implementation • NeurIPS 2021 • Zimin Chen, Vincent Hellendoorn, Pascal Lamblin, Petros Maniatis, Pierre-Antoine Manzagol, Daniel Tarlow, Subhodeep Moitra
Machine learning for understanding and editing source code has recently attracted significant interest, with many developments in new models, new code representations, and new tasks. This proliferation can appear disparate and disconnected, making each approach seemingly unique and incompatible, thus obscuring the core machine learning challenges and contributions. In this work, we demonstrate that the landscape can be significantly simplified by taking a general approach of mapping a graph to a sequence of tokens and pointers. Our main result is to show that 16 recently published tasks of different shapes can be cast in this form, based on which a single model architecture achieves near or above state-of-the-art results on nearly all tasks, outperforming custom models like code2seq and alternative generic models like Transformers. This unification further enables multi-task learning and a series of cross-cutting experiments about the importance of different modeling choices for code understanding and repair tasks. The full framework, called PLUR, is easily extensible to more tasks, and will be open-sourced (https://github.com/google-research/plur).
1 code implementation • ICLR 2022 • Pardis Pashakhanloo, Aaditya Naik, Yuepeng Wang, Hanjun Dai, Petros Maniatis, Mayur Naik
Designing a suitable representation for code-reasoning tasks is challenging in aspects such as the kinds of program information to model, how to combine them, and how much context to consider.
1 code implementation • 26 Jun 2021 • Xinyun Chen, Petros Maniatis, Rishabh Singh, Charles Sutton, Hanjun Dai, Max Lin, Denny Zhou
In this work, we present the first approach for synthesizing spreadsheet formulas from tabular context, which includes both headers and semi-structured tabular data.
no code implementations • 19 Jun 2020 • Matej Balog, Rishabh Singh, Petros Maniatis, Charles Sutton
We present a new program synthesis approach that combines an encoder-decoder based synthesis architecture with a differentiable program fixer.
1 code implementation • ICLR 2020 • Vincent J. Hellendoorn, Charles Sutton, Rishabh Singh, Petros Maniatis, David Bieber
By studying a popular, non-trivial program repair task, variable-misuse identification, we explore the relative merits of traditional and hybrid model families for code representation.
2 code implementations • ICML 2020 • Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
We fine-tune CuBERT on our benchmark tasks, and compare the resulting models to different variants of Word2Vec token embeddings, BiLSTM and Transformer models, as well as published state-of-the-art models, showing that CuBERT outperforms them all, even with shorter training, and with fewer labeled examples.
no code implementations • 25 Sep 2019 • Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
A major advancement in natural-language understanding has been the use of pre-trained token embeddings; BERT and other works have further shown that pre-trained contextual embeddings can be extremely powerful and can be finetuned effectively for a variety of downstream supervised tasks.
2 code implementations • ICLR 2019 • Marko Vasic, Aditya Kanade, Petros Maniatis, David Bieber, Rishabh Singh
We show that it is beneficial to train a model that jointly and directly localizes and repairs variable-misuse bugs.
no code implementations • NeurIPS 2010 • Ling Huang, Jinzhu Jia, Bin Yu, Byung-Gon Chun, Petros Maniatis, Mayur Naik
Our two SPORE algorithms are able to build relationships between responses (e.g., the execution time of a computer program) and features, and select a few from hundreds of the retrieved features to construct an explicitly sparse and non-linear model to predict the response variable.