no code implementations • 13 Apr 2024 • MengNan Qi, Yufan Huang, Yongqiang Yao, Maoquan Wang, Bin Gu, Neel Sundaresan
Our experimental results reveal that after this pretraining, both Code Llama and StarCoder, two prevalent code-domain pretrained models, show significant improvements on our logically equivalent code selection task and on the code completion task.
no code implementations • 12 Dec 2023 • Yang Xu, Yongqiang Yao, Yufan Huang, MengNan Qi, Maoquan Wang, Bin Gu, Neel Sundaresan
Instruction tuning, a specialized technique for enhancing large language model (LLM) performance via instruction datasets, relies heavily on the quality of the data employed.
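To make the data-quality point concrete, here is a minimal sketch of an instruction-tuning record and a naive quality filter. The schema ("instruction", "input", "output") and the length threshold are illustrative assumptions, not the pipeline described in the paper.

```python
# Hypothetical instruction-tuning records; field names are assumptions.
records = [
    {"instruction": "Summarize the function.",
     "input": "def add(a, b): return a + b",
     "output": "Adds two numbers and returns their sum."},
    {"instruction": "Fix the bug.", "input": "", "output": "ok"},
]

def keep(rec, min_output_len=10):
    # Drop records with a missing instruction/output or a trivially short output.
    return all(rec.get(k) for k in ("instruction", "output")) \
        and len(rec["output"]) >= min_output_len

filtered = [r for r in records if keep(r)]
print(len(filtered))  # 1: the second record is filtered out
```

Even a crude filter like this illustrates why curation matters: a small fraction of degenerate records can dominate gradient signal during tuning.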
no code implementations • 22 Oct 2023 • MengNan Qi, Yufan Huang, Maoquan Wang, Yongqiang Yao, Zihan Liu, Bin Gu, Colin Clement, Neel Sundaresan
In this paper we introduce new metrics for programming language translation that address these basic syntax errors.
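As a rough illustration of a syntax-aware signal in this spirit (not the metrics defined in the paper), the sketch below scores a batch of translations by the fraction that parse as valid Python:

```python
import ast

def parse_rate(translations):
    """Fraction of translated snippets that are syntactically valid Python.

    A coarse, parse-based signal that penalizes basic syntax errors;
    illustrative only, not the paper's metric.
    """
    ok = 0
    for src in translations:
        try:
            ast.parse(src)
            ok += 1
        except SyntaxError:
            pass
    return ok / len(translations) if translations else 0.0

print(parse_rate(["def f(x):\n    return x + 1", "def g(:"]))  # 0.5
```

Unlike token-overlap metrics such as BLEU, a parse-based check immediately flags output that no compiler would accept.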
no code implementations • 17 Oct 2023 • Yufan Huang, MengNan Qi, Yongqiang Yao, Maoquan Wang, Bin Gu, Colin Clement, Neel Sundaresan
Distilled code serves as a translation pivot for any programming language, yielding by construction parallel corpora that scale to all available source code simply by applying the distillation compiler.
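The pivot idea can be sketched as a grouping step: snippets from different languages that distill to the same pivot form are paired as parallel examples. In the sketch below, `distill` is a hypothetical stand-in for the distillation compiler, and the pivot strings are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

def distill(snippet):
    # Hypothetical stand-in for the distillation compiler: in practice
    # this would compile the snippet to a shared intermediate form.
    return snippet["pivot"]

def build_parallel_pairs(snippets):
    # Group snippets by their distilled pivot; any two snippets in
    # different languages that share a pivot form a parallel pair.
    by_pivot = defaultdict(list)
    for s in snippets:
        by_pivot[distill(s)].append(s)
    pairs = []
    for group in by_pivot.values():
        for a, b in combinations(group, 2):
            if a["lang"] != b["lang"]:
                pairs.append((a["code"], b["code"]))
    return pairs

snippets = [
    {"lang": "python", "code": "a + b", "pivot": "ADD a b"},
    {"lang": "java", "code": "a + b;", "pivot": "ADD a b"},
]
print(build_parallel_pairs(snippets))  # [('a + b', 'a + b;')]
```

The appeal of this construction is that corpus size grows with whatever monolingual source code the distillation compiler can process, with no human-aligned translations required.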
no code implementations • SEMEVAL 2018 • Shiyun Chen, Maoquan Wang, Liang He
This paper presents our single-model submission to Subtask 1 of SemEval-2018 Task 2: Emoji Prediction in English.
no code implementations • SEMEVAL 2017 • Yufei Xie, Maoquan Wang, Jing Ma, Jian Jiang, Zhao Lu
In the main Subtask C, our primary submission was ranked fourth, with a MAP of 13.48 and an accuracy of 97.08.
no code implementations • SEMEVAL 2017 • Maoquan Wang, Shiyun Chen, Yufei Xie, Lu Zhao
This paper describes our approach for SemEval-2017 Task 4 - Sentiment Analysis in Twitter (SAT).