no code implementations • 25 Apr 2024 • Zhensu Sun, Xiaoning Du, Zhou Yang, Li Li, David Lo
Particularly, abundant grammar tokens and formatting tokens are included to make the code more readable to humans.
no code implementations • 17 Mar 2024 • Xin Zhou, DongGyun Han, David Lo
In addition, our experimental results confirm that the simple model and complex model are complementary to each other.
no code implementations • 26 Jan 2024 • Sicong Cao, Xiaobing Sun, Ratnadira Widyasari, David Lo, Xiaoxue Wu, Lili Bo, Jiale Zhang, Bin Li, Wei Liu, Di wu, Yixin Chen
The remarkable achievements of Artificial Intelligence (AI) algorithms, particularly in Machine Learning (ML) and Deep Learning (DL), have fueled their extensive deployment across multiple sectors, including Software Engineering (SE).
no code implementations • 8 Jan 2024 • Dat Nguyen, Hieu M. Vu, Cong-Thanh Le, Bach Le, David Lo, ThanhVu Nguyen, Corina Pasareanu
To tackle the challenge of varying input structures in GNNs, GNNInfer first identifies a set of representative influential structures that contribute significantly towards the prediction of a GNN.
no code implementations • 8 Jan 2024 • Wei Hung Pan, Ming Jie Chok, Jonathan Leong Shan Wong, Yung Xin Shin, Yeong Shian Poon, Zhou Yang, Chun Yong Chong, David Lo, Mei Kuan Lim
This is achieved by generating code in response to a given question using different variants.
no code implementations • 8 Sep 2023 • David Lo
For decades, much software engineering research has been dedicated to devising automated solutions aimed at enhancing developer productivity and elevating software quality.
1 code implementation • 21 Aug 2023 • Martin Weyssow, Xin Zhou, Kisub Kim, David Lo, Houari Sahraoui
In this paper, we deliver a comprehensive study of PEFT techniques for LLMs under the automated code generation scenario.
1 code implementation • 21 Aug 2023 • Xinyi Hou, Yanjie Zhao, Yue Liu, Zhou Yang, Kailong Wang, Li Li, Xiapu Luo, David Lo, John Grundy, Haoyu Wang
Nevertheless, a comprehensive understanding of the application, effects, and possible limitations of LLMs on SE is still in its early stages.
1 code implementation • 31 May 2023 • Terry Yue Zhuo, Zhou Yang, Zhensu Sun, YuFei Wang, Li Li, Xiaoning Du, Zhenchang Xing, David Lo
This paper fills this gap by conducting a comprehensive and integrative survey of data augmentation for source code, wherein we systematically compile and encapsulate existing literature to provide a comprehensive overview of the field.
1 code implementation • 23 May 2023 • Truong Giang Nguyen, Thanh Le-Cong, Hong Jin Kang, Ratnadira Widyasari, Chengran Yang, Zhipeng Zhao, Bowen Xu, Jiayuan Zhou, Xin Xia, Ahmed E. Hassan, Xuan-Bach D. Le, David Lo
To address these challenges and boost the effectiveness of prior works, we propose MiDas (Multi-Granularity Detector for Vulnerability Fixes).
no code implementations • 6 May 2023 • Martin Weyssow, Xin Zhou, Kisub Kim, David Lo, Houari Sahraoui
We demonstrate that the most commonly used fine-tuning technique from prior work is not robust enough to handle the dynamic nature of APIs, leading to the loss of previously acquired knowledge, i.e., catastrophic forgetting.
no code implementations • 8 Mar 2023 • Aftab Hussain, Md Rafiqul Islam Rabin, Bowen Xu, David Lo, Mohammad Amin Alipour
In this paper, we explore the impact of an unsupervised feature enrichment approach based on variable roles on the performance of neural models of code.
no code implementations • 16 Feb 2023 • Zichong Wang, Yang Zhou, Meikang Qiu, Israat Haque, Laura Brown, Yi He, Jianwu Wang, David Lo, Wenbin Zhang
The increasing use of Machine Learning (ML) software can lead to unfair and unethical decisions, thus fairness bugs in software are becoming a growing concern.
no code implementations • 14 Feb 2023 • Roman Belaire, Pradeep Varakantham, Thanh Nguyen, David Lo
We demonstrate that our approaches provide a significant improvement in performance across a wide variety of benchmarks against leading approaches for robust Deep RL.
1 code implementation • 11 Feb 2023 • Daniel Hao Xian Yuen, Andrew Yong Chen Pang, Zhou Yang, Chun Yong Chong, Mei Kuan Lim, David Lo
To address these limitations, our tool incorporates two novel features: (1) a text transformation module to boost the number of generated test cases and uncover more errors in ASR systems and (2) a phonetic analysis module to identify the phonemes on which the ASR system tends to produce errors.
Automatic Speech Recognition (ASR)
1 code implementation • 3 Jan 2023 • Thanh Le-Cong, Duc-Minh Luong, Xuan Bach D. Le, David Lo, Nhat-Hoa Tran, Bui Quang-Huy, Quyet-Thang Huynh
In case our approach fails to determine an overfitting patch based on invariants, INVALIDATOR utilizes a trained model from labeled patches to assess patch correctness based on program syntax.
1 code implementation • 12 Dec 2022 • Tiezhu Sun, Kevin Allix, Kisub Kim, Xin Zhou, Dongsun Kim, David Lo, Tegawendé F. Bissyandé, Jacques Klein
Central to applying ML to software artifacts (like source or executable code) is converting them into forms suitable for learning.
1 code implementation • 7 Oct 2022 • Chen Gong, Zhou Yang, Yunpeng Bai, Junda He, Jieke Shi, Kecen Li, Arunesh Sinha, Bowen Xu, Xinwen Hou, David Lo, Tianhao Wang
Our experiments conducted on four tasks and four offline RL algorithms expose a disquieting fact: none of the existing offline RL algorithms is immune to such a backdoor attack.
1 code implementation • 7 Sep 2022 • Truong Giang Nguyen, Thanh Le-Cong, Hong Jin Kang, Xuan-Bach D. Le, David Lo
The open-source software (OSS) vulnerability management process is increasingly important, as the number of discovered OSS vulnerabilities keeps growing over time.
1 code implementation • 7 Sep 2022 • Thanh Le-Cong, Hong Jin Kang, Truong Giang Nguyen, Stefanus Agus Haryono, David Lo, Xuan-Bach D. Le, Huynh Quyet Thang
Given a call graph constructed by traditional static analysis tools, AutoPruner takes a Transformer-based approach to capture the semantic relationships between the caller and callee functions associated with each edge in the call graph.
1 code implementation • 21 May 2022 • Rahul Yedida, Hong Jin Kang, Huy Tu, Xueqi Yang, David Lo, Tim Menzies
Automatically generated static code warnings suffer from a large number of false alarms.
no code implementations • 5 Apr 2022 • Mohammad Abdul Hadi, Imam Nur Bani Yusuf, Ferdian Thung, Kien Gia Luong, Jiang Lingxiao, Fatemeh H. Fard, David Lo
We have also identified two different tokenization approaches that can contribute to a significant boost in PTMs' performance for the API sequence generation task.
no code implementations • 5 Apr 2022 • Fuxiang Chen, Fatemeh Fard, David Lo, Timofey Bryksin
Furthermore, some programming languages are inherently different, and code written in one language usually cannot be interchanged with code in another; e.g., Ruby and Java code have very different structures.
no code implementations • 5 Apr 2022 • Rishab Sharma, Fuxiang Chen, Fatemeh Fard, David Lo
When identifiers' embeddings are used in CodeBERT, a code-based PLM, the performance is improved by 21-24% in the F1-score of clone detection.
no code implementations • 2 Mar 2022 • Jiri Gesi, SiQi Liu, Jiawei Li, Iftekhar Ahmed, Nachiappan Nagappan, David Lo, Eduardo Santana de Almeida, Pavneet Singh Kochhar, Lingfeng Bao
We found that our newly identified code smells are prevalent and impactful on the maintenance of DL systems from the developer's perspective.
no code implementations • 14 Nov 2021 • Kien Luong, Mohammad Hadi, Ferdian Thung, Fatemeh Fard, David Lo
Leveraging this observation, we develop FACOS, a context-specific algorithm to capture the semantic and syntactic information of the paragraphs and code snippets in a discussion.
no code implementations • 22 Feb 2021 • Zhiyuan Wan, Xin Xia, David Lo, Jiachi Chen, Xiapu Luo, Xiaohu Yang
Given numerous research efforts in addressing the security issues of smart contracts, we wondered how software practitioners build security into smart contracts in practice.
Software Engineering
1 code implementation • 14 Dec 2020 • Stefanus Agus Haryono, Ferdian Thung, David Lo, Lingxiao Jiang, Julia Lawall, Hong Jin Kang, Lucas Serrano, Gilles Muller
Usages of deprecated APIs in Android apps need to be updated to ensure the apps' compatibility with the old and new versions of Android OS.
Software Engineering
no code implementations • 10 Nov 2020 • Stefanus Agus Haryono, Ferdian Thung, David Lo, Julia Lawall, Lingxiao Jiang
In this paper, we built a tool to automate these updates.
Software Engineering
1 code implementation • 6 Oct 2020 • Yuanrui Fan, Xin Xia, David Lo, Ahmed E. Hassan, Shanping Li
Hence, in this study, we perform an empirical study on academic AI repositories to highlight good software engineering practices of popular academic AI repositories for AI researchers.
Software Engineering
no code implementations • 23 Aug 2020 • Cuiyun Gao, Jichuan Zeng, Zhiyuan Wen, David Lo, Xin Xia, Irwin King, Michael R. Lyu
Experiments on popular apps from Google Play and Apple's App Store demonstrate the effectiveness of MERIT in identifying emerging app issues, improving the state-of-the-art method by 22.3% in terms of F1-score.
no code implementations • 25 Jun 2020 • Chao Liu, Cuiyun Gao, Xin Xia, David Lo, John Grundy, Xiaohu Yang
Experimental results show the importance of replicability and reproducibility: in some cases the reported performance of a DL model could not be replicated because of an unstable optimization process.
no code implementations • 29 May 2020 • Chao Liu, Xin Xia, David Lo, Zhiwei Liu, Ahmed E. Hassan, Shanping Li
CodeMatcher first collects metadata for query words to identify irrelevant/noisy ones, then iteratively performs fuzzy search with the important query words on a codebase indexed by Elasticsearch, and finally reranks the returned candidate code snippets according to how well the tokens in each snippet sequentially match the important words in the query.
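The final reranking step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the tokenizer, the scoring formula, and all function names are assumptions made for illustration, and the Elasticsearch fuzzy-search stage is left out entirely (candidates are passed in as plain strings).

```python
import re

# Hypothetical noise-word list; the real tool derives this from query-word metadata.
STOP_WORDS = {"a", "an", "the", "to", "in", "of", "how", "for", "with"}


def important_words(query):
    """Keep only the query words that are not considered noise."""
    return [w for w in re.findall(r"[a-z0-9]+", query.lower()) if w not in STOP_WORDS]


def tokenize(code):
    """Split code into lowercase tokens, breaking camelCase and snake_case."""
    return [t.lower() for t in re.findall(r"[A-Za-z][a-z0-9]*|[0-9]+", code)]


def sequential_match_score(words, tokens):
    """Fraction of query words found in the tokens in query order (greedy scan)."""
    pos, matched = 0, 0
    for w in words:
        try:
            pos = tokens.index(w, pos) + 1  # next search starts after this match
            matched += 1
        except ValueError:
            continue  # word absent after the current position; skip it
    return matched / len(words) if words else 0.0


def rerank(query, candidates):
    """Order candidate snippets by how well they sequentially match the query."""
    words = important_words(query)
    return sorted(
        candidates,
        key=lambda c: sequential_match_score(words, tokenize(c)),
        reverse=True,
    )
```

For the query "how to read a file", a candidate such as `String readFile(String path)` matches both important words in order and is ranked ahead of candidates that contain only one of them or contain them in the wrong order.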
1 code implementation • 20 May 2020 • Zhipeng Gao, Xin Xia, John Grundy, David Lo, Yuan-Fang Li
Stack Overflow has been heavily used by software developers as a popular way to seek programming-related information from peers via the internet.
Software Engineering
no code implementations • 11 May 2020 • Roy Ka-Wei Lee, Thong Hoang, Richard J. Oentaryo, David Lo
The Act step then recommends to the user which activities to perform on the identified set of items.
1 code implementation • 10 Feb 2020 • Cuiyun Gao, Jichuan Zeng, Xin Xia, David Lo, Michael R. Lyu, Irwin King
Previous studies showed that replying to a user review usually has a positive effect on the rating that is given by the user to the app.
1 code implementation • 20 Jan 2020 • Zhipeng Gao, Lingxiao Jiang, Xin Xia, David Lo, John Grundy
However, many bugs and vulnerabilities have been identified in smart contracts, which raises serious concerns about smart contract security, not to mention that the blockchain systems on which the smart contracts are built can themselves be buggy.
Software Engineering
1 code implementation • 12 Dec 2019 • Xiao Liang Yu, Omar Al-Bataineh, David Lo, Abhik Roychoudhury
Our approach can be used to optimise the overall security and reliability of smart contracts against malicious attackers.
Software Engineering Cryptography and Security 68N15 D.1.2
no code implementations • 27 Oct 2019 • Vinoj Jayasundara, Nghi Duy Quoc Bui, Lingxiao Jiang, David Lo
Program comprehension is a fundamental task in software development and maintenance processes.
1 code implementation • 16 Sep 2019 • Zhongxin Liu, Xin Xia, Christoph Treude, David Lo, Shanping Li
We build a dataset with over 41K PRs and evaluate our approach on this dataset through ROUGE and a human evaluation.
Software Engineering
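For readers unfamiliar with the ROUGE metric mentioned in the entry above, the following is a minimal sketch of ROUGE-1 (unigram overlap) in Python. It assumes simple whitespace tokenization and lowercasing, which is a simplification; it is not the evaluation script used in the paper, and the function name is hypothetical.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Return (precision, recall, F1) based on clipped unigram overlap."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

A generated description is scored against the human-written one: precision is the share of generated words that appear in the reference, and recall is the share of reference words that the generated text covers.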
1 code implementation • 22 Aug 2019 • Zhipeng Gao, Vinoj Jayasundara, Lingxiao Jiang, Xin Xia, David Lo, John Grundy
In addition to the uses by individual developers, SmartEmbed can also be applied to studies of smart contracts in a large scale.
Software Engineering
no code implementations • 3 May 2019 • Amirreza Shirani, Bowen Xu, David Lo, Thamar Solorio, Amin Alipour
The proposed dataset Stack Overflow is a useful resource to develop novel solutions, specifically data-hungry neural network models, for the prediction of relatedness in technical community question-answering forums.
1 code implementation • 16 Feb 2019 • Thong Hoang, Julia Lawall, Richard J. Oentaryo, Yuan Tian, David Lo
This work proposes PatchNet, an automated tool based on hierarchical deep learning for classifying patches by extracting features from commit messages and code changes.
no code implementations • 27 Feb 2018 • Thong Hoang, Richard J. Oentaryo, Tien-Duy B. Le, David Lo
To help the developers debug, numerous information retrieval (IR)-based and spectrum-based bug localization techniques have been devised.
no code implementations • 1 May 2017 • Ferdian Thung, Richard J. Oentaryo, David Lo, Yuan Tian
In this light, we propose a new, automated approach called WebAPIRec that takes as input a project profile and outputs a ranked list of web APIs that can be used to implement the project.
no code implementations • 24 Jun 2016 • Richard J. Oentaryo, Ee-Peng Lim, Freddy Chong Tat Chua, Jia-Wei Low, David Lo
The abundance of user-generated data in social media has incentivized the development of methods to infer the latent attributes of users, which are crucially useful for personalization, advertising and recommendation.
no code implementations • 10 Jun 2016 • Daoyuan Li, Li Li, Dongsun Kim, Tegawendé F. Bissyandé, David Lo, Yves Le Traon
One single code change can significantly influence a wide range of software systems and their users.
Software Engineering
no code implementations • 27th IEEE/ACM International Conference on Automated Software Engineering 2013 • Anh Tuan Nguyen, Tung Thanh Nguyen, Tien N. Nguyen, David Lo, Chengnian Sun
Detecting duplicate bug reports helps reduce triaging efforts and save time for developers in fixing the same issues.