Search Results for author: David Lo

Found 48 papers, 20 papers with code

AI Coders Are Among Us: Rethinking Programming Language Grammar Towards Efficient Code Generation

no code implementations • 25 Apr 2024 • Zhensu Sun, Xiaoning Du, Zhou Yang, Li Li, David Lo

Particularly, abundant grammar tokens and formatting tokens are included to make the code more readable to humans.

Code Generation Math

Bridging Expert Knowledge with Deep Learning Techniques for Just-In-Time Defect Prediction

no code implementations • 17 Mar 2024 • Xin Zhou, DongGyun Han, David Lo

In addition, our experimental results confirm that the simple model and complex model are complementary to each other.

A Systematic Literature Review on Explainability for Machine/Deep Learning-based Software Engineering Research

no code implementations • 26 Jan 2024 • Sicong Cao, Xiaobing Sun, Ratnadira Widyasari, David Lo, Xiaoxue Wu, Lili Bo, Jiale Zhang, Bin Li, Wei Liu, Di Wu, Yixin Chen

The remarkable achievements of Artificial Intelligence (AI) algorithms, particularly in Machine Learning (ML) and Deep Learning (DL), have fueled their extensive deployment across multiple sectors, including Software Engineering (SE).

Decision Making Vulnerability Detection

Inferring Properties of Graph Neural Networks

no code implementations • 8 Jan 2024 • Dat Nguyen, Hieu M. Vu, Cong-Thanh Le, Bach Le, David Lo, ThanhVu Nguyen, Corina Pasareanu

To tackle the challenge of varying input structures in GNNs, GNNInfer first identifies a set of representative influential structures that contribute significantly towards the prediction of a GNN.

Backdoor Attack

Assessing AI Detectors in Identifying AI-Generated Code: Implications for Education

no code implementations • 8 Jan 2024 • Wei Hung Pan, Ming Jie Chok, Jonathan Leong Shan Wong, Yung Xin Shin, Yeong Shian Poon, Zhou Yang, Chun Yong Chong, David Lo, Mei Kuan Lim

This is achieved by generating code in response to a given question using different variants.

Trustworthy and Synergistic Artificial Intelligence for Software Engineering: Vision and Roadmaps

no code implementations • 8 Sep 2023 • David Lo

For decades, much software engineering research has been dedicated to devising automated solutions aimed at enhancing developer productivity and elevating software quality.

Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models

1 code implementation • 21 Aug 2023 • Martin Weyssow, Xin Zhou, Kisub Kim, David Lo, Houari Sahraoui

In this paper, we deliver a comprehensive study of PEFT techniques for LLMs under the automated code generation scenario.

Code Generation In-Context Learning +1

Large Language Models for Software Engineering: A Systematic Literature Review

1 code implementation • 21 Aug 2023 • Xinyi Hou, Yanjie Zhao, Yue Liu, Zhou Yang, Kailong Wang, Li Li, Xiapu Luo, David Lo, John Grundy, Haoyu Wang

Nevertheless, a comprehensive understanding of the application, effects, and possible limitations of LLMs on SE is still in its early stages.

Source Code Data Augmentation for Deep Learning: A Survey

1 code implementation • 31 May 2023 • Terry Yue Zhuo, Zhou Yang, Zhensu Sun, YuFei Wang, Li Li, Xiaoning Du, Zhenchang Xing, David Lo

This paper fills this gap by conducting a comprehensive and integrative survey of data augmentation for source code, wherein we systematically compile and encapsulate existing literature to provide a comprehensive overview of the field.

Data Augmentation

Multi-Granularity Detector for Vulnerability Fixes

1 code implementation • 23 May 2023 • Truong Giang Nguyen, Thanh Le-Cong, Hong Jin Kang, Ratnadira Widyasari, Chengran Yang, Zhipeng Zhao, Bowen Xu, Jiayuan Zhou, Xin Xia, Ahmed E. Hassan, Xuan-Bach D. Le, David Lo

To address these challenges and boost the effectiveness of prior works, we propose MiDas (Multi-Granularity Detector for Vulnerability Fixes).

On the Usage of Continual Learning for Out-of-Distribution Generalization in Pre-trained Language Models of Code

no code implementations • 6 May 2023 • Martin Weyssow, Xin Zhou, Kisub Kim, David Lo, Houari Sahraoui

We demonstrate that the most commonly used fine-tuning technique from prior work is not robust enough to handle the dynamic nature of APIs, leading to the loss of previously acquired knowledge, i.e., catastrophic forgetting.

Continual Learning General Knowledge +1

A Study of Variable-Role-based Feature Enrichment in Neural Models of Code

no code implementations • 8 Mar 2023 • Aftab Hussain, Md Rafiqul Islam Rabin, Bowen Xu, David Lo, Mohammad Amin Alipour

In this paper, we explore the impact of an unsupervised feature enrichment approach based on variable roles on the performance of neural models of code.

Feature Engineering

Towards Fair Machine Learning Software: Understanding and Addressing Model Bias Through Counterfactual Thinking

no code implementations • 16 Feb 2023 • Zichong Wang, Yang Zhou, Meikang Qiu, Israat Haque, Laura Brown, Yi He, Jianwu Wang, David Lo, Wenbin Zhang

The increasing use of Machine Learning (ML) software can lead to unfair and unethical decisions, thus fairness bugs in software are becoming a growing concern.

Benchmarking counterfactual +1

Regret-Based Defense in Adversarial Reinforcement Learning

no code implementations • 14 Feb 2023 • Roman Belaire, Pradeep Varakantham, Thanh Nguyen, David Lo

We demonstrate that our approaches provide a significant improvement in performance across a wide variety of benchmarks against leading approaches for robust Deep RL.

reinforcement-learning Reinforcement Learning (RL)

ASDF: A Differential Testing Framework for Automatic Speech Recognition Systems

1 code implementation • 11 Feb 2023 • Daniel Hao Xian Yuen, Andrew Yong Chen Pang, Zhou Yang, Chun Yong Chong, Mei Kuan Lim, David Lo

To address these limitations, our tool incorporates two novel features: (1) a text transformation module to boost the number of generated test cases and uncover more errors in ASR systems and (2) a phonetic analysis module to identify the phonemes on which the ASR system tends to produce errors.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Invalidator: Automated Patch Correctness Assessment via Semantic and Syntactic Reasoning

1 code implementation • 3 Jan 2023 • Thanh Le-Cong, Duc-Minh Luong, Xuan Bach D. Le, David Lo, Nhat-Hoa Tran, Bui Quang-Huy, Quyet-Thang Huynh

In case our approach fails to determine an overfitting patch based on invariants, INVALIDATOR utilizes a trained model from labeled patches to assess patch correctness based on program syntax.

Language Modelling Program Repair

DexBERT: Effective, Task-Agnostic and Fine-grained Representation Learning of Android Bytecode

1 code implementation • 12 Dec 2022 • Tiezhu Sun, Kevin Allix, Kisub Kim, Xin Zhou, Dongsun Kim, David Lo, Tegawendé F. Bissyandé, Jacques Klein

Central to applying ML to software artifacts (like source or executable code) is converting them into forms suitable for learning.

Language Modelling Representation Learning

BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets

1 code implementation • 7 Oct 2022 • Chen Gong, Zhou Yang, Yunpeng Bai, Junda He, Jieke Shi, Kecen Li, Arunesh Sinha, Bowen Xu, Xinwen Hou, David Lo, Tianhao Wang

Our experiments conducted on four tasks and four offline RL algorithms expose a disquieting fact: none of the existing offline RL algorithms is immune to such a backdoor attack.

Autonomous Driving Backdoor Attack +3

VulCurator: A Vulnerability-Fixing Commit Detector

1 code implementation • 7 Sep 2022 • Truong Giang Nguyen, Thanh Le-Cong, Hong Jin Kang, Xuan-Bach D. Le, David Lo

The open-source software (OSS) vulnerability management process is increasingly important, as the number of discovered OSS vulnerabilities is growing over time.

Management

AutoPruner: Transformer-Based Call Graph Pruning

1 code implementation • 7 Sep 2022 • Thanh Le-Cong, Hong Jin Kang, Truong Giang Nguyen, Stefanus Agus Haryono, David Lo, Xuan-Bach D. Le, Huynh Quyet Thang

Given a call graph constructed by traditional static analysis tools, AutoPruner takes a Transformer-based approach to capture the semantic relationships between the caller and callee functions associated with each edge in the call graph.

How to Find Actionable Static Analysis Warnings: A Case Study with FindBugs

1 code implementation • 21 May 2022 • Rahul Yedida, Hong Jin Kang, Huy Tu, Xueqi Yang, David Lo, Tim Menzies

Automatically generated static code warnings suffer from a large number of false alarms.

On the Effectiveness of Pretrained Models for API Learning

no code implementations • 5 Apr 2022 • Mohammad Abdul Hadi, Imam Nur Bani Yusuf, Ferdian Thung, Kien Gia Luong, Jiang Lingxiao, Fatemeh H. Fard, David Lo

We have also identified two different tokenization approaches that can contribute to a significant boost in PTMs' performance for the API sequence generation task.

Information Retrieval Language Modelling +2

On the Transferability of Pre-trained Language Models for Low-Resource Programming Languages

no code implementations • 5 Apr 2022 • Fuxiang Chen, Fatemeh Fard, David Lo, Timofey Bryksin

Furthermore, some programming languages are inherently different, and code written in one language usually cannot be interchanged with code in another, e.g., Ruby and Java code have very different structures.

Code Search Code Summarization

An Exploratory Study on Code Attention in BERT

no code implementations • 5 Apr 2022 • Rishab Sharma, Fuxiang Chen, Fatemeh Fard, David Lo

When identifiers' embeddings are used in CodeBERT, a code-based PLM, the performance is improved by 21-24% in the F1-score of clone detection.

Clone Detection Code Summarization

Code Smells in Machine Learning Systems

no code implementations • 2 Mar 2022 • Jiri Gesi, SiQi Liu, Jiawei Li, Iftekhar Ahmed, Nachiappan Nagappan, David Lo, Eduardo Santana de Almeida, Pavneet Singh Kochhar, Lingfeng Bao

We found that our newly identified code smells are prevalent and impactful on the maintenance of DL systems from the developer's perspective.

BIG-bench Machine Learning

FACOS: Finding API Relevant Contents on Stack Overflow with Semantic and Syntactic Analysis

no code implementations • 14 Nov 2021 • Kien Luong, Mohammad Hadi, Ferdian Thung, Fatemeh Fard, David Lo

Leveraging this observation, we develop FACOS, a context-specific algorithm to capture the semantic and syntactic information of the paragraphs and code snippets in a discussion.

Smart Contract Security: a Practitioners' Perspective

no code implementations • 22 Feb 2021 • Zhiyuan Wan, Xin Xia, David Lo, Jiachi Chen, Xiapu Luo, Xiaohu Yang

Given numerous research efforts in addressing the security issues of smart contracts, we wondered how software practitioners build security into smart contracts in practice.

Software Engineering

AndroEvolve: Automated Update for Android Deprecated-API Usages

1 code implementation • 14 Dec 2020 • Stefanus Agus Haryono, Ferdian Thung, David Lo, Lingxiao Jiang, Julia Lawall, Hong Jin Kang, Lucas Serrano, Gilles Muller

Usages of deprecated APIs in Android apps need to be updated to ensure the apps' compatibility with the old and new versions of Android OS.

Software Engineering

Characterization and Automatic Update of Deprecated Machine-Learning API Usages

no code implementations • 10 Nov 2020 • Stefanus Agus Haryono, Ferdian Thung, David Lo, Julia Lawall, Lingxiao Jiang

In this paper, we built a tool to automate these updates.

Software Engineering

What Makes a Popular Academic AI Repository?

1 code implementation • 6 Oct 2020 • Yuanrui Fan, Xin Xia, David Lo, Ahmed E. Hassan, Shanping Li

Hence, in this study, we perform an empirical study on academic AI repositories to highlight good software engineering practices of popular academic AI repositories for AI researchers.

Software Engineering

Emerging App Issue Identification via Online Joint Sentiment-Topic Tracing

no code implementations • 23 Aug 2020 • Cuiyun Gao, Jichuan Zeng, Zhiyuan Wen, David Lo, Xin Xia, Irwin King, Michael R. Lyu

Experiments on popular apps from Google Play and Apple's App Store demonstrate the effectiveness of MERIT in identifying emerging app issues, improving the state-of-the-art method by 22.3% in terms of F1-score.

Clustering

On the Replicability and Reproducibility of Deep Learning in Software Engineering

no code implementations • 25 Jun 2020 • Chao Liu, Cuiyun Gao, Xin Xia, David Lo, John Grundy, Xiaohu Yang

Experimental results show the importance of replicability and reproducibility, where the reported performance of a DL model could not be replicated because of an unstable optimization process.

Feature Engineering

CodeMatcher: Searching Code Based on Sequential Semantics of Important Query Words

no code implementations • 29 May 2020 • Chao Liu, Xin Xia, David Lo, Zhiwei Liu, Ahmed E. Hassan, Shanping Li

CodeMatcher first collects metadata for query words to identify irrelevant/noisy ones, then iteratively performs fuzzy search with important query words on the codebase that is indexed by the Elasticsearch tool, and finally reranks a set of returned candidate code according to how the tokens in the candidate code snippet sequentially matched the important words in a query.

Code Search Information Retrieval +1

Generating Question Titles for Stack Overflow from Mined Code Snippets

1 code implementation • 20 May 2020 • Zhipeng Gao, Xin Xia, John Grundy, David Lo, Yuan-Fang Li

Stack Overflow has been heavily used by software developers as a popular way to seek programming-related information from peers via the internet.

Software Engineering

Keen2Act: Activity Recommendation in Online Social Collaborative Platforms

no code implementations • 11 May 2020 • Roy Ka-Wei Lee, Thong Hoang, Richard J. Oentaryo, David Lo

The Act step then recommends to the user which activities to perform on the identified set of items.

Recommendation Systems

Automating App Review Response Generation

1 code implementation • 10 Feb 2020 • Cuiyun Gao, Jichuan Zeng, Xin Xia, David Lo, Michael R. Lyu, Irwin King

Previous studies showed that replying to a user review usually has a positive effect on the rating that is given by the user to the app.

Response Generation

Checking Smart Contracts with Structural Code Embedding

1 code implementation • 20 Jan 2020 • Zhipeng Gao, Lingxiao Jiang, Xin Xia, David Lo, John Grundy

However, many bugs and vulnerabilities have been identified in many contracts which raises serious concerns about smart contract security, not to mention that the blockchain systems on which the smart contracts are built can be buggy.

Software Engineering

Smart Contract Repair

1 code implementation • 12 Dec 2019 • Xiao Liang Yu, Omar Al-Bataineh, David Lo, Abhik Roychoudhury

Our approach can be used to optimise the overall security and reliability of smart contracts against malicious attackers.

Software Engineering Cryptography and Security 68N15 D.1.2

TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing

no code implementations • 27 Oct 2019 • Vinoj Jayasundara, Nghi Duy Quoc Bui, Lingxiao Jiang, David Lo

Program comprehension is a fundamental task in software development and maintenance processes.

Automatic Generation of Pull Request Descriptions

1 code implementation • 16 Sep 2019 • Zhongxin Liu, Xin Xia, Christoph Treude, David Lo, Shanping Li

We build a dataset with over 41K PRs and evaluate our approach on this dataset through ROUGE and a human evaluation.

Software Engineering

SmartEmbed: A Tool for Clone and Bug Detection in Smart Contracts through Structural Code Embedding

1 code implementation • 22 Aug 2019 • Zhipeng Gao, Vinoj Jayasundara, Lingxiao Jiang, Xin Xia, David Lo, John Grundy

In addition to its use by individual developers, SmartEmbed can also be applied to large-scale studies of smart contracts.

Software Engineering

Question Relatedness on Stack Overflow: The Task, Dataset, and Corpus-inspired Models

no code implementations • 3 May 2019 • Amirreza Shirani, Bowen Xu, David Lo, Thamar Solorio, Amin Alipour

The proposed dataset Stack Overflow is a useful resource to develop novel solutions, specifically data-hungry neural network models, for the prediction of relatedness in technical community question-answering forums.

Community Question Answering Multi-class Classification

PatchNet: A Tool for Deep Patch Classification

1 code implementation • 16 Feb 2019 • Thong Hoang, Julia Lawall, Richard J. Oentaryo, Yuan Tian, David Lo

This work proposes PatchNet, an automated tool based on hierarchical deep learning for classifying patches by extracting features from commit messages and code changes.

Classification General Classification

Network-Clustered Multi-Modal Bug Localization

no code implementations • 27 Feb 2018 • Thong Hoang, Richard J. Oentaryo, Tien-Duy B. Le, David Lo

To help the developers debug, numerous information retrieval (IR)-based and spectrum-based bug localization techniques have been devised.

Clustering Information Retrieval +1

WebAPIRec: Recommending Web APIs to Software Projects via Personalized Ranking

no code implementations • 1 May 2017 • Ferdian Thung, Richard J. Oentaryo, David Lo, Yuan Tian

In this light, we propose a new, automated approach called WebAPIRec that takes as input a project profile and outputs a ranked list of web APIs that can be used to implement the project.

Collective Semi-Supervised Learning for User Profiling in Social Media

no code implementations • 24 Jun 2016 • Richard J. Oentaryo, Ee-Peng Lim, Freddy Chong Tat Chua, Jia-Wei Low, David Lo

The abundance of user-generated data in social media has incentivized the development of methods to infer the latent attributes of users, which are crucially useful for personalization, advertising and recommendation.

Watch out for This Commit! A Study of Influential Software Changes

no code implementations • 10 Jun 2016 • Daoyuan Li, Li Li, Dongsun Kim, Tegawendé F. Bissyandé, David Lo, Yves Le Traon

One single code change can significantly influence a wide range of software systems and their users.

Software Engineering

Duplicate Bug Report Detection With a Combination of Information Retrieval and Topic Modeling

no code implementations • 27th IEEE/ACM International Conference on Automated Software Engineering 2013 • Anh Tuan Nguyen, Tung Thanh Nguyen, Tien N. Nguyen, David Lo, Chengnian Sun

Detecting duplicate bug reports helps reduce triaging efforts and save time for developers in fixing the same issues.

Descriptive Information Retrieval +1
