Search Results for author: Tim Menzies

Found 47 papers, 25 papers with code

SMOOTHIE: A Theory of Hyper-parameter Optimization for Software Analytics

1 code implementation • 17 Jan 2024 • Rahul Yedida, Tim Menzies

We hence conclude that this theory (that hyper-parameter optimization is best viewed as a "smoothing" function for the decision landscape) is both theoretically interesting and practically very useful.

Mining Temporal Attack Patterns from Cyberthreat Intelligence Reports

no code implementations • 3 Jan 2024 • Md Rayhanur Rahman, Brandon Wroblewski, Quinn Matthews, Brantley Morgan, Tim Menzies, Laurie Williams

The goal of this paper is to aid security practitioners in prioritizing and proactively defending against cyberattacks by mining temporal attack patterns from cyberthreat intelligence reports.

Less, but Stronger: On the Value of Strong Heuristics in Semi-supervised Learning for Software Analytics

1 code implementation • 3 Feb 2023 • Huy Tu, Tim Menzies

Standard SSL algorithms use "weak" knowledge (i.e., not based on specific SE knowledge), such as co-training two learners and using good labels from one to train the other.

Decision Making

Don't Lie to Me: Avoiding Malicious Explanations with STEALTH

no code implementations • 25 Jan 2023 • Lauren Alvarez, Tim Menzies

STEALTH is a method for using some AI-generated model without suffering from malicious attacks (i.e., lying) or associated unfairness issues.

Clustering

Learning from Very Little Data: On the Value of Landscape Analysis for Predicting Software Project Health

1 code implementation • 16 Jan 2023 • Andre Lustosa, Tim Menzies

For example, for project health indicators such as $C$ = number of commits, $I$ = number of closed issues, and $R$ = number of closed pull requests, niSNEAK's 12-month prediction errors are {I=0%, R=33%, C=47%}. Based on the above, we recommend landscape analytics (e.g., niSNEAK), especially when learning from very small data sets.

Hyperparameter Optimization

A Tale of Two Cities: Data and Configuration Variances in Robust Deep Learning

no code implementations • 18 Nov 2022 • Guanqin Zhang, Jiankun Sun, Feng Xu, H. M. N. Dilum Bandara, Shiping Chen, Yulei Sui, Tim Menzies

Deep neural networks (DNNs) are widely used in many industries such as image recognition, supply chain, medical diagnosis, and autonomous driving.

Autonomous Driving Medical Diagnosis

When Less is More: On the Value of "Co-training" for Semi-Supervised Software Defect Predictors

2 code implementations • 10 Nov 2022 • Suvodeep Majumder, Joymallya Chakraborty, Tim Menzies

Hence, there are often limits on how much labeled data is available for training.

Open-Ended Question Answering

How to Find Actionable Static Analysis Warnings: A Case Study with FindBugs

1 code implementation • 21 May 2022 • Rahul Yedida, Hong Jin Kang, Huy Tu, Xueqi Yang, David Lo, Tim Menzies

Automatically generated static code warnings suffer from a large number of false alarms.

Dazzle: Using Optimized Generative Adversarial Networks to Address Security Data Class Imbalance Issue

no code implementations • 22 Mar 2022 • Rui Shu, Tianpei Xia, Laurie Williams, Tim Menzies

Conclusion: Based on this study, we suggest optimized GANs as an alternative method for addressing class imbalance in security vulnerability data.

Bayesian Optimization

DebtFree: Minimizing Labeling Cost in Self-Admitted Technical Debt Identification using Semi-Supervised Learning

no code implementations • 25 Jan 2022 • Huy Tu, Tim Menzies

The human experts are then required to read almost a quintuple of the SATD comments, which indicates the inefficiency of the tool.

Active Learning Pseudo Label

Fair Enough: Searching for Sufficient Measures of Fairness

1 code implementation • 25 Oct 2021 • Suvodeep Majumder, Joymallya Chakraborty, Gina R. Bai, Kathryn T. Stolee, Tim Menzies

In summary, to simplify the fairness testing problem, we recommend the following steps: (1) determine what type of fairness is desirable (and we offer a handful of such types); then (2) look up those types in our clusters; then (3) just test for one item per cluster.

Fairness

FairMask: Better Fairness via Model-based Rebalancing of Protected Attributes

no code implementations • 3 Oct 2021 • Kewen Peng, Joymallya Chakraborty, Tim Menzies

Our approach aims to offset the biased predictions of the classification model via rebalancing the distribution of protected attributes.

Fairness

An Expert System for Redesigning Software for Cloud Applications

no code implementations • 29 Sep 2021 • Rahul Yedida, Rahul Krishna, Anup Kalia, Tim Menzies, Jin Xiao, Maja Vukovic

When services are divided into many independent components, they are easier to update.

FRUGAL: Unlocking SSL for Software Analytics

1 code implementation • 22 Aug 2021 • Huy Tu, Tim Menzies

However, prior work has shown that such requirements can be expensive, taking several weeks to label thousands of commits, and not always available when traversing new research problems and domains.

FairBalance: How to Achieve Equalized Odds With Data Pre-processing

1 code implementation • 17 Jul 2021 • Zhe Yu, Joymallya Chakraborty, Tim Menzies

We found that equalizing the class distribution in each demographic group with sample weights is a necessary condition for achieving equalized odds without modifying the normal training process.

BIG-bench Machine Learning Fairness

Fairer Software Made Easier (using "Keys")

no code implementations • 11 Jul 2021 • Tim Menzies, Kewen Peng, Andre Lustosa

Can we simplify explanations for software analytics?

Bias in Machine Learning Software: Why? How? What to do?

2 code implementations • 25 May 2021 • Joymallya Chakraborty, Suvodeep Majumder, Tim Menzies

This paper postulates that the root causes of bias are the prior decisions that affect (a) what data was selected and (b) the labels assigned to those examples.

Attribute BIG-bench Machine Learning +1

Assessing the Early Bird Heuristic (for Predicting Project Quality)

2 code implementations • 24 May 2021 • N. C. Shrikanth, Tim Menzies

Moreover, using this early bird method, we have shown that a simple model (with just a few features) generalizes to hundreds of projects.

Old but Gold: Reconsidering the value of feedforward learners for software analytics

1 code implementation • 15 Jan 2021 • Rahul Yedida, Xueqi Yang, Tim Menzies

We test the hypothesis laid out by Galke and Scherp [18] that feedforward networks suffice for many analytics tasks (which we call the "Old but Gold" hypothesis) for these two tasks.

Vulnerability Detection

Early Life Cycle Software Defect Prediction. Why? How?

1 code implementation • 26 Nov 2020 • N. C. Shrikanth, Suvodeep Majumder, Tim Menzies

Hence, defect predictors learned from the first 150 commits and four months perform just as well as anything else.

Transfer Learning

Omni: Automated Ensemble with Unexpected Models against Adversarial Evasion Attack

no code implementations • 23 Nov 2020 • Rui Shu, Tianpei Xia, Laurie Williams, Tim Menzies

Conclusion: When employing ensemble defense against adversarial evasion attacks, we suggest creating an ensemble with unexpected models that are distant from the attacker's expected model (i.e., target model) through methods such as hyperparameter optimization.

BIG-bench Machine Learning Ensemble Learning +2

Empirical Standards for Software Engineering Research

1 code implementation • 7 Oct 2020 • Paul Ralph, Nauman bin Ali, Sebastian Baltes, Domenico Bianculli, Jessica Diaz, Yvonne Dittrich, Neil Ernst, Michael Felderer, Robert Feldt, Antonio Filieri, Breno Bernard Nicolau de França, Carlo Alberto Furia, Greg Gay, Nicolas Gold, Daniel Graziotin, Pinjia He, Rashina Hoda, Natalia Juristo, Barbara Kitchenham, Valentina Lenarduzzi, Jorge Martínez, Jorge Melegati, Daniel Mendez, Tim Menzies, Jefferson Molleri, Dietmar Pfahl, Romain Robbes, Daniel Russo, Nyyti Saarimäki, Federica Sarro, Janet Siegmund, Diomidis Spinellis, Miroslaw Staron, Klaas Stol, Margaret-Anne Storey, Davide Taibi, Damian Tamburri, Marco Torchiano, Christoph Treude, Burak Turhan, XiaoFeng Wang, Sira Vegas

Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g., a questionnaire survey).

Software Engineering General Literature

Revisiting Process versus Product Metrics: a Large Scale Analysis

no code implementations • 21 Aug 2020 • Suvodeep Majumder, Pranav Mody, Tim Menzies

We find that some analytics in-the-small conclusions still hold when scaling up to analytics in-the-large.

How Different is Test Case Prioritization for Open and Closed Source Projects?

no code implementations • 3 Aug 2020 • Xiao Ling, Rishabh Agrawal, Tim Menzies

Improved test case prioritization means that software developers can detect and fix more software faults sooner than usual.

Software Engineering

Identifying Self-Admitted Technical Debts with Jitterbug: A Two-step Approach

2 code implementations • 25 Feb 2020 • Zhe Yu, Fahmid Morshed Fahid, Huy Tu, Tim Menzies

Keeping track of and managing the self-admitted technical debts (SATDs) is important to maintaining a healthy software project.

Software Engineering

Simpler Hyperparameter Optimization for Software Analytics: Why, How, When?

no code implementations • 9 Dec 2019 • Amritanshu Agrawal, Xueqi Yang, Rishabh Agrawal, Rahul Yedida, Xipeng Shen, Tim Menzies

How can we make software analytics simpler and faster?

Hyperparameter Optimization

Methods for Stabilizing Models across Large Samples of Projects (with case studies on Predicting Defect and Project Health)

1 code implementation • 6 Nov 2019 • Suvodeep Majumder, Tianpei Xia, Rahul Krishna, Tim Menzies

To the best of our knowledge, STABILIZER is an order of magnitude faster than the prior state-of-the-art transfer learners that seek to find conclusion stability, and these case studies are the largest demonstration of the generalizability of quantitative predictions of project quality yet reported in the SE literature.

Transfer Learning

How to Better Distinguish Security Bug Reports (using Dual Hyperparameter Optimization)

no code implementations • 4 Nov 2019 • Rui Shu, Tianpei Xia, Jianfeng Chen, Laurie Williams, Tim Menzies

For example, in a study of security bug reports from the Chromium dataset, the median recalls of FARSEC and Swift were 15.7% and 77.4%, respectively.

Software Engineering

Whence to Learn? Transferring Knowledge in Configurable Systems using BEETLE

2 code implementations • 1 Nov 2019 • Rahul Krishna, Vivek Nair, Pooyan Jamshidi, Tim Menzies

To resolve these problems, we propose a novel transfer learning framework called BEETLE, which is a "bellwether"-based transfer learner that focuses on identifying and learning from the most relevant source from amongst the old data.

Software Engineering

Better Technical Debt Detection via SURVEYing

1 code implementation • 20 May 2019 • Fahmid M. Fahid, Zhe Yu, Tim Menzies

Specifically, for ten open-source Java projects, we can find 83% of the technical debt via SURVEY0 using just 16% of the comments (and if higher levels of recall are required, SURVEY0 can adjust towards that with some additional effort).

TERMINATOR: Better Automated UI Test Case Prioritization

2 code implementations • 16 May 2019 • Zhe Yu, Fahmid M. Fahid, Tim Menzies, Gregg Rothermel, Kyle Patrick, Snehit Cherian

Given that much of the automated UI testing is "black box" in nature, very little information (only the test case descriptions and testing results) can be utilized to prioritize these automated UI test cases.

Software Engineering for Fairness: A Case Study with Hyperparameter Optimization

no code implementations • 14 May 2019 • Joymallya Chakraborty, Tianpei Xia, Fahmid M. Fahid, Tim Menzies

To the best of our knowledge, this is the first application of hyperparameter optimization as a tool for software engineers to generate fairer software.

BIG-bench Machine Learning Fairness +1

How to "DODGE" Complex Software Analytics?

no code implementations • 5 Feb 2019 • Amritanshu Agrawal, Wei Fu, Di Chen, Xipeng Shen, Tim Menzies

Machine learning techniques applied to software engineering tasks can be improved by hyperparameter optimization, i.e., automatic tools that find good settings for a learner's control parameters.

BIG-bench Machine Learning Hyperparameter Optimization

Hyperparameter Optimization for Effort Estimation

1 code implementation • 28 Apr 2018 • Tianpei Xia, Rahul Krishna, Jianfeng Chen, George Mathew, Xipeng Shen, Tim Menzies

We test OIL on a wide range of hyperparameter optimizers using data from 945 software projects.

Software Engineering

Transfer Learning with Bellwethers to find Good Configurations

3 code implementations • 11 Mar 2018 • Vivek Nair, Rahul Krishna, Tim Menzies, Pooyan Jamshidi

Using this insight, this paper proposes BEETLE, a novel bellwether based transfer learning scheme, which can identify a suitable source and use it to find near-optimal configurations of a software system.

Software Engineering

500+ Times Faster Than Deep Learning (A Case Study Exploring Faster Methods for Text Mining StackOverflow)

no code implementations • 14 Feb 2018 • Suvodeep Majumder, Nikhila Balaji, Katie Brey, Wei Fu, Tim Menzies

Deep learners utilize extensive computational power and can take a long time to train, making it difficult to widely validate, repeat, and improve their results.

Clustering

Finding Faster Configurations using FLASH

1 code implementation • 7 Jan 2018 • Vivek Nair, Zhe Yu, Tim Menzies, Norbert Siegmund, Sven Apel

FLASH scales up to software systems that defeat the prior state-of-the-art model-based methods in this area.

Software Engineering

RIOT: a Stochastic-based Method for Workflow Scheduling in the Cloud

no code implementations • 27 Aug 2017 • Jianfeng Chen, Tim Menzies

Cloud computing provides engineers or scientists a place to run complex computing tasks.

Cloud Computing Scheduling

Learning Effective Changes For Software Projects

1 code implementation • 17 Aug 2017 • Rahul Krishna, Tim Menzies

The current generation of software analytics tools are mostly prediction algorithms (e.g., support vector machines, naive Bayes, logistic regression, etc.).

Software Engineering

Easy over Hard: A Case Study on Deep Learning

1 code implementation • 1 Mar 2017 • Wei Fu, Tim Menzies

While deep learning is an exciting new technique, the benefits of this method need to be assessed with respect to its computational cost.

Revisiting Unsupervised Learning for Defect Prediction

1 code implementation • 1 Mar 2017 • Wei Fu, Tim Menzies

(1) There is much variability in the efficacy of the Yang et al. predictors so even with their approach, some supervised data is required to prune weaker predictors away.

Beyond Evolutionary Algorithms for Search-based Software Engineering

no code implementations • 27 Jan 2017 • Jianfeng Chen, Vivek Nair, Tim Menzies

Context: Evolutionary algorithms typically require a large number of evaluations (of solutions) to converge - which can be very slow and expensive to evaluate. Objective: To solve search-based software engineering (SE) problems, using fewer evaluations than evolutionary methods. Method: Instead of mutating a small population, we build a very large initial population which is then culled using a recursive bi-clustering chop approach.

Clustering Evolutionary Algorithms

Faster Discovery of Faster System Configurations with Spectral Learning

no code implementations • 27 Jan 2017 • Vivek Nair, Tim Menzies, Norbert Siegmund, Sven Apel

Despite the huge spread and economic importance of configurable software systems, there is unsatisfactory support for utilizing the full potential of these systems with respect to finding performance-optimal configurations.

Dimensionality Reduction

Finding Better Active Learners for Faster Literature Reviews

1 code implementation • 10 Dec 2016 • Zhe Yu, Nicholas A. Kraft, Tim Menzies

Literature reviews can be time-consuming and tedious to complete.

Active Learning

Why is Differential Evolution Better than Grid Search for Tuning Defect Predictors?

no code implementations • 8 Sep 2016 • Wei Fu, Vivek Nair, Tim Menzies

In software analytics, at least for defect prediction, several methods, like grid search and differential evolution (DE), have been proposed to learn these parameters, and these methods have been shown to improve the performance scores of learners.

A deep learning model for estimating story points

no code implementations • 2 Sep 2016 • Morakot Choetkiertikul, Hoa Khanh Dam, Truyen Tran, Trang Pham, Aditya Ghose, Tim Menzies

Although there has been substantial research in software analytics for effort estimation in traditional software projects, little work has been done for estimation in agile projects, especially estimating user stories or issues.

Feature Engineering

What is Wrong with Topic Modeling? (and How to Fix it Using Search-based Software Engineering)

no code implementations • 29 Aug 2016 • Amritanshu Agrawal, Wei Fu, Tim Menzies

When run on different datasets, LDA suffers from "order effects", i.e., different topics are generated if the order of training data is shuffled.

General Classification
