Search Results for author: Tim Menzies

Found 47 papers, 25 papers with code

SMOOTHIE: A Theory of Hyper-parameter Optimization for Software Analytics

1 code implementation 17 Jan 2024 Rahul Yedida, Tim Menzies

We hence conclude that this theory (that hyper-parameter optimization is best viewed as a "smoothing" function for the decision landscape) is both theoretically interesting and practically very useful.

Mining Temporal Attack Patterns from Cyberthreat Intelligence Reports

no code implementations 3 Jan 2024 Md Rayhanur Rahman, Brandon Wroblewski, Quinn Matthews, Brantley Morgan, Tim Menzies, Laurie Williams

The goal of this paper is to aid security practitioners in prioritizing and proactively defending against cyberattacks by mining temporal attack patterns from cyberthreat intelligence reports.

Less, but Stronger: On the Value of Strong Heuristics in Semi-supervised Learning for Software Analytics

1 code implementation 3 Feb 2023 Huy Tu, Tim Menzies

Standard SSL algorithms use "weak" knowledge (i.e., knowledge not based on specific SE knowledge), e.g., co-training two learners and using good labels from one to train the other.

Decision Making

Don't Lie to Me: Avoiding Malicious Explanations with STEALTH

no code implementations 25 Jan 2023 Lauren Alvarez, Tim Menzies

STEALTH is a method for using some AI-generated model without suffering from malicious attacks (i.e., lying) or associated unfairness issues.

Clustering

Learning from Very Little Data: On the Value of Landscape Analysis for Predicting Software Project Health

1 code implementation 16 Jan 2023 Andre Lustosa, Tim Menzies

For example, for project health indicators such as $C$ = number of commits, $I$ = number of closed issues, and $R$ = number of closed pull requests, niSNEAK's 12-month prediction errors are $\{I=0\%, R=33\%, C=47\%\}$. Based on the above, we recommend landscape analytics (e.g., niSNEAK), especially when learning from very small data sets.

Hyperparameter Optimization

A Tale of Two Cities: Data and Configuration Variances in Robust Deep Learning

no code implementations 18 Nov 2022 Guanqin Zhang, Jiankun Sun, Feng Xu, H. M. N. Dilum Bandara, Shiping Chen, Yulei Sui, Tim Menzies

Deep neural networks (DNNs) are widely used in many industries such as image recognition, supply chain, medical diagnosis, and autonomous driving.

Autonomous Driving, Medical Diagnosis

How to Find Actionable Static Analysis Warnings: A Case Study with FindBugs

1 code implementation 21 May 2022 Rahul Yedida, Hong Jin Kang, Huy Tu, Xueqi Yang, David Lo, Tim Menzies

Automatically generated static code warnings suffer from a large number of false alarms.

Dazzle: Using Optimized Generative Adversarial Networks to Address Security Data Class Imbalance Issue

no code implementations 22 Mar 2022 Rui Shu, Tianpei Xia, Laurie Williams, Tim Menzies

Conclusion: Based on this study, we would suggest the use of optimized GANs as an alternative method for addressing class imbalance issues in security vulnerability data.

Bayesian Optimization

DebtFree: Minimizing Labeling Cost in Self-Admitted Technical Debt Identification using Semi-Supervised Learning

no code implementations 25 Jan 2022 Huy Tu, Tim Menzies

The human experts are then required to read almost a quintuple of the SATD comments which indicates the inefficiency of the tool.

Active Learning, Pseudo Label

Fair Enough: Searching for Sufficient Measures of Fairness

1 code implementation 25 Oct 2021 Suvodeep Majumder, Joymallya Chakraborty, Gina R. Bai, Kathryn T. Stolee, Tim Menzies

In summary, to simplify the fairness testing problem, we recommend the following steps: (1) determine what type of fairness is desirable (and we offer a handful of such types); then (2) look up those types in our clusters; then (3) just test for one item per cluster.

Fairness

FairMask: Better Fairness via Model-based Rebalancing of Protected Attributes

no code implementations 3 Oct 2021 Kewen Peng, Joymallya Chakraborty, Tim Menzies

Our approach aims to offset the biased predictions of the classification model via rebalancing the distribution of protected attributes.

Fairness

An Expert System for Redesigning Software for Cloud Applications

no code implementations 29 Sep 2021 Rahul Yedida, Rahul Krishna, Anup Kalia, Tim Menzies, Jin Xiao, Maja Vukovic

When services are divided into many independent components, they are easier to update.

FRUGAL: Unlocking SSL for Software Analytics

1 code implementation 22 Aug 2021 Huy Tu, Tim Menzies

However, prior work has shown that such requirements can be expensive, taking several weeks to label thousands of commits, and not always available when traversing new research problems and domains.

FairBalance: How to Achieve Equalized Odds With Data Pre-processing

1 code implementation 17 Jul 2021 Zhe Yu, Joymallya Chakraborty, Tim Menzies

We found that equalizing the class distribution in each demographic group with sample weights is a necessary condition for achieving equalized odds without modifying the normal training process.

BIG-bench Machine Learning, Fairness
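
The FairBalance snippet above describes equalizing the class distribution within each demographic group through sample weights. A minimal sketch of that idea follows. The function name `balanced_group_weights` and the exact weighting formula are assumptions made for illustration, not the paper's published implementation: each sample gets the weight (group size) / (number of classes in the group × size of its group-and-class cell), so every class carries the same total weight inside each group.

```python
from collections import Counter


def balanced_group_weights(groups, labels):
    """Return one weight per sample so that, inside every group,
    each class present in that group has the same total weight.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    cell_counts = Counter(zip(groups, labels))   # samples per (group, class)
    group_counts = Counter(groups)               # samples per group

    # Number of distinct classes observed in each group.
    classes_in_group = Counter(g for g, _ in cell_counts)

    weights = []
    for g, y in zip(groups, labels):
        k = classes_in_group[g]
        # Total weight of a cell becomes group_counts[g] / k for every class.
        weights.append(group_counts[g] / (k * cell_counts[(g, y)]))
    return weights


if __name__ == "__main__":
    groups = ["a", "a", "a", "a", "b", "b"]
    labels = [1, 1, 1, 0, 1, 0]
    for g, y, w in zip(groups, labels, balanced_group_weights(groups, labels)):
        print(g, y, round(w, 3))
```

Weights of this kind can typically be passed to a learner that accepts per-sample weights during training, which leaves the training procedure itself unchanged.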

Fairer Software Made Easier (using "Keys")

no code implementations 11 Jul 2021 Tim Menzies, Kewen Peng, Andre Lustosa

Can we simplify explanations for software analytics?

Bias in Machine Learning Software: Why? How? What to do?

2 code implementations 25 May 2021 Joymallya Chakraborty, Suvodeep Majumder, Tim Menzies

This paper postulates that the root causes of bias are the prior decisions that affect (a) what data was selected and (b) the labels assigned to those examples.

Attribute, BIG-bench Machine Learning +1

Assessing the Early Bird Heuristic (for Predicting Project Quality)

2 code implementations 24 May 2021 N. C. Shrikanth, Tim Menzies

Moreover, using this early bird method, we have shown that a simple model (with just a few features) generalizes to hundreds of projects.

Old but Gold: Reconsidering the value of feedforward learners for software analytics

1 code implementation 15 Jan 2021 Rahul Yedida, Xueqi Yang, Tim Menzies

We test the hypothesis laid out by Galke and Scherp [18] that feedforward networks suffice for many analytics tasks (which we call the "Old but Gold" hypothesis) for these two tasks.

Vulnerability Detection

Early Life Cycle Software Defect Prediction. Why? How?

1 code implementation 26 Nov 2020 N. C. Shrikanth, Suvodeep Majumder, Tim Menzies

Hence, defect predictors learned from the first 150 commits and four months perform just as well as anything else.

Transfer Learning

Omni: Automated Ensemble with Unexpected Models against Adversarial Evasion Attack

no code implementations 23 Nov 2020 Rui Shu, Tianpei Xia, Laurie Williams, Tim Menzies

Conclusion: When employing ensemble defense against adversarial evasion attacks, we suggest creating an ensemble with unexpected models that are distant from the attacker's expected model (i.e., target model) through methods such as hyperparameter optimization.

BIG-bench Machine Learning, Ensemble Learning +2

Revisiting Process versus Product Metrics: a Large Scale Analysis

no code implementations 21 Aug 2020 Suvodeep Majumder, Pranav Mody, Tim Menzies

We find that some analytics in-the-small conclusions still hold when scaling up to analytics in-the-large.

How Different is Test Case Prioritization for Open and Closed Source Projects?

no code implementations 3 Aug 2020 Xiao Ling, Rishabh Agrawal, Tim Menzies

Improved test case prioritization means that software developers can detect and fix more software faults sooner than usual.

Software Engineering

Identifying Self-Admitted Technical Debts with Jitterbug: A Two-step Approach

2 code implementations 25 Feb 2020 Zhe Yu, Fahmid Morshed Fahid, Huy Tu, Tim Menzies

Keeping track of and managing the self-admitted technical debts (SATDs) is important to maintaining a healthy software project.

Software Engineering

Methods for Stabilizing Models across Large Samples of Projects (with case studies on Predicting Defect and Project Health)

1 code implementation 6 Nov 2019 Suvodeep Majumder, Tianpei Xia, Rahul Krishna, Tim Menzies

To the best of our knowledge, STABILIZER is an order of magnitude faster than the prior state-of-the-art transfer learners which seek to find conclusion stability, and these case studies are the largest demonstration of the generalizability of quantitative predictions of project quality yet reported in the SE literature.

Transfer Learning

How to Better Distinguish Security Bug Reports (using Dual Hyperparameter Optimization)

no code implementations 4 Nov 2019 Rui Shu, Tianpei Xia, Jianfeng Chen, Laurie Williams, Tim Menzies

For example, in a study of security bug reports from the Chromium dataset, the median recalls of FARSEC and Swift were 15.7% and 77.4%, respectively.

Software Engineering

Whence to Learn? Transferring Knowledge in Configurable Systems using BEETLE

2 code implementations 1 Nov 2019 Rahul Krishna, Vivek Nair, Pooyan Jamshidi, Tim Menzies

To resolve these problems, we propose a novel transfer learning framework called BEETLE, which is a "bellwether"-based transfer learner that focuses on identifying and learning from the most relevant source from amongst the old data.

Software Engineering

Better Technical Debt Detection via SURVEYing

1 code implementation 20 May 2019 Fahmid M. Fahid, Zhe Yu, Tim Menzies

Specifically, for ten open-source Java projects, we can find 83% of the technical debt via SURVEY0 using just 16% of the comments (and if higher levels of recall are required, SURVEY0 can adjust towards that with some additional effort).

TERMINATOR: Better Automated UI Test Case Prioritization

2 code implementations 16 May 2019 Zhe Yu, Fahmid M. Fahid, Tim Menzies, Gregg Rothermel, Kyle Patrick, Snehit Cherian

Given that much of the automated UI testing is "black box" in nature, very little information (only the test case descriptions and testing results) can be utilized to prioritize these automated UI test cases.

Software Engineering for Fairness: A Case Study with Hyperparameter Optimization

no code implementations 14 May 2019 Joymallya Chakraborty, Tianpei Xia, Fahmid M. Fahid, Tim Menzies

To the best of our knowledge, this is the first application of hyperparameter optimization as a tool for software engineers to generate fairer software.

BIG-bench Machine Learning, Fairness +1

How to "DODGE" Complex Software Analytics?

no code implementations 5 Feb 2019 Amritanshu Agrawal, Wei Fu, Di Chen, Xipeng Shen, Tim Menzies

Machine learning techniques applied to software engineering tasks can be improved by hyperparameter optimization, i.e., automatic tools that find good settings for a learner's control parameters.

BIG-bench Machine Learning, Hyperparameter Optimization

Hyperparameter Optimization for Effort Estimation

1 code implementation 28 Apr 2018 Tianpei Xia, Rahul Krishna, Jianfeng Chen, George Mathew, Xipeng Shen, Tim Menzies

We test OIL on a wide range of hyperparameter optimizers using data from 945 software projects.

Software Engineering

Transfer Learning with Bellwethers to find Good Configurations

3 code implementations 11 Mar 2018 Vivek Nair, Rahul Krishna, Tim Menzies, Pooyan Jamshidi

Using this insight, this paper proposes BEETLE, a novel bellwether based transfer learning scheme, which can identify a suitable source and use it to find near-optimal configurations of a software system.

Software Engineering

500+ Times Faster Than Deep Learning (A Case Study Exploring Faster Methods for Text Mining StackOverflow)

no code implementations 14 Feb 2018 Suvodeep Majumder, Nikhila Balaji, Katie Brey, Wei Fu, Tim Menzies

Deep learners utilize extensive computational power and can take a long time to train, making it difficult to widely validate, repeat, and improve their results.

Clustering

Finding Faster Configurations using FLASH

1 code implementation 7 Jan 2018 Vivek Nair, Zhe Yu, Tim Menzies, Norbert Siegmund, Sven Apel

FLASH scales up to software systems that defeat the prior state of the art model-based methods in this area.

Software Engineering

RIOT: a Stochastic-based Method for Workflow Scheduling in the Cloud

no code implementations 27 Aug 2017 Jianfeng Chen, Tim Menzies

Cloud computing provides engineers or scientists a place to run complex computing tasks.

Cloud Computing, Scheduling

Learning Effective Changes For Software Projects

1 code implementation 17 Aug 2017 Rahul Krishna, Tim Menzies

The current generation of software analytics tools is mostly prediction algorithms (e.g., support vector machines, naive Bayes, logistic regression, etc.).

Software Engineering

Easy over Hard: A Case Study on Deep Learning

1 code implementation 1 Mar 2017 Wei Fu, Tim Menzies

While deep learning is an exciting new technique, the benefits of this method need to be assessed with respect to its computational cost.

Revisiting Unsupervised Learning for Defect Prediction

1 code implementation 1 Mar 2017 Wei Fu, Tim Menzies

(1) There is much variability in the efficacy of the Yang et al. predictors so even with their approach, some supervised data is required to prune weaker predictors away.

Beyond Evolutionary Algorithms for Search-based Software Engineering

no code implementations 27 Jan 2017 Jianfeng Chen, Vivek Nair, Tim Menzies

Context: Evolutionary algorithms typically require a large number of evaluations (of solutions) to converge, which can be very slow and expensive. Objective: To solve search-based software engineering (SE) problems using fewer evaluations than evolutionary methods. Method: Instead of mutating a small population, we build a very large initial population, which is then culled using a recursive bi-clustering chop approach.

Clustering, Evolutionary Algorithms
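
The "Method" sentence in the snippet above (build a very large population, then cull it with a recursive bi-clustering chop) can be sketched as follows. This is an illustrative reading of that one sentence, not the authors' algorithm: the function names `distance` and `chop`, the Euclidean distance, the two-pole split heuristic, and the rule "keep the half nearer the better pole" are all assumptions. The point it tries to show is that only two candidates are evaluated per level of recursion, so the number of evaluations grows with the logarithm of the population size.

```python
import random


def distance(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def chop(population, score, min_size=8, rng=None):
    """Recursively halve `population`, keeping the half nearer the better pole.

    `score` maps a candidate to a number, lower is better (an assumption).
    Only the two poles are scored at each level of the recursion.
    """
    rng = rng or random.Random(0)
    if len(population) <= max(min_size, 1):
        return population

    # Pick two distant "poles": the point farthest from a random anchor,
    # then the point farthest from that one.
    anchor = rng.choice(population)
    east = max(population, key=lambda p: distance(p, anchor))
    west = max(population, key=lambda p: distance(p, east))

    # Order candidates from "nearest east" to "nearest west" and split in two.
    ordered = sorted(population, key=lambda p: distance(p, east) - distance(p, west))
    half = len(ordered) // 2
    east_half, west_half = ordered[:half], ordered[half:]

    # Evaluate only the poles and keep the half belonging to the better one.
    keep = east_half if score(east) <= score(west) else west_half
    return chop(keep, score, min_size, rng)


if __name__ == "__main__":
    rng = random.Random(1)
    population = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(1024)]
    survivors = chop(population, lambda p: p[0] ** 2 + p[1] ** 2)
    print(len(survivors), "candidates survive")
```

With 1,024 candidates and `min_size=8`, this sketch performs seven splits and therefore scores 14 candidates in total, compared with the 1,024 evaluations needed to score every member of the initial population once.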

Faster Discovery of Faster System Configurations with Spectral Learning

no code implementations 27 Jan 2017 Vivek Nair, Tim Menzies, Norbert Siegmund, Sven Apel

Despite the huge spread and economic importance of configurable software systems, there is unsatisfactory support for utilizing the full potential of these systems with respect to finding performance-optimal configurations.

Dimensionality Reduction

Why is Differential Evolution Better than Grid Search for Tuning Defect Predictors?

no code implementations 8 Sep 2016 Wei Fu, Vivek Nair, Tim Menzies

In software analytics, at least for defect prediction, several methods, like grid search and differential evolution (DE), have been proposed to learn these parameters, and these methods have been shown to improve the performance scores of learners.

A deep learning model for estimating story points

no code implementations 2 Sep 2016 Morakot Choetkiertikul, Hoa Khanh Dam, Truyen Tran, Trang Pham, Aditya Ghose, Tim Menzies

Although there has been substantial research in software analytics for effort estimation in traditional software projects, little work has been done for estimation in agile projects, especially estimating user stories or issues.

Feature Engineering

What is Wrong with Topic Modeling? (and How to Fix it Using Search-based Software Engineering)

no code implementations 29 Aug 2016 Amritanshu Agrawal, Wei Fu, Tim Menzies

When run on different datasets, LDA suffers from "order effects", i.e., different topics are generated if the order of training data is shuffled.

General Classification
