1 code implementation • 18 Apr 2024 • Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse Khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren
We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark.
1 code implementation • 13 Mar 2024 • Florian Tambon, Arghavan Moradi Dakhel, Amin Nikanjam, Foutse Khomh, Michel C. Desmarais, Giuliano Antoniol
The bug patterns are presented in the form of a taxonomy.
1 code implementation • 14 Feb 2024 • Vahid Majdinasab, Amin Nikanjam, Foutse Khomh
Therefore, auditing code developed using LLMs is challenging: because we do not have access to the training datasets of these models, it is difficult to reliably assert whether an LLM used during development has been trained on specific copyrighted code.
1 code implementation • 3 Oct 2023 • Pierre-Olivier Côté, Amin Nikanjam, Nafisa Ahmed, Dmytro Humeniuk, Foutse Khomh
First, it aims to summarize the latest approaches for data cleaning for ML and ML for data cleaning.
1 code implementation • 23 Aug 2023 • Ahmed Haj Yahmed, Altaf Allah Abbassi, Amin Nikanjam, Heng Li, Foutse Khomh
In this paper, we propose an empirical study on Stack Overflow (SO), the most popular Q&A forum for developers, to uncover and understand the challenges practitioners faced when deploying DRL systems.
1 code implementation • 26 Jul 2023 • Mohammad Mehdi Morovati, Amin Nikanjam, Florian Tambon, Foutse Khomh, Zhen Ming (Jack) Jiang
Based on our results, fixing ML bugs is more costly than fixing non-ML bugs, and ML components are more error-prone than non-ML components.
1 code implementation • 26 Jun 2023 • Pierre-Olivier Côté, Amin Nikanjam, Rached Bouchoucha, Ilan Basta, Mouna Abidi, Foutse Khomh
We validate the identified quality issues via a survey with ML practitioners.
no code implementations • 6 May 2023 • Zeynab Chitsazian, Saeed Sedighian Kashi, Amin Nikanjam
We then compared the output of the proposed methods with baseline methods based on performance monitoring of threshold-dependent and threshold-independent criteria using well-known performance measures in CD detection methods, such as accuracy, MDR, MTD, MTFA, and MTR.
1 code implementation • 13 Jan 2023 • Florian Tambon, Vahid Majdinasab, Amin Nikanjam, Foutse Khomh, Giuliano Antoniol
This allows us to compare different mutation killing definitions based on existing approaches, as well as to analyze the behavior of the obtained mutation operators and their potential combinations called Higher Order Mutation(s) (HOM).
1 code implementation • 25 Aug 2022 • Paulina Stevia Nouwou Mindom, Amin Nikanjam, Foutse Khomh
In this paper, we empirically investigate the applications of carefully selected DRL algorithms on two important software testing tasks: test case prioritization in the context of Continuous Integration (CI) and game testing.
1 code implementation • 18 Aug 2022 • Pierre-Olivier Côté, Amin Nikanjam, Rached Bouchoucha, Foutse Khomh
This empirical study aims to identify a catalog of bad practices related to poor quality in MLSSs.
1 code implementation • 30 Jun 2022 • Arghavan Moradi Dakhel, Vahid Majdinasab, Amin Nikanjam, Foutse Khomh, Michel C. Desmarais, Zhen Ming (Jack) Jiang
In this paper, we study the capabilities of Copilot in two different programming tasks: (i) generating (and reproducing) correct and efficient solutions for fundamental algorithmic problems, and (ii) comparing Copilot's proposed solutions with those of human programmers on a set of programming tasks.
1 code implementation • 28 Jun 2022 • Moses Openja, Amin Nikanjam, Ahmed Haj Yahmed, Foutse Khomh, Zhen Ming (Jack) Jiang
Usually DL models are developed and trained using DL frameworks that have their own internal mechanisms/formats to represent and train DL models, and usually those formats cannot be recognized by other frameworks.
no code implementations • 24 Jun 2022 • Mohammad Mehdi Morovati, Amin Nikanjam, Foutse Khomh, Zhen Ming (Jack) Jiang
Although most of these tools use bugs' lifecycle, there is no standard benchmark of bugs to assess their performance, compare them and discuss their advantages and weaknesses.
no code implementations • 24 Feb 2022 • Saeed Ghadiri, Amin Nikanjam
This information is represented as a probabilistic model and the effectiveness of these algorithms is dependent on the quality of these models.
1 code implementation • 26 Dec 2021 • Florian Tambon, Amin Nikanjam, Le An, Foutse Khomh, Giuliano Antoniol
This paper presents the first empirical study of Keras and TensorFlow silent bugs, and their impact on users' programs.
no code implementations • 8 Nov 2021 • Paulina Stevia Nouwou Mindom, Amin Nikanjam, Foutse Khomh, John Mullins
The increasing adoption of Reinforcement Learning in safety-critical systems domains such as autonomous vehicles, health, and aviation raises the need for ensuring their safety.
1 code implementation • 9 Sep 2021 • Emilio Rivera-Landos, Foutse Khomh, Amin Nikanjam
This study attempts to quantify the impact that the occurrence of bugs in a popular ML framework, PyTorch, has on the performance of trained models.
1 code implementation • 21 Aug 2021 • Nazanin Shajoonnezhad, Amin Nikanjam
Compared to continuous Bayesian networks, learning a discrete Bayesian network is a challenging problem due to the large parameter space.
1 code implementation • 26 Jul 2021 • Florian Tambon, Gabriel Laberge, Le An, Amin Nikanjam, Paulina Stevia Nouwou Mindom, Yann Pequignot, Foutse Khomh, Giulio Antoniol, Ettore Merlo, François Laviolette
Method: We conduct a Systematic Literature Review (SLR) of research papers published between 2015 and 2020, covering topics related to the certification of ML systems.
no code implementations • 20 Jul 2021 • Mahnoosh Mahdavimoghaddam, Amin Nikanjam, Monireh Abdoos
In the proposed communication framework, agents learn to cooperate effectively, and a new state calculation method considerably reduces the size of the state space.
1 code implementation • 1 Jan 2021 • Amin Nikanjam, Mohammad Mehdi Morovati, Foutse Khomh, Houssem Ben Braiek
To allow for the automatic detection of faults in DRL programs, we have defined a meta-model of DRL programs and developed DRLinter, a model-based fault detection approach that leverages static analysis and graph transformations.