no code implementations • 29 Nov 2023 • Martin Wistuba, Prabhu Teja Sivaprasad, Lukas Balles, Giovanni Zappella
Recent work using pretrained transformers has shown impressive performance when fine-tuned with data from the downstream problem of interest.
1 code implementation • 24 Apr 2023 • Martin Wistuba, Martin Ferianc, Lukas Balles, Cedric Archambeau, Giovanni Zappella
We discuss requirements for the use of continual learning algorithms in practice, from which we derive design principles for Renate.
no code implementations • 21 Feb 2023 • Tristan Cinquin, Tammo Rukat, Philipp Schmidt, Martin Wistuba, Artur Bekasov
Variational inference is often used to implement Bayesian neural networks, but is difficult to apply to GBMs, because the decision trees used as weak learners are non-differentiable.
2 code implementations • 14 Jul 2022 • Ondrej Bohdal, Lukas Balles, Martin Wistuba, Beyza Ermis, Cédric Archambeau, Giovanni Zappella
Hyperparameter optimization (HPO) and neural architecture search (NAS) are methods of choice to obtain the best-in-class machine learning models, but in practice they can be costly to run.
no code implementations • 28 Jun 2022 • Beyza Ermis, Giovanni Zappella, Martin Wistuba, Aditya Rawal, Cedric Archambeau
This phenomenon is known as catastrophic forgetting, and it is often difficult to prevent due to practical constraints, such as the amount of data that can be stored or the limited computational resources that can be used.
no code implementations • 9 Mar 2022 • Beyza Ermis, Giovanni Zappella, Martin Wistuba, Aditya Rawal, Cedric Archambeau
Moreover, applications increasingly rely on large pre-trained neural networks, such as pre-trained Transformers, since practitioners often lack the resources or data, in sufficiently large quantities, to train such models from scratch.
1 code implementation • 20 Feb 2022 • Martin Wistuba, Arlind Kadra, Josif Grabocka
Multi-fidelity (gray-box) hyperparameter optimization (HPO) techniques have recently emerged as a promising direction for tuning Deep Learning methods.
1 code implementation • 11 Jun 2021 • Sebastian Pineda Arango, Hadi S. Jomaa, Martin Wistuba, Josif Grabocka
Hyperparameter optimization (HPO) is a core problem for the machine learning community and remains largely unsolved due to the significant computational resources required to evaluate hyperparameter configurations.
no code implementations • ICML Workshop AutoML 2021 • Akihiro Kishimoto, Djallel Bouneffouf, Radu Marinescu, Parikshit Ram, Ambrish Rawat, Martin Wistuba, Paulito Pedregosa Palmes, Adi Botea
Optimizing a machine learning (ML) pipeline has been an important topic in AI and ML.
no code implementations • 22 Jan 2021 • Hadjer Benmeziane, Kaoutar El Maghraoui, Hamza Ouarnoughi, Smail Niar, Martin Wistuba, Naigang Wang
Arguably their most significant impact has been in image classification and object detection tasks, where state-of-the-art results have been obtained.
1 code implementation • ICLR 2021 • Martin Wistuba, Josif Grabocka
Hyperparameter optimization (HPO) is a central pillar in the automation of machine learning solutions and is mainly performed via Bayesian optimization, where a parametric surrogate is learned to approximate the black-box response function (e.g., validation error).
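The loop described above can be sketched generically: fit a surrogate to the configurations evaluated so far, then pick the next configuration by maximizing an acquisition function. This is a minimal illustration with a fixed-hyperparameter GP surrogate and expected improvement on an invented 1-D toy response, not the learned parametric surrogate proposed in the paper.

```python
import numpy as np
from scipy.stats import norm

def rbf(a, b, ls=0.3):
    # Squared-exponential kernel between two sets of 1-D points.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(X, y, Xs, jitter=1e-6):
    # Standard GP posterior mean and standard deviation at test points Xs.
    K = rbf(X, X) + jitter * np.eye(len(X))
    Ks = rbf(X, Xs)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = 1.0 - np.sum(Ks * (Kinv @ Ks), axis=0)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # EI for minimization: expected amount by which we beat the incumbent.
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def response(x):
    # Toy stand-in for a black-box validation-error curve (invented).
    return (x - 0.65) ** 2 + 0.05 * np.sin(8 * x)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, 3)              # initial random configurations
y = response(X)
grid = np.linspace(0, 1, 200)         # candidate configurations
for _ in range(10):                   # BO loop: fit surrogate, maximize EI
    mu, sigma = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.min()))]
    X, y = np.append(X, x_next), np.append(y, response(x_next))
```

After the loop, `y.min()` is the best validation error found; the paper's contribution lies in replacing the hand-specified GP above with a surrogate whose parameters are learned across tasks.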
no code implementations • ICML 2020 • Martin Wistuba, Tejaswini Pedapati
Many automated machine learning methods, such as those for hyperparameter and neural architecture optimization, are computationally expensive because they involve training many different model configurations.
no code implementations • 22 Oct 2019 • Charu Aggarwal, Djallel Bouneffouf, Horst Samulowitz, Beat Buesser, Thanh Hoang, Udayan Khurana, Sijia Liu, Tejaswini Pedapati, Parikshit Ram, Ambrish Rawat, Martin Wistuba, Alexander Gray
Data science is labor-intensive and human experts are scarce but heavily involved in every aspect of it.
no code implementations • 18 Jul 2019 • Martin Wistuba
In experiments on CIFAR-10 and CIFAR-100, we observe a reduction in the search time from 200 to only 6 GPU days, a speed-up by a factor of 33.
no code implementations • 4 May 2019 • Martin Wistuba, Ambrish Rawat, Tejaswini Pedapati
The growing interest in both the automation of machine learning and deep learning has inevitably led to the development of a wide variety of automated methods for neural architecture search.
no code implementations • 8 Mar 2019 • Martin Wistuba, Tejaswini Pedapati
First, we propose a novel neural architecture selection method which employs this knowledge to identify strong and weak characteristics of neural architectures across datasets.
no code implementations • 17 Jan 2019 • Atin Sood, Benjamin Elder, Benjamin Herta, Chao Xue, Costas Bekas, A. Cristiano I. Malossi, Debashish Saha, Florian Scheidegger, Ganesh Venkataraman, Gegi Thomas, Giovanni Mariani, Hendrik Strobelt, Horst Samulowitz, Martin Wistuba, Matteo Manica, Mihir Choudhury, Rong Yan, Roxana Istrate, Ruchir Puri, Tejaswini Pedapati
The application of neural networks to a vast variety of practical problems is transforming the way AI is applied in practice.
5 code implementations • 3 Jul 2018 • Maria-Irina Nicolae, Mathieu Sinn, Minh Ngoc Tran, Beat Buesser, Ambrish Rawat, Martin Wistuba, Valentina Zantedeschi, Nathalie Baracaldo, Bryant Chen, Heiko Ludwig, Ian M. Molloy, Ben Edwards
Defending Machine Learning models involves certifying and verifying model robustness and model hardening with approaches such as pre-processing inputs, augmenting training data with adversarial samples, and leveraging runtime detection methods to flag any inputs that might have been modified by an adversary.
1 code implementation • 15 Jun 2018 • Tran Ngoc Minh, Mathieu Sinn, Hoang Thanh Lam, Martin Wistuba
Data preparation, i.e., the process of transforming raw data into a format that can be used for training effective machine learning models, is a tedious and time-consuming task.
no code implementations • 7 Jun 2018 • Martin Wistuba, Ambrish Rawat
We introduce a new Bayesian multi-class support vector machine by formulating a pseudo-likelihood for a multi-class hinge loss in the form of a location-scale mixture of Gaussians.
no code implementations • 16 Jan 2018 • Hoang Thanh Lam, Tran Ngoc Minh, Mathieu Sinn, Beat Buesser, Martin Wistuba
To the best of our knowledge, this is the first time an automated data science system has won medals in Kaggle competitions involving complex relational databases.
no code implementations • 20 Dec 2017 • Martin Wistuba
We adapt the UCT algorithm to the needs of network architecture search by proposing two ways of sharing information between different branches of the search tree.
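For context on what UCT contributes here: at each node it selects the child maximizing the UCB1 score, i.e., average reward plus an exploration bonus. The following is a generic UCT selection sketch, not the paper's tree-sharing variant; the dictionary-based node representation is invented for illustration.

```python
import math

def uct_select(children, c=1.4):
    """Pick the child maximizing the UCB1 score used by UCT:
    average reward plus an exploration bonus that shrinks with visits."""
    total = sum(ch["visits"] for ch in children)
    def score(ch):
        if ch["visits"] == 0:
            return float("inf")        # always explore unvisited branches first
        exploit = ch["reward"] / ch["visits"]
        explore = c * math.sqrt(math.log(total) / ch["visits"])
        return exploit + explore
    return max(children, key=score)

# three candidate branches of the architecture search tree (toy numbers)
children = [{"reward": 3.0, "visits": 5},
            {"reward": 1.0, "visits": 1},
            {"reward": 0.0, "visits": 0}]
best = uct_select(children)
```

The paper's adaptation concerns how reward statistics are shared between branches of the search tree, which the plain UCB1 rule above does not do.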
no code implementations • 22 Nov 2017 • Ambrish Rawat, Martin Wistuba, Maria-Irina Nicolae
Deep Learning models are vulnerable to adversarial examples, i.e., images obtained via deliberate imperceptible perturbations, such that the model misclassifies them with high confidence.
no code implementations • SIAM 2017 • Martin Wistuba, Nicolas Schilling, Lars Schmidt-Thieme
Automating machine learning by providing techniques that autonomously find the best algorithm, hyperparameter configuration and preprocessing is helpful for both researchers and practitioners.
no code implementations • 13 Oct 2016 • Martin Wistuba, Nghia Duong-Trung, Nicolas Schilling, Lars Schmidt-Thieme
We describe the solution of team ISMLL for the ECML-PKDD 2016 Discovery Challenge on Bank Card Usage for both tasks.
1 code implementation • ECML PKDD 2016 • Nicolas Schilling, Martin Wistuba, Lars Schmidt-Thieme
In this paper, we use products of Gaussian process experts as surrogate models for hyperparameter optimization.
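The product-of-experts combination rule is standard: the product of independent Gaussian predictions is itself Gaussian, with precisions adding and the mean given by a precision-weighted average. A minimal sketch (the expert predictions below are toy numbers, not from the paper):

```python
import numpy as np

def product_of_experts(means, variances):
    """Combine independent Gaussian expert predictions N(mu_k, var_k)
    into one Gaussian: precisions add, mean is precision-weighted."""
    means, variances = np.asarray(means), np.asarray(variances)
    prec = 1.0 / variances
    var = 1.0 / prec.sum(axis=0)          # combined variance
    mu = var * (prec * means).sum(axis=0) # combined mean
    return mu, var

# two experts predicting the validation error of one configuration
mu, var = product_of_experts([0.2, 0.4], [0.1, 0.3])
```

The combined variance is always smaller than any single expert's, which is why the product rule yields confident surrogates when experts agree.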
no code implementations • 17 Mar 2015 • Martin Wistuba, Josif Grabocka, Lars Schmidt-Thieme
A method for applying shapelets to multivariate time series is proposed, and Ultra-Fast Shapelets is shown to be competitive with state-of-the-art multivariate time series classifiers on 15 multivariate time series datasets from various domains.
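The core shapelet primitive is the minimum distance between a short subsequence and all equal-length sliding windows of a series; these distances then serve as features for a standard classifier. A minimal univariate sketch of that primitive (not the paper's full random-shapelet pipeline):

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and any
    equal-length sliding window of the series."""
    L = len(shapelet)
    windows = np.lib.stride_tricks.sliding_window_view(series, L)
    return np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min()

series = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
d = shapelet_distance(series, np.array([1.0, 2.0, 1.0]))
```

Computing this distance for many randomly sampled shapelets turns each series into a fixed-length feature vector, after which any off-the-shelf classifier applies.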
no code implementations • 11 Mar 2015 • Josif Grabocka, Martin Wistuba, Lars Schmidt-Thieme
Time-series classification is an important problem for the data mining community due to the wide range of application domains involving time-series data.
no code implementations • 24 Jul 2013 • Josif Grabocka, Martin Wistuba, Lars Schmidt-Thieme
The coefficients of the polynomial functions are converted to symbolic words via equivolume discretizations of the coefficients' distributions.
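Equivolume (equal-frequency) discretization places bin edges at quantiles of the coefficient distribution, so each symbol covers the same number of observations. A minimal sketch of that step on invented data, discretizing the slope of a linear fit per sliding window (window size and bin count are arbitrary choices, not the paper's):

```python
import numpy as np

def equivolume_bins(values, n_bins=4):
    """Equal-frequency discretization: bin edges are quantiles of the
    empirical distribution, so each symbol is used equally often."""
    edges = np.quantile(values, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(values, edges)   # symbol index per value

# fit a degree-1 polynomial to each sliding window, keep its slope
rng = np.random.default_rng(1)
ts = np.cumsum(rng.standard_normal(100))   # toy time series
win = 10
slopes = np.array([np.polyfit(np.arange(win), ts[i:i + win], 1)[0]
                   for i in range(len(ts) - win + 1)])
word = equivolume_bins(slopes)             # symbolic word over 4 symbols
```

Because the edges are quantiles of the observed slopes, the resulting symbol histogram is approximately uniform regardless of the series' scale.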