1 code implementation • 28 Feb 2024 • Rhea Sanjay Sukthanker, Arber Zela, Benedikt Staffler, Samuel Dooley, Josif Grabocka, Frank Hutter
Pareto front profiling in multi-objective optimization (MOO), i.e., finding a diverse set of Pareto optimal solutions, is challenging, especially with expensive objectives like neural network training.
no code implementations • 9 Feb 2024 • Gresa Shala, André Biedenkapp, Josif Grabocka
We introduce Hierarchical Transformers for Meta-Reinforcement Learning (HTrMRL), a powerful online meta-reinforcement learning approach.
no code implementations • 6 Feb 2024 • Guri Zabërgja, Arlind Kadra, Josif Grabocka
In this paper, we introduce a large-scale empirical study comparing not only neural networks against gradient-boosted decision trees on tabular data, but also transformer-based architectures against traditional multi-layer perceptrons (MLPs) with residual connections.
1 code implementation • 6 Jun 2023 • Sebastian Pineda Arango, Fabio Ferreira, Arlind Kadra, Frank Hutter, Josif Grabocka
With the ever-increasing number of pretrained models, machine learning practitioners are continuously faced with the questions of which pretrained model to use and how to finetune it for a new dataset.
1 code implementation • 23 May 2023 • Sebastian Pineda Arango, Josif Grabocka
As a remedy, this paper proposes a novel neural architecture that captures the deep interaction between the components of a Machine Learning pipeline.
1 code implementation • 22 May 2023 • Arlind Kadra, Sebastian Pineda Arango, Josif Grabocka
Through extensive experiments, we demonstrate that our explainable deep networks are as accurate as state-of-the-art classifiers on tabular data.
no code implementations • 14 Apr 2023 • Mofassir ul Islam Arif, Mohsan Jameel, Josif Grabocka, Lars Schmidt-Thieme
We create phantom embeddings from a subset of homogeneous samples and use these phantom embeddings to decrease the inter-class similarity of instances in their latent embedding space.
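A heavily hedged sketch of one reading of this abstract: average the embeddings of a same-class ("homogeneous") subset into a phantom embedding, then penalize each instance's similarity to the phantoms of other classes. The paper's actual loss and sampling scheme may differ; all names below are illustrative.

```python
import torch
import torch.nn.functional as F

def phantom_loss(emb, labels, n_classes, subset=8):
    # Assumes every class appears in the batch.
    phantoms = []
    for c in range(n_classes):
        idx = (labels == c).nonzero(as_tuple=True)[0]
        pick = idx[torch.randperm(len(idx))[:subset]]     # homogeneous subset
        phantoms.append(emb[pick].mean(dim=0))            # phantom embedding
    phantoms = torch.stack(phantoms)                      # (n_classes, d)
    sim = F.cosine_similarity(emb.unsqueeze(1), phantoms.unsqueeze(0), dim=-1)
    # Penalize each instance's highest similarity to another class's phantom.
    other = sim.scatter(1, labels.unsqueeze(1), float("-inf"))
    return other.max(dim=1).values.mean()

emb = torch.randn(64, 32)              # illustrative batch of embeddings
labels = torch.arange(64) % 4          # four classes, all present
loss = phantom_loss(emb, labels, n_classes=4)
```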
1 code implementation • 27 Mar 2023 • Abdus Salam Khazi, Sebastian Pineda Arango, Josif Grabocka
Automatically optimizing the hyperparameters of Machine Learning algorithms is one of the primary open questions in AI.
1 code implementation • 16 Jun 2022 • Ekrem Öztürk, Fabio Ferreira, Hadi S. Jomaa, Lars Schmidt-Thieme, Josif Grabocka, Frank Hutter
Given a new dataset D and a low compute budget, how should we choose a pre-trained model to fine-tune to D, and set the fine-tuning hyperparameters without risking overfitting, particularly if D is small?
1 code implementation • 20 Feb 2022 • Martin Wistuba, Arlind Kadra, Josif Grabocka
Multi-fidelity (gray-box) hyperparameter optimization (HPO) techniques have recently emerged as a promising direction for tuning Deep Learning methods.
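For context, the sketch below shows generic successive halving, a standard multi-fidelity scheme that evaluates many configurations cheaply and promotes only the best to higher fidelities. It is not the method proposed in this paper; `sample_config` and `evaluate` are hypothetical placeholders.

```python
import random

def sample_config():
    # Hypothetical search space: learning rate and dropout.
    return {"lr": 10 ** random.uniform(-4, -1), "dropout": random.uniform(0.0, 0.5)}

def evaluate(config, epochs):
    # Placeholder: train `config` for `epochs` and return a validation loss.
    return random.random() / epochs  # stand-in for a real training run

def successive_halving(n_configs=27, min_epochs=1, eta=3):
    configs = [sample_config() for _ in range(n_configs)]
    epochs = min_epochs
    while len(configs) > 1:
        # Evaluate every surviving config at the current (cheap) fidelity.
        losses = [evaluate(c, epochs) for c in configs]
        # Keep the best 1/eta fraction and raise their budget by eta.
        ranked = sorted(zip(losses, range(len(configs))))
        keep = max(1, len(configs) // eta)
        configs = [configs[i] for _, i in ranked[:keep]]
        epochs *= eta
    return configs[0]

best = successive_halving()
```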
1 code implementation • ICLR 2022 • Samuel Müller, Noah Hollmann, Sebastian Pineda Arango, Josif Grabocka, Frank Hutter
Our method restates the objective of posterior approximation as a supervised classification problem with a set-valued input: it repeatedly draws a task (or function) from the prior, draws a set of data points and their labels from it, masks one of the labels and learns to make probabilistic predictions for it based on the set-valued input of the rest of the data points.
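A minimal sketch of that prior-fitting loop, under simplifying assumptions: a toy linear-function prior, a tiny Transformer encoder, and a squared-error stand-in for the paper's probabilistic objective.

```python
import torch
import torch.nn as nn

d_model = 64
embed = nn.Linear(2, d_model)        # embeds (x, y) pairs; the masked y is zeroed
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True),
    num_layers=2,
)
head = nn.Linear(d_model, 1)         # predicts the masked label
params = list(embed.parameters()) + list(encoder.parameters()) + list(head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

for step in range(1000):
    # 1) Repeatedly draw a task (function) from the prior: here y = w * x + noise.
    w = torch.randn(32, 1, 1)                     # one task per set in the batch
    x = torch.rand(32, 8, 1)                      # 8 data points per set
    y = w * x + 0.01 * torch.randn_like(x)
    # 2) Mask one label (the last point of each set).
    y_in = y.clone()
    y_in[:, -1, :] = 0.0
    # 3) Predict the masked label from the set-valued input of the rest.
    h = encoder(embed(torch.cat([x, y_in], dim=-1)))
    pred = head(h[:, -1, :])
    loss = ((pred - y[:, -1, :]) ** 2).mean()     # squared-error stand-in for the
    opt.zero_grad(); loss.backward(); opt.step()  # paper's probabilistic objective
```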
1 code implementation • 14 Oct 2021 • Michael Ruchte, Josif Grabocka
These works also use Multi-Task Learning (MTL) problems to benchmark MOO algorithms, treating each task as an independent objective.
no code implementations • 29 Sep 2021 • Hadi Samer Jomaa, Sebastian Pineda Arango, Lars Schmidt-Thieme, Josif Grabocka
As a result, our novel DKLM can learn contextualized dataset-specific similarity representations for hyperparameter configurations.
1 code implementation • NeurIPS 2021 • Arlind Kadra, Marius Lindauer, Frank Hutter, Josif Grabocka
Tabular datasets are the last "unconquered castle" for deep learning, with traditional ML methods like Gradient-Boosted Decision Trees still performing strongly even against recent specialized neural architectures.
1 code implementation • 11 Jun 2021 • Sebastian Pineda Arango, Hadi S. Jomaa, Martin Wistuba, Josif Grabocka
Hyperparameter optimization (HPO) is a core problem for the machine learning community and remains largely unsolved due to the significant computational resources required to evaluate hyperparameter configurations.
1 code implementation • 24 Mar 2021 • Michael Ruchte, Josif Grabocka
Prior work either demands optimizing a new network for every point on the Pareto front or induces a large overhead in the number of trainable parameters by using hyper-networks conditioned on modifiable preferences.
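One common alternative to both — conditioning a single network on preference vectors fed as extra inputs and training with linear scalarization — can be sketched as follows. This is illustrative and not necessarily this paper's exact formulation; the two toy objectives are assumptions.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(10 + 2, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def loss_fit(pred, y):                 # objective 1: fit the targets
    return ((pred - y) ** 2).mean()

def loss_small(pred):                  # objective 2: e.g. prefer small outputs
    return pred.abs().mean()

for step in range(1000):
    x = torch.randn(128, 10)
    y = torch.randn(128, 1)
    # Sample a preference vector on the simplex and append it to the input.
    lam = torch.distributions.Dirichlet(torch.ones(2)).sample()
    pred = net(torch.cat([x, lam.expand(128, 2)], dim=-1))
    # Linear scalarization under the sampled preference.
    loss = lam[0] * loss_fit(pred, y) + lam[1] * loss_small(pred)
    opt.zero_grad(); loss.backward(); opt.step()

# At test time, sweeping lam over the simplex traces an approximate Pareto
# front with this single trained network.
```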
no code implementations • 7 Feb 2021 • Hadi S. Jomaa, Lars Schmidt-Thieme, Josif Grabocka
In contrast to existing models, DMFBS i) integrates a differentiable metafeature extractor and ii) is optimized using a novel multi-task loss, linking manifold regularization with a dataset similarity measure learned via an auxiliary dataset identification meta-task, effectively enforcing the response approximation for similar datasets to be similar.
1 code implementation • ICLR 2021 • Martin Wistuba, Josif Grabocka
Hyperparameter optimization (HPO) is a central pillar in the automation of machine learning solutions and is mainly performed via Bayesian optimization, where a parametric surrogate is learned to approximate the black box response function (e.g., validation error).
1 code implementation • 1 Jan 2021 • Michael Ruchte, Arber Zela, Julien Niklas Siems, Josif Grabocka, Frank Hutter
Neural Architecture Search (NAS) is one of the focal points for the Deep Learning community, but reproducing NAS methods is extremely challenging due to numerous low-level implementation details.
no code implementations • 1 Jan 2021 • Arlind Kadra, Marius Lindauer, Frank Hutter, Josif Grabocka
The regularization of prediction models is arguably the most crucial ingredient that allows Machine Learning solutions to generalize well on unseen data.
no code implementations • 1 Jan 2021 • Hadi Samer Jomaa, Lars Schmidt-Thieme, Josif Grabocka
Zero-shot hyper-parameter optimization refers to the process of selecting hyper-parameter configurations that are expected to perform well for a given dataset upfront, without access to any observations of the losses of the target response.
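A minimal sketch of a common zero-shot baseline consistent with this setup: rank a fixed pool of configurations by their average rank across meta-training datasets and recommend the best one upfront. This is a standard baseline, not necessarily the paper's method; `meta_losses` holds toy data.

```python
import numpy as np

rng = np.random.default_rng(0)
meta_losses = rng.random((20, 50))    # 20 meta-datasets x 50 configs (toy data)

ranks = meta_losses.argsort(axis=1).argsort(axis=1)  # per-dataset ranks, 0 = best
avg_rank = ranks.mean(axis=0)
best_config = int(avg_rank.argmin())  # recommended without any target observations
print(f"zero-shot recommendation: config {best_config}")
```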
1 code implementation • 28 Oct 2019 • Rafael Rego Drumond, Lukas Brinkmeyer, Josif Grabocka, Lars Schmidt-Thieme
In this paper, we present HIDRA, a meta-learning approach that enables training and evaluating across tasks with any number of target variables.
1 code implementation • 30 Sep 2019 • Lukas Brinkmeyer, Rafael Rego Drumond, Randolf Scholz, Josif Grabocka, Lars Schmidt-Thieme
Parametric models, and particularly neural networks, require weight initialization as a starting point for gradient-based optimization.
no code implementations • 25 Sep 2019 • Jonas Falkner, Josif Grabocka, Lars Schmidt-Thieme
Compressed forms of deep neural networks are essential in deploying large-scale computational models on resource-constrained devices.
1 code implementation • 27 Jun 2019 • Hadi S. Jomaa, Josif Grabocka, Lars Schmidt-Thieme
More recently, methods have been introduced that build a so-called surrogate model to predict the validation loss for a specific hyperparameter setting, model, and dataset, and then sequentially select the next hyperparameter to test based on a heuristic function of the surrogate model's expected value and uncertainty, called the acquisition function (sequential model-based Bayesian optimization, SMBO).
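A minimal, self-contained instance of the SMBO loop described here, using a Gaussian-process surrogate and the expected-improvement acquisition on a 1-D toy problem:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def f(x):                                   # toy "validation loss"
    return np.sin(3 * x) + 0.1 * x ** 2

X = np.array([[0.5], [2.0]])                # initial observations
y = f(X).ravel()
cands = np.linspace(-3, 3, 200).reshape(-1, 1)

for _ in range(10):
    gp = GaussianProcessRegressor().fit(X, y)    # surrogate model
    mu, sigma = gp.predict(cands, return_std=True)
    best = y.min()
    # Expected improvement (for minimization): trades off expected value
    # against the surrogate's uncertainty.
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = cands[ei.argmax()].reshape(1, -1)   # sequentially pick the next point
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next).ravel())

print("best x:", X[y.argmin()], "best loss:", y.min())
```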
no code implementations • 24 Jun 2019 • Hadi S. Jomaa, Josif Grabocka, Lars Schmidt-Thieme
In classical Q-learning, the objective is to maximize the sum of discounted rewards by iteratively applying the Bellman equation as an update rule, in an attempt to estimate the action-value function of the optimal policy.
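A minimal tabular sketch of that Bellman update on a hypothetical chain environment (states, actions, and rewards are illustrative):

```python
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.99, 0.1
rng = np.random.default_rng(0)

def step(s, a):
    # Hypothetical chain dynamics: action 1 moves right; reward at the end.
    s_next = min(s + a, n_states - 1)
    return s_next, float(s_next == n_states - 1)

for episode in range(500):
    s = 0
    for t in range(100):                          # cap episode length
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
        s_next, r = step(s, a)
        # Bellman update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if s == n_states - 1:
            break
```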
1 code implementation • 27 May 2019 • Hadi S. Jomaa, Lars Schmidt-Thieme, Josif Grabocka
As a data-driven approach, meta-learning requires meta-features that represent the primary learning tasks or datasets, and these are traditionally estimated as engineered dataset statistics that require expert domain knowledge tailored to every meta-task.
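For illustration, a few of the engineered dataset statistics this sentence refers to (simple, hand-picked examples; the paper's point is to learn meta-features instead):

```python
import numpy as np

def engineered_metafeatures(X, y):
    # X: (n_instances, n_features) array; y: (n_instances,) class labels.
    n, d = X.shape
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return {
        "n_instances": n,
        "n_features": d,
        "n_classes": len(counts),
        "class_entropy": float(-(p * np.log(p)).sum()),
        "mean_feature_std": float(X.std(axis=0).mean()),
        "mean_abs_feature_skew": float(np.abs(
            ((X - X.mean(0)) ** 3).mean(0) / np.maximum(X.std(0) ** 3, 1e-12)
        ).mean()),
    }
```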
no code implementations • 24 May 2019 • Josif Grabocka, Randolf Scholz, Lars Schmidt-Thieme
Ultimately, the surrogate losses are learned jointly with the prediction model via bilevel optimization.
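A minimal one-step-unrolled sketch of that bilevel scheme, with a toy linear model and an assumed parametric surrogate-loss family (the paper's actual parameterization may differ): the inner step updates the model on the surrogate loss, and the outer step backpropagates through that update into the surrogate's parameters.

```python
import torch

torch.manual_seed(0)
X = torch.randn(256, 5)
y = (X.sum(dim=1, keepdim=True) > 0).float()      # toy binary labels

w = torch.zeros(5, 1, requires_grad=True)         # prediction-model parameters
phi = torch.tensor([1.0, 0.0], requires_grad=True)  # surrogate-loss parameters

def surrogate(pred, target, phi):
    # Hypothetical smooth surrogate: phi mixes a logistic-style term
    # with a squared-margin term.
    margin = (2 * target - 1) * pred
    return (phi[0] * torch.nn.functional.softplus(-margin)
            + phi[1] * margin ** 2).mean()

inner_lr, outer_lr = 0.5, 0.05
for step in range(200):
    # Inner step, kept differentiable w.r.t. phi: w' = w - lr * grad_w surrogate.
    g_w = torch.autograd.grad(surrogate(X @ w, y, phi), w, create_graph=True)[0]
    w_unrolled = w - inner_lr * g_w
    # Outer step: a smooth stand-in for the target metric, evaluated at w',
    # differentiated w.r.t. the surrogate's parameters phi.
    outer = torch.nn.functional.binary_cross_entropy_with_logits(X @ w_unrolled, y)
    g_phi = torch.autograd.grad(outer, phi)[0]
    with torch.no_grad():
        phi -= outer_lr * g_phi                   # learn the surrogate loss
        w.copy_(w_unrolled)                       # commit the model update
```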
no code implementations • 25 Feb 2019 • Ahmed Rashed, Josif Grabocka, Lars Schmidt-Thieme
It can be formalized as a multi-relational learning task for predicting node labels based on their relations within the network.
no code implementations • 9 Feb 2019 • Shayan Jawed, Eya Boumaiza, Josif Grabocka, Lars Schmidt-Thieme
An active area of research is to increase the safety of self-driving vehicles.
2 code implementations • 20 Dec 2018 • Josif Grabocka, Lars Schmidt-Thieme
Research on time-series similarity measures has emphasized the need for elastic methods which align the indices of pairs of time series, and a plethora of non-parametric measures have been proposed for the task.
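For reference, the classic elastic measure alluded to here is dynamic time warping (DTW); a minimal sketch (not the measure proposed in this paper):

```python
import numpy as np

def dtw(a, b):
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Elastic alignment: an index may advance in either series or both.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return np.sqrt(D[n, m])

x = np.sin(np.linspace(0, 2 * np.pi, 50))
y = np.sin(np.linspace(0, 2 * np.pi, 70))   # same shape, different length
print(dtw(x, y))
```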
no code implementations • 2 Nov 2017 • Dripta S. Raychaudhuri, Josif Grabocka, Lars Schmidt-Thieme
Time series shapelets are discriminative sub-sequences and their similarity to time series can be used for time series classification.
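A minimal sketch of the shapelet-to-series distance this sentence describes, i.e., the best match of the sub-sequence anywhere in the series; the paper's learning procedure is omitted, and the toy series below is illustrative.

```python
import numpy as np

def shapelet_distance(series, shapelet):
    L = len(shapelet)
    # Slide the shapelet over the series and keep the best (smallest) match.
    dists = [
        np.mean((series[i:i + L] - shapelet) ** 2)
        for i in range(len(series) - L + 1)
    ]
    return min(dists)

series = np.sin(np.linspace(0, 4 * np.pi, 100))
shapelet = np.sin(np.linspace(0, np.pi, 20))     # a discriminative sub-sequence
feature = shapelet_distance(series, shapelet)    # one feature per shapelet
```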
no code implementations • 3 May 2015 • Josif Grabocka, Nicolas Schilling, Lars Schmidt-Thieme
We demonstrate that searching is non-optimal since the domain of motifs is restricted, and instead we propose a principled optimization approach able to find optimal motifs.
no code implementations • 17 Mar 2015 • Martin Wistuba, Josif Grabocka, Lars Schmidt-Thieme
A method for using shapelets with multivariate time series is proposed, and Ultra-Fast Shapelets is shown to be successful in comparison to state-of-the-art multivariate time series classifiers on 15 multivariate time series datasets from various domains.
no code implementations • 11 Mar 2015 • Josif Grabocka, Martin Wistuba, Lars Schmidt-Thieme
Time-series classification is an important problem for the data mining community due to the wide range of application domains involving time-series data.
no code implementations • 23 Dec 2013 • Josif Grabocka, Lars Schmidt-Thieme
Time-series classification is an important domain of machine learning and a plethora of methods have been developed for the task.
no code implementations • 24 Jul 2013 • Josif Grabocka, Martin Wistuba, Lars Schmidt-Thieme
The coefficients of the polynomial functions are converted to symbolic words via equivolume discretizations of the coefficients' distributions.
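A minimal sketch of that pipeline under assumed window length, polynomial degree, and alphabet size: fit a polynomial per (non-overlapping) window, then discretize each coefficient with equal-frequency (equivolume) bins so every symbol is roughly equally frequent.

```python
import numpy as np

def series_to_words(series, win=16, degree=3, alphabet=4):
    t = np.arange(win)
    segments = [series[i:i + win] for i in range(0, len(series) - win + 1, win)]
    coeffs = np.array([np.polyfit(t, seg, degree) for seg in segments])
    symbols = []
    for j in range(coeffs.shape[1]):
        col = coeffs[:, j]
        # Equivolume discretization: bin edges at quantiles of this
        # coefficient's distribution.
        edges = np.quantile(col, np.linspace(0, 1, alphabet + 1)[1:-1])
        symbols.append(np.digitize(col, edges))
    # One symbolic word (one symbol per coefficient) per window.
    return list(zip(*symbols))

series = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.1 * np.random.randn(256)
print(series_to_words(series)[:5])
```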