Search Results for author: Stefan Kesselheim

Found 5 papers, 2 papers with code

Tokenizer Choice For LLM Training: Negligible or Crucial?

no code implementations • 12 Oct 2023 • Mehdi Ali, Michael Fromm, Klaudia Thellmann, Richard Rutmann, Max Lübbering, Johannes Leveling, Katrin Klug, Jan Ebert, Niclas Doll, Jasper Schulze Buschhoff, Charvi Jain, Alexander Arno Weber, Lena Jurkschat, Hammam Abdelwahab, Chelsea John, Pedro Ortiz Suarez, Malte Ostendorff, Samuel Weinbach, Rafet Sifa, Stefan Kesselheim, Nicolas Flores-Herr

The recent success of Large Language Models (LLMs) has been predominantly driven by curating the training dataset composition, scaling of model architectures and dataset sizes and advancements in pretraining objectives, leaving tokenizer influence as a blind spot.

Physics informed Neural Networks applied to the description of wave-particle resonance in kinetic simulations of fusion plasmas

no code implementations • 23 Aug 2023 • Jai Kumar, David Zarzoso, Virginie Grandgirard, Jan Ebert, Stefan Kesselheim

The Vlasov-Poisson system is employed in its reduced form (1D1V) as a test bed for the applicability of Physics-Informed Neural Networks (PINNs) to the wave-particle resonance.

A Comparative Study on Generative Models for High Resolution Solar Observation Imaging

1 code implementation • 14 Apr 2023 • Mehdi Cherti, Alexander Czernik, Stefan Kesselheim, Frederic Effenberger, Jenia Jitsev

Starting from StyleGAN-based methods, we uncover severe deficits of this model family in handling fine-scale details of solar images when training on high resolution samples, contrary to training on natural face images.

Hearts Gym: Learning Reinforcement Learning as a Team Event

1 code implementation • 7 Sep 2022 • Jan Ebert, Danimir T. Doncevic, Ramona Kloß, Stefan Kesselheim

Amidst the COVID-19 pandemic, the authors of this paper organized a Reinforcement Learning (RL) course for a graduate school in the field of data science.

JUWELS Booster -- A Supercomputer for Large-Scale AI Research

no code implementations • 30 Jun 2021 • Stefan Kesselheim, Andreas Herten, Kai Krajsek, Jan Ebert, Jenia Jitsev, Mehdi Cherti, Michael Langguth, Bing Gong, Scarlet Stadtler, Amirpasha Mozaffari, Gabriele Cavallaro, Rocco Sedona, Alexander Schug, Alexandre Strube, Roshni Kamath, Martin G. Schultz, Morris Riedel, Thomas Lippert

In this article, we present JUWELS Booster, a recently commissioned high-performance computing system at the Jülich Supercomputing Center.
