no code implementations • 22 Oct 2023 • Baohao Liao, Michael Kozielski, Sanjika Hewavitharana, Jiangbo Yuan, Shahram Khadivi, Tomer Lancewicki
Teaching a model to learn embeddings from different modalities without neglecting information from the less dominant modality is challenging.
1 code implementation • 24 Mar 2022 • Iñigo Urteaga, Moulay-Zaïdane Draïdia, Tomer Lancewicki, Shahram Khadivi
We propose a multi-armed bandit framework for the sequential selection of TLM pre-training hyperparameters, aimed at optimizing language model performance in a resource-efficient manner.
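The paper's actual method is not reproduced here, but the core bandit idea — treating each candidate hyperparameter setting as an arm and allocating trials toward the arms that yield the best observed reward — can be sketched with the standard UCB1 rule. The candidate learning rates, the reward function, and the noise model below are all hypothetical stand-ins for a real pre-training evaluation.

```python
import math
import random

def ucb1_select(candidates, rounds, reward_fn):
    """Sequentially pick candidates with the UCB1 bandit rule and
    return the empirically best one plus the per-arm pull counts."""
    counts = [0] * len(candidates)
    sums = [0.0] * len(candidates)
    for t in range(1, rounds + 1):
        if t <= len(candidates):
            # Play each arm once before applying the UCB formula.
            arm = t - 1
        else:
            # Exploit high means while still exploring rarely tried arms.
            arm = max(range(len(candidates)),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        counts[arm] += 1
        sums[arm] += reward_fn(candidates[arm])
    best = max(range(len(candidates)), key=lambda i: sums[i] / counts[i])
    return candidates[best], counts

# Toy setup: pretend a learning rate of 3e-4 gives the best (noisy)
# validation score. A real reward would come from a short training run.
random.seed(0)
lrs = [1e-5, 3e-4, 1e-2]
def noisy_score(lr):
    base = {1e-5: 0.4, 3e-4: 0.8, 1e-2: 0.2}[lr]
    return base + random.gauss(0, 0.05)

best_lr, pulls = ucb1_select(lrs, rounds=200, reward_fn=noisy_score)
```

In practice the expensive part is `reward_fn` (a partial pre-training run), which is exactly why the bandit's selective allocation of trials saves resources.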
no code implementations • 27 Sep 2021 • Evgeniia Tokarchuk, Jan Rosendahl, Weiyue Wang, Pavel Petrushkov, Tomer Lancewicki, Shahram Khadivi, Hermann Ney
Pivot-based neural machine translation (NMT) is commonly used in low-resource setups, especially for translation between non-English language pairs.
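Pivot-based translation chains two independently trained systems through an intermediate (pivot) language, typically English. A minimal sketch of that cascade, with hypothetical word-level lookup tables standing in for real NMT models:

```python
def cascade_translate(src_sentence, src_to_pivot, pivot_to_tgt):
    """Pivot-based translation: route the source through a pivot
    language by composing two translation systems."""
    pivot = src_to_pivot(src_sentence)   # e.g. German -> English
    return pivot_to_tgt(pivot)           # e.g. English -> French

# Toy stand-ins for trained NMT models (illustrative only).
de_en = {"hallo": "hello", "welt": "world"}
en_fr = {"hello": "bonjour", "world": "monde"}
src_to_en = lambda s: " ".join(de_en.get(w, w) for w in s.split())
en_to_fr = lambda s: " ".join(en_fr.get(w, w) for w in s.split())

result = cascade_translate("hallo welt", src_to_en, en_to_fr)
# result == "bonjour monde"
```

The appeal in low-resource setups is that source–pivot and pivot–target parallel data are usually far more plentiful than direct source–target data; the cost is that errors from the first system propagate into the second.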
no code implementations • ACL (IWSLT) 2021 • Evgeniia Tokarchuk, Jan Rosendahl, Weiyue Wang, Pavel Petrushkov, Tomer Lancewicki, Shahram Khadivi, Hermann Ney
Complex natural language applications such as speech translation or pivot translation traditionally rely on cascaded models.
no code implementations • 23 Aug 2021 • Leonard Dahlmann, Tomer Lancewicki
We successfully optimize a Query-Title Relevance (QTR) classifier for deployment via a compact model, which we name BERT Bidirectional Long Short-Term Memory (BertBiLSTM).
1 code implementation • 20 Aug 2019 • Tomer Lancewicki, Selcuk Kopru
Stochastic Gradient Descent (SGD) methods are prominent for training machine learning and deep learning models.
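For reference, the plain SGD update that such methods build on is simply `w <- w - lr * grad(w)`. A self-contained sketch on a toy quadratic (the objective and step size are illustrative, not from the paper):

```python
def sgd_minimize(grad, w0, lr=0.1, steps=100):
    """Plain SGD: repeatedly step against the gradient."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_star = sgd_minimize(lambda w: 2 * (w - 3), w0=0.0)
# w_star converges to ~3.0
```

Choosing `lr` well is the crux: too small and convergence crawls, too large and the iterates diverge, which is what motivates work on adjusting it automatically.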