no code implementations • 22 Oct 2023 • Baohao Liao, Michael Kozielski, Sanjika Hewavitharana, Jiangbo Yuan, Shahram Khadivi, Tomer Lancewicki
Teaching a model to learn embeddings from different modalities without neglecting information from the less dominant modality is challenging.
1 code implementation • 24 Mar 2022 • Iñigo Urteaga, Moulay-Zaïdane Draïdia, Tomer Lancewicki, Shahram Khadivi
We propose a multi-armed bandit framework for the sequential selection of TLM pre-training hyperparameters, aimed at optimizing language model performance in a resource-efficient manner.
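The paper's actual method is not reproduced here, but the core bandit idea — treating each candidate hyperparameter setting as an arm and allocating trials toward the arms that yield the best observed reward — can be sketched with the standard UCB1 rule. The candidate learning rates, the reward function, and the noise model below are all hypothetical stand-ins for a real pre-training evaluation.

```python
import math
import random

def ucb1_select(candidates, rounds, reward_fn):
    """Sequentially pick candidates with the UCB1 bandit rule and
    return the empirically best one plus the per-arm pull counts."""
    counts = [0] * len(candidates)
    sums = [0.0] * len(candidates)
    for t in range(1, rounds + 1):
        if t <= len(candidates):
            # Play each arm once before applying the UCB formula.
            arm = t - 1
        else:
            # Exploit high means while still exploring rarely tried arms.
            arm = max(range(len(candidates)),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        counts[arm] += 1
        sums[arm] += reward_fn(candidates[arm])
    best = max(range(len(candidates)), key=lambda i: sums[i] / counts[i])
    return candidates[best], counts

# Toy setup: pretend a learning rate of 3e-4 gives the best (noisy)
# validation score. A real reward would come from a short training run.
random.seed(0)
lrs = [1e-5, 3e-4, 1e-2]
def noisy_score(lr):
    base = {1e-5: 0.4, 3e-4: 0.8, 1e-2: 0.2}[lr]
    return base + random.gauss(0, 0.05)

best_lr, pulls = ucb1_select(lrs, rounds=200, reward_fn=noisy_score)
```

In practice the expensive part is `reward_fn` (a partial pre-training run), which is exactly why the bandit's selective allocation of trials saves resources.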
no code implementations • 27 Sep 2021 • Evgeniia Tokarchuk, Jan Rosendahl, Weiyue Wang, Pavel Petrushkov, Tomer Lancewicki, Shahram Khadivi, Hermann Ney
Pivot-based neural machine translation (NMT) is commonly used in low-resource setups, especially for translation between non-English language pairs.
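Pivot-based translation chains two independently trained systems through an intermediate (pivot) language, typically English. A minimal sketch of that cascade, with hypothetical word-level lookup tables standing in for real NMT models:

```python
def cascade_translate(src_sentence, src_to_pivot, pivot_to_tgt):
    """Pivot-based translation: route the source through a pivot
    language by composing two translation systems."""
    pivot = src_to_pivot(src_sentence)   # e.g. German -> English
    return pivot_to_tgt(pivot)           # e.g. English -> French

# Toy stand-ins for trained NMT models (illustrative only).
de_en = {"hallo": "hello", "welt": "world"}
en_fr = {"hello": "bonjour", "world": "monde"}
src_to_en = lambda s: " ".join(de_en.get(w, w) for w in s.split())
en_to_fr = lambda s: " ".join(en_fr.get(w, w) for w in s.split())

result = cascade_translate("hallo welt", src_to_en, en_to_fr)
# result == "bonjour monde"
```

The appeal in low-resource setups is that source–pivot and pivot–target parallel data are usually far more plentiful than direct source–target data; the cost is that errors from the first system propagate into the second.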
no code implementations • ACL (IWSLT) 2021 • Evgeniia Tokarchuk, Jan Rosendahl, Weiyue Wang, Pavel Petrushkov, Tomer Lancewicki, Shahram Khadivi, Hermann Ney
Complex natural language applications such as speech translation or pivot translation traditionally rely on cascaded models.
no code implementations • 23 Aug 2021 • Leonard Dahlmann, Tomer Lancewicki
We successfully optimize a Query-Title Relevance (QTR) classifier for deployment via a compact model, which we name BERT Bidirectional Long Short-Term Memory (BertBiLSTM).
1 code implementation • 20 Aug 2019 • Tomer Lancewicki, Selcuk Kopru
Stochastic Gradient Descent (SGD) methods are prominent for training machine learning and deep learning models.
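For reference, the plain SGD update that such methods build on is simply `w <- w - lr * grad(w)`. A self-contained sketch on a toy quadratic (the objective and step size are illustrative, not from the paper):

```python
def sgd_minimize(grad, w0, lr=0.1, steps=100):
    """Plain SGD: repeatedly step against the gradient."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_star = sgd_minimize(lambda w: 2 * (w - 3), w0=0.0)
# w_star converges to ~3.0
```

Choosing `lr` well is the crux: too small and convergence crawls, too large and the iterates diverge, which is what motivates work on adjusting it automatically.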