no code implementations • 2 Feb 2024 • Paweł Mąka, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis
In this study, we show that a special case of the multi-encoder architecture, in which the latent representation of the source sentence is cached and reused as the context at the next step, achieves higher accuracy on contrastive datasets (where the models must rank the correct translation among the provided candidate sentences) and BLEU and COMET scores comparable to those of the single- and multi-encoder approaches.
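A minimal PyTorch sketch of the caching idea described above, not the authors' implementation: the encoder output for the current source sentence is stored and reused, via cross-attention, as the document context when encoding the next sentence. Class and parameter names (CachingContextModel, ctx_attn, d_model) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CachingContextModel(nn.Module):
    """Hypothetical sketch: cache the source encoding and reuse it as context."""

    def __init__(self, vocab_size=32000, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        # Cross-attention that injects the cached previous-sentence context.
        self.ctx_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.cached_ctx = None  # latent representation of the previous sentence

    def forward(self, src_tokens):
        h = self.encoder(self.embed(src_tokens))  # encode the current sentence
        out = h
        if self.cached_ctx is not None:
            ctx, _ = self.ctx_attn(h, self.cached_ctx, self.cached_ctx)
            out = h + ctx                          # fuse cached context into encoding
        # Cache this sentence's latent representation for the next step,
        # instead of re-encoding it with a separate context encoder.
        self.cached_ctx = h.detach()
        return out

# Usage on dummy token ids: consecutive sentences of a document.
model = CachingContextModel()
for sent in [torch.randint(0, 32000, (1, 12)) for _ in range(3)]:
    h = model(sent)  # each call reuses the previous sentence's cached encoding
```

The design choice this illustrates is that the "context encoder" costs nothing extra: the representation needed as context was already computed when the previous sentence was translated.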
1 code implementation • 28 Jun 2021 • Onur Bilgin, Paweł Mąka, Thomas Vergutz, Siamak Mehrkanoon
We show that, compared to the classical encoder transformer, 3D convolutional neural networks, LSTM, and Convolutional LSTM, the proposed TENT model better learns the underlying complex patterns of the weather data for the studied temperature prediction task.
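For context, a minimal PyTorch sketch of the encoder-transformer baseline for this task, treating each time step's weather variables as one token and regressing a future temperature. This is an assumed setup, not the TENT architecture (whose tensorized attention is not reproduced here); names such as TemperatureTransformer, n_features, and horizon are illustrative.

```python
import torch
import torch.nn as nn

class TemperatureTransformer(nn.Module):
    """Hypothetical baseline sketch: encoder transformer for temperature forecasting."""

    def __init__(self, n_features=10, d_model=64, nhead=4, num_layers=2, horizon=1):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)  # per-time-step feature embedding
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, horizon)     # regress the future temperature(s)

    def forward(self, x):                           # x: (batch, time, n_features)
        h = self.encoder(self.proj(x))
        return self.head(h[:, -1])                  # predict from the last time step

# Usage on dummy data: 32 samples, 24 hourly steps, 10 weather variables.
model = TemperatureTransformer()
x = torch.randn(32, 24, 10)
print(model(x).shape)  # torch.Size([32, 1])
```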