1 code implementation • 20 Feb 2024 • Ehsan Imani, Kai Luedemann, Sam Scholnick-Hughes, Esraa Elelimy, Martha White
It is becoming increasingly common in regression to train neural networks that model the entire distribution even if only the mean is required for prediction.
1 code implementation • 24 Oct 2023 • Subhojeet Pramanik, Esraa Elelimy, Marlos C. Machado, Adam White
In this paper, we introduce recurrent alternatives to the transformer self-attention mechanism that offer a context-independent inference cost, leverage long-range dependencies effectively, and perform well in practice.