no code implementations • 3 Mar 2024 • Mehran Hosseini, Peyman Hosseini
We introduce three new attention mechanisms that outperform standard multi-head attention in terms of efficiency and learning capabilities, thereby improving the performance and broadening the deployability of Transformer models.
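The three mechanisms themselves are not described in this listing; for reference, the following is a minimal sketch of the standard multi-head attention baseline they are compared against (the input/output projections are omitted, and the shapes are hypothetical):

```python
import torch
import torch.nn.functional as F

def multi_head_attention(q, k, v, num_heads):
    """Standard scaled dot-product multi-head attention (the baseline);
    the learned input/output projections are omitted for brevity."""
    batch, seq, dim = q.shape
    head_dim = dim // num_heads

    # Split the model dimension into heads: (batch, heads, seq, head_dim)
    def split(x):
        return x.view(batch, -1, num_heads, head_dim).transpose(1, 2)

    q, k, v = split(q), split(k), split(v)
    # Scaled dot-product attention, computed per head
    scores = q @ k.transpose(-2, -1) / head_dim ** 0.5
    weights = F.softmax(scores, dim=-1)
    out = weights @ v
    # Re-merge the heads back into the model dimension
    return out.transpose(1, 2).reshape(batch, seq, dim)

# Example: batch of 2 sequences of length 16, model dim 64, 8 heads
x = torch.randn(2, 16, 64)
y = multi_head_attention(x, x, x, num_heads=8)
print(y.shape)  # torch.Size([2, 16, 64])
```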
no code implementations • 3 Jan 2024 • Mehran Hosseini, Peyman Hosseini
The inability of image generative models to recreate intricate geometric features, such as those present in human hands and fingers, has been an ongoing problem in image generation for nearly a decade.
1 code implementation • 4 Mar 2023 • Peyman Hosseini, Mehran Hosseini, Sana Sabah Al-Azzawi, Marcus Liwicki, Ignacio Castro, Matthew Purver
We study the influence of different activation functions in the output layer of deep neural network models for soft and hard label prediction in the learning with disagreement task.
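The abstract does not spell out which activations were compared; as a rough sketch of the general setup, assuming softmax and sigmoid as two candidate output activations (an illustrative choice, not the paper's confirmed configuration):

```python
import torch
import torch.nn as nn

class ClassifierHead(nn.Module):
    """Output layer whose activation shapes the predicted soft-label
    distribution; the softmax/sigmoid choice here is illustrative."""
    def __init__(self, hidden_dim, num_classes, activation="softmax"):
        super().__init__()
        self.linear = nn.Linear(hidden_dim, num_classes)
        self.activation = activation

    def forward(self, h):
        logits = self.linear(h)
        if self.activation == "softmax":
            # Probabilities sum to 1 across classes
            return torch.softmax(logits, dim=-1)
        # Independent per-class probabilities; need not sum to 1
        return torch.sigmoid(logits)

head = ClassifierHead(hidden_dim=768, num_classes=2)
soft = head(torch.randn(4, 768))  # soft labels: annotator agreement levels
hard = soft.argmax(dim=-1)        # hard labels: single majority-vote class
```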