Search Results for author: Abdelrahman Shaker

Found 10 papers, 9 papers with code

Efficient Video Object Segmentation via Modulated Cross-Attention Memory

1 code implementation • 26 Mar 2024 • Abdelrahman Shaker, Syed Talal Wasim, Martin Danelljan, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan

Recently, transformer-based approaches have shown promising results for semi-supervised video object segmentation.

Object Segmentation +3


PALO: A Polyglot Large Multimodal Model for 5B People

1 code implementation • 22 Feb 2024 • Muhammad Maaz, Hanoona Rasheed, Abdelrahman Shaker, Salman Khan, Hisham Cholakkal, Rao M. Anwer, Tim Baldwin, Michael Felsberg, Fahad S. Khan

PALO offers visual reasoning capabilities in 10 major languages, including English, Chinese, Hindi, Spanish, French, Arabic, Bengali, Russian, Urdu, and Japanese, that span a total of ~5B people (65% of the world population).

Language Modelling, Large Language Model +1


Arabic Mini-ClimateGPT: A Climate Change and Sustainability Tailored Arabic LLM

1 code implementation • 14 Dec 2023 • Sahal Shaji Mullappilly, Abdelrahman Shaker, Omkar Thawakar, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Fahad Shahbaz Khan

To this end, we propose a light-weight Arabic Mini-ClimateGPT that is built on an open-source LLM and is specifically fine-tuned on a conversational-style instruction tuning curated Arabic dataset Clima500-Instruct with over 500k instructions about climate change and sustainability.


GLaMM: Pixel Grounding Large Multimodal Model

1 code implementation • 6 Nov 2023 • Hanoona Rasheed, Muhammad Maaz, Sahal Shaji Mullappilly, Abdelrahman Shaker, Salman Khan, Hisham Cholakkal, Rao M. Anwer, Eric Xing, Ming-Hsuan Yang, Fahad S. Khan

In this work, we present Grounding LMM (GLaMM), the first model that can generate natural language responses seamlessly intertwined with corresponding object segmentation masks.

Conversational Question Answering, Image Captioning +5


Learnable Weight Initialization for Volumetric Medical Image Segmentation

1 code implementation • 15 Jun 2023 • Shahina Kunhimon, Abdelrahman Shaker, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan

Hybrid volumetric medical image segmentation models, combining the advantages of local convolution and global attention, have recently received considerable attention.

Image Segmentation, Organ Segmentation +3


XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models

1 code implementation • 13 Jun 2023 • Omkar Thawakar, Abdelrahman Shaker, Sahal Shaji Mullappilly, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Jorma Laaksonen, Fahad Shahbaz Khan

The latest breakthroughs in large vision-language models, such as Bard and GPT-4, have showcased extraordinary abilities in performing a wide range of tasks.

Language Modelling, Large Language Model


SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications

2 code implementations • ICCV 2023 • Abdelrahman Shaker, Muhammad Maaz, Hanoona Rasheed, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan

Using our proposed efficient additive attention, we build a series of models called "SwiftFormer" which achieves state-of-the-art performance in terms of both accuracy and mobile inference speed.


UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation

2 code implementations • 8 Dec 2022 • Abdelrahman Shaker, Muhammad Maaz, Hanoona Rasheed, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan

Owing to the success of transformer models, recent works study their applicability in 3D medical segmentation tasks.

Image Segmentation, Medical Image Segmentation +2


EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications

7 code implementations • 21 Jun 2022 • Muhammad Maaz, Abdelrahman Shaker, Hisham Cholakkal, Salman Khan, Syed Waqas Zamir, Rao Muhammad Anwer, Fahad Shahbaz Khan

Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K, outperforming MobileViT with an absolute gain of 2.2% with 28% reduction in FLOPs.

Ranked #29 on Semantic Segmentation on PASCAL VOC 2012 test

Image Classification, Object Detection +1


INSTA-YOLO: Real-Time Instance Segmentation

no code implementations • 12 Feb 2021 • Eslam Mohamed, Abdelrahman Shaker, Ahmad El-Sallab, Mayada Hadhoud

We compare our results to the state-of-the-art models for instance segmentation.

Real-time Instance Segmentation, Segmentation +1
