no code implementations • 7 Feb 2024 • Shuoyuan Wang, Jindong Wang, Guoqing Wang, Bob Zhang, Kaiyang Zhou, Hongxin Wei
Vision-language models (VLMs) have emerged as formidable tools, showing their strong capability in handling various open-vocabulary tasks in image recognition, text-driven visual content generation, and visual chatbots, to name a few.
1 code implementation • 28 Oct 2023 • Shuoyuan Wang, Jindong Wang, Huajun Xi, Bob Zhang, Lei Zhang, Hongxin Wei
However, the high computational cost of optimization-based TTA algorithms makes it intractable to run on resource-constrained edge devices.