no code implementations • 28 Feb 2023 • Wonwoong Cho, Hareesh Ravi, Midhun Harikumar, Vinh Khuc, Krishna Kumar Singh, Jingwan Lu, David I. Inouye, Ajinkya Kale
We rely on the inductive bias of the progressive denoising process of diffusion models to encode pose/layout information in the spatial structure mask and semantic/style information in the style code.
no code implementations • 23 Feb 2023 • Pranav Aggarwal, Hareesh Ravi, Naveen Marri, Sachin Kelkar, Fengbin Chen, Vinh Khuc, Midhun Harikumar, Ritiz Tambi, Sudharshan Reddy Kakumanu, Purvak Lapsiya, Alvin Ghouas, Sarah Saber, Malavika Ramprasad, Baldo Faieta, Ajinkya Kale
We observe that Diffusion Prior can be used in a memory and compute efficient way to constrain the generation to a specific domain without altering the larger Diffusion Decoder.
no code implementations • 15 Feb 2023 • Hareesh Ravi, Sachin Kelkar, Midhun Harikumar, Ajinkya Kale
We combine this with structure preserving edits on the image decoder using existing approaches such as reverse DDIM to perform text guided image editing.
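The "reverse DDIM" edit mentioned above relies on deterministically re-noising an image latent so it can later be regenerated (and edited) by the ordinary DDIM sampler. A minimal sketch of one inversion step, assuming a hypothetical noise predictor `predict_noise` standing in for a trained diffusion U-Net:

```python
import numpy as np

def predict_noise(x, t):
    # Hypothetical epsilon-prediction network; a fixed toy function here.
    return 0.1 * x

def ddim_inversion_step(x_t, alpha_t, alpha_next, t):
    """One deterministic reverse-DDIM step from timestep t to a noisier one."""
    eps = predict_noise(x_t, t)
    # Estimate the clean latent x_0 implied by the current noise prediction.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_t) * eps) / np.sqrt(alpha_t)
    # Re-noise deterministically toward the next (noisier) timestep.
    return np.sqrt(alpha_next) * x0_pred + np.sqrt(1.0 - alpha_next) * eps

x = np.ones((4, 4))
x_next = ddim_inversion_step(x, alpha_t=0.9, alpha_next=0.8, t=10)
print(x_next.shape)  # (4, 4)
```

Because no noise is sampled, running the same steps in reverse recovers the original latent, which is what preserves the image's structure during text-guided edits.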
1 code implementation • CVPR 2023 • Qiucheng Wu, Yujian Liu, Handong Zhao, Ajinkya Kale, Trung Bui, Tong Yu, Zhe Lin, Yang Zhang, Shiyu Chang
Based on this finding, we further propose a simple, lightweight image editing algorithm where the mixing weights of the two text embeddings are optimized for style matching and content preservation.
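The idea of optimizing mixing weights between two text embeddings can be sketched in a toy form. Everything below is a stand-in: the paper optimizes weights inside a diffusion model, whereas this sketch uses a single scalar weight, random vectors for the embeddings, and cosine distances as hypothetical style-matching and content-preservation losses.

```python
import numpy as np

rng = np.random.default_rng(0)
content_emb = rng.normal(size=64)   # hypothetical content text embedding
style_emb = rng.normal(size=64)     # hypothetical style text embedding
target_style = style_emb + 0.1 * rng.normal(size=64)  # toy style target

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def loss(w):
    mixed = w * style_emb + (1.0 - w) * content_emb
    style_match = 1.0 - cosine(mixed, target_style)  # pull toward the style
    content_keep = 1.0 - cosine(mixed, content_emb)  # stay near the content
    return style_match + 0.5 * content_keep

# Simple grid search over the scalar mixing weight.
weights = np.linspace(0.0, 1.0, 101)
best_w = min(weights, key=loss)
print(best_w)
```

The two loss terms make the trade-off explicit: a larger weight on `content_keep` biases the edit toward preserving the source image's content.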
1 code implementation • Findings (NAACL) 2022 • Jaemin Cho, Seunghyun Yoon, Ajinkya Kale, Franck Dernoncourt, Trung Bui, Mohit Bansal
Toward more descriptive and distinctive caption generation, we propose using CLIP, a multimodal encoder trained on a huge corpus of image-text pairs from the web, to calculate multimodal similarity and use it as a reward function.
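A CLIP-style similarity reward reduces to cosine similarity between image and caption embeddings. The sketch below illustrates this with hypothetical precomputed embeddings in place of real CLIP encoders:

```python
import numpy as np

def clip_reward(image_emb, caption_emb):
    """Cosine similarity between L2-normalized embeddings, in [-1, 1]."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    caption_emb = caption_emb / np.linalg.norm(caption_emb)
    return float(image_emb @ caption_emb)

rng = np.random.default_rng(42)
image_emb = rng.normal(size=512)
# A caption embedding close to the image, and a random distractor.
good_caption = image_emb + 0.1 * rng.normal(size=512)
bad_caption = rng.normal(size=512)

# The reward prefers the caption aligned with the image.
print(clip_reward(image_emb, good_caption) > clip_reward(image_emb, bad_caption))
```

Used as a reward in caption-generation training, this score favors captions that are distinctive for the specific image rather than generically plausible.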
Ranked #26 on Image Captioning on COCO Captions
no code implementations • 10 Mar 2022 • Dan Ruta, Andrew Gilbert, Pranav Aggarwal, Naveen Marri, Ajinkya Kale, Jo Briggs, Chris Speed, Hailin Jin, Baldo Faieta, Alex Filipkowski, Zhe Lin, John Collomosse
We present StyleBabel, a unique open access dataset of natural language captions and free-form tags describing the artistic style of over 135K digital artworks, collected via a novel participatory method from experts studying at specialist art and design schools.
no code implementations • CVPR 2022 • Haoyu Ma, Handong Zhao, Zhe Lin, Ajinkya Kale, Zhangyang Wang, Tong Yu, Jiuxiang Gu, Sunav Choudhary, Xiaohui Xie
recommendation, and marketing services.
2 code implementations • 15 Sep 2021 • Pranav Aggarwal, Ritiz Tambi, Ajinkya Kale
There has been a recent spike in interest in multi-modal Language and Vision problems.
no code implementations • CVPR 2021 • Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta
We first train our model on COCO and evaluate the learned visual representations on various downstream tasks including image classification, object detection, and instance segmentation.
1 code implementation • 24 Nov 2020 • Pranav Aggarwal, Ajinkya Kale
There has been a recent spike in interest in multi-modal Language and Vision problems.
no code implementations • 4 Oct 2020 • Aashish Kumar Misraa, Ajinkya Kale, Pranav Aggarwal, Ali Aminian
Most real-world applications of image retrieval, such as Adobe Stock, which is a marketplace for stock photography and illustrations, need a way for users to find images that are both visually (i.e., aesthetically) and conceptually (i.e., containing the same salient objects) similar to a query image.
no code implementations • LREC 2020 • Ritiz Tambi, Ajinkya Kale, Tracy Holloway King
Language identification is a well-known task for natural language documents.
no code implementations • 25 Jul 2017 • Ajinkya Kale, Thrivikrama Taula, Sanjika Hewavitharana, Amit Srivastava
Query Segmentation is one of the critical components for understanding users' search intent in Information Retrieval tasks.
no code implementations • 10 Jun 2017 • Fan Yang, Ajinkya Kale, Yury Bubnov, Leon Stein, Qiaosong Wang, Hadi Kiapour, Robinson Piramuthu
We harness the availability of large image collection of eBay listings and state-of-the-art deep learning techniques to perform visual search at scale.