AdsCVLR: Commercial Visual-Linguistic Representation Modeling in Sponsored Search

Sponsored search advertisements (ads) appear next to search results when consumers look for products and services on search engines. As the fundamental basis of search ads, relevance modeling has attracted increasing attention due to its significant research challenges and tremendous practical value. In this paper, we address the problem of multi-modal modeling in sponsored search, which models the relevance between a user query and commercial ads that carry multi-modal structured information. To solve this problem, we propose AdsCVLR (Ads Commercial Visual-Linguistic Representation), a transformer architecture trained with contrastive learning that naturally extends the transformer encoder with complementary multi-modal inputs and serves as a strong aggregator of image-text features. We also release a public advertising dataset containing 480K labeled query-ad pairs with structured information such as image, title, seller, and description. Empirically, we evaluate the AdsCVLR model on a large industry dataset, and the results of both offline and online tests show the superiority of our method.
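To make the described design concrete, below is a minimal sketch, assuming PyTorch and pre-extracted image region features, of a query encoder, a single-stream multi-modal ad encoder that aggregates image and text tokens, and an in-batch contrastive (InfoNCE) loss. All module names, dimensions, and the exact loss formulation are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextEncoder(nn.Module):
    """Encodes the user query (or any text field) into a fixed-size vector."""
    def __init__(self, vocab_size=30522, dim=256, n_layers=2, n_heads=4, max_len=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.pos = nn.Embedding(max_len, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, ids):                       # ids: (B, T)
        x = self.emb(ids) + self.pos(torch.arange(ids.size(1), device=ids.device))
        return F.normalize(self.encoder(x).mean(dim=1), dim=-1)


class AdEncoder(nn.Module):
    """Aggregates the ad's image regions and text fields in one transformer stream."""
    def __init__(self, vocab_size=30522, dim=256, n_layers=4, n_heads=4,
                 max_len=64, n_regions=16, image_feat_dim=2048):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.pos = nn.Embedding(max_len + n_regions, dim)
        self.img_proj = nn.Linear(image_feat_dim, dim)   # project region features
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, text_ids, image_feats):     # (B, T), (B, R, image_feat_dim)
        t = self.emb(text_ids)
        v = self.img_proj(image_feats)
        x = torch.cat([t, v], dim=1)              # single multi-modal token stream
        x = x + self.pos(torch.arange(x.size(1), device=x.device))
        return F.normalize(self.encoder(x).mean(dim=1), dim=-1)


def contrastive_loss(query_vecs, ad_vecs, temperature=0.07):
    """In-batch InfoNCE: the i-th query should match the i-th ad."""
    logits = query_vecs @ ad_vecs.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)


# Toy forward pass on random inputs.
queries = TextEncoder()(torch.randint(0, 30522, (8, 16)))
ads = AdEncoder()(torch.randint(0, 30522, (8, 32)), torch.randn(8, 16, 2048))
loss = contrastive_loss(queries, ads)
```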


Datasets


Introduced in the Paper:

CommercialAdsDataset
Task: Image-text matching
Dataset: CommercialAdsDataset
Model: AdsCVLR
Metric: ADD(S) AUC
Metric Value: 87.90
Global Rank: #3
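Below is a hedged illustration of how one query-ad record in CommercialAdsDataset might be laid out, and how the reported relevance AUC could be computed offline from model scores. The field names, file layout, and numbers are assumptions for illustration, not the official schema or results.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical record layout: one query paired with one ad's structured fields.
example_record = {
    "query": "wireless noise cancelling headphones",
    "ad": {
        "image_path": "images/12345.jpg",   # ad creative image (path is illustrative)
        "title": "XYZ Wireless Headphones - 40h Battery",
        "seller": "XYZ Official Store",
        "description": "Over-ear Bluetooth headphones with active noise cancelling.",
    },
    "label": 1,  # 1 = relevant to the query, 0 = irrelevant
}

# Offline evaluation: score each labeled query-ad pair with the model and
# measure AUC of the predicted relevance scores against the binary labels.
labels = [1, 0, 1, 1, 0]
scores = [0.92, 0.15, 0.78, 0.66, 0.41]   # illustrative model relevance scores
print(roc_auc_score(labels, scores))
```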

Methods


No methods listed for this paper.