1 code implementation • 26 Apr 2024 • Xuri Ge, Songpei Xu, Fuhai Chen, Jie Wang, Guoxin Wang, Shan An, Joemon M. Jose
In this paper, we propose a novel visual Semantic-Spatial Self-Highlighting Network (termed 3SHNet) for high-precision, high-efficiency and high-generalization image-sentence retrieval.
Ranked #1 on Cross-Modal Retrieval on MSCOCO
2 code implementations • 2 Apr 2024 • Junchen Fu, Xuri Ge, Xin Xin, Alexandros Karatzoglou, Ioannis Arapakis, Jie Wang, Joemon M. Jose
This is also a notable improvement over Adapter and LoRA, which require 37-39 GB of GPU memory and 350-380 seconds per epoch for training.
no code implementations • 25 Mar 2024 • Jie Wang, Alexandros Karatzoglou, Ioannis Arapakis, Joemon M. Jose
The LE is learned from a subset of user-item interaction data, thus reducing the need for large training datasets, and can synthesise user feedback for offline data by: (i) acting as a state model that produces high-quality states that enrich the user representation, and (ii) functioning as a reward model that accurately captures nuanced user preferences over actions.
no code implementations • 17 Oct 2022 • Xuri Ge, Fuhai Chen, Songpei Xu, Fuxiang Tao, Joemon M. Jose
To correlate the context of objects with the textual context, we further refine the visual semantic representation via cross-level object-sentence and word-image interactive attention.
no code implementations • 13 Jun 2022 • Jie Wang, Fajie Yuan, Mingyue Cheng, Joemon M. Jose, Chenyun Yu, Beibei Kong, Xiangnan He, Zhijin Wang, Bo Hu, Zang Li
That is, the users and the interacted items are represented by their unique IDs, which are generally not shareable across different systems or platforms.
no code implementations • 4 Apr 2022 • Xuri Ge, Joemon M. Jose, Songpei Xu, Xiao Liu, Hu Han
While region-level feature learning from local face patches via a graph neural network can encode correlations across different AUs, pixel-wise and channel-wise feature learning via a graph attention network can enhance the discriminative ability of AU features derived from global face features.
no code implementations • 3 Mar 2022 • Xuri Ge, Joemon M. Jose, Pengcheng Wang, Arunachalam Iyer, Xiao Liu, Hu Han
In this paper, we propose a novel Adaptive Local-Global Relational Network (ALGRNet) for facial AU detection and use it to classify facial paralysis severity.
no code implementations • 5 Nov 2021 • Xin Xin, Alexandros Karatzoglou, Ioannis Arapakis, Joemon M. Jose
However, the direct use of RL algorithms in the RS setting is impractical due to challenges like off-policy training, huge action spaces and lack of sufficient reward signals.
no code implementations • 5 Aug 2021 • Xuri Ge, Fuhai Chen, Joemon M. Jose, Zhilong Ji, Zhongqin Wu, Xiao Liu
In this work, we propose to address the above issue from two aspects: (i) constructing intrinsic structure (along with relations) among the fragments of respective modalities, e.g., "dog $\to$ play $\to$ ball" in semantic structure for an image, and (ii) seeking explicit inter-modal structural and semantic correspondence between the visual and textual modalities.
no code implementations • 10 Jun 2020 • Xin Xin, Alexandros Karatzoglou, Ioannis Arapakis, Joemon M. Jose
A major component of RL approaches is to train the agent through interactions with the environment.
1 code implementation • 9 Apr 2020 • Xin Xin, Alexandros Karatzoglou, Ioannis Arapakis, Joemon M. Jose
Graph Convolution Networks (GCN) are widely used in learning graph representations due to their effectiveness and efficiency.
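The GCN layer mentioned here follows a standard, well-known propagation rule: normalise the adjacency matrix (with self-loops), aggregate neighbour features, and apply a learned linear map plus a nonlinearity. A minimal NumPy sketch of that generic rule — not this paper's specific model — might look like:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One generic GCN layer: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])           # add self-loops
    d = A_hat.sum(axis=1)                    # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))   # symmetric normalisation
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
    return np.maximum(A_norm @ H @ W, 0.0)   # ReLU activation

# toy graph: 3 nodes in a chain, 4-dim input features, 2-dim output
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.random.default_rng(0).random((3, 4))
W = np.random.default_rng(1).random((4, 2))
print(gcn_layer(A, H, W).shape)  # (3, 2)
```

The symmetric normalisation keeps aggregated feature magnitudes stable across nodes of different degree, which is one reason GCNs train efficiently.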
3 code implementations • 15 Aug 2018 • Fajie Yuan, Alexandros Karatzoglou, Ioannis Arapakis, Joemon M. Jose, Xiangnan He
Convolutional Neural Networks (CNNs) have been recently introduced in the domain of session-based next item recommendation.
no code implementations • ACL 2018 • Xin Xin, Fajie Yuan, Xiangnan He, Joemon M. Jose
Stochastic Gradient Descent (SGD) with negative sampling is the most prevalent approach to learn word representations.
no code implementations • 26 Oct 2017 • Long Chen, Fajie Yuan, Joemon M. Jose, Wei-Nan Zhang
Although the word-popularity based negative sampler has shown superb performance in the skip-gram model, the theoretical motivation behind oversampling popular (non-observed) words as negative samples is still not well understood.
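The word-popularity based sampler discussed here is, in standard word2vec practice, a unigram distribution raised to a power (0.75 by default), which oversamples frequent words relative to uniform sampling while damping the most frequent ones. A small sketch of that conventional sampler (illustrative, not the paper's analysis):

```python
import numpy as np

def popularity_sampler(counts, power=0.75, seed=0):
    """Sample negatives with probability proportional to count**power
    (0.75 is word2vec's conventional default)."""
    p = np.asarray(counts, dtype=float) ** power
    p /= p.sum()
    rng = np.random.default_rng(seed)
    return lambda k: rng.choice(len(p), size=k, p=p)

counts = [1000, 100, 10, 1]            # toy word frequencies
sample = popularity_sampler(counts)
negatives = sample(5)                  # word 0 drawn most often
```

With `power=1.0` this reduces to raw frequency sampling, and with `power=0.0` to uniform sampling; the 0.75 exponent sits between the two.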