1 code implementation • NAACL 2021 • Ke-Jyun Wang, Yun-Hsuan Liu, Hung-Ting Su, Jen-Wei Wang, Yu-Siang Wang, Winston H. Hsu, Wen-Chin Chen
To effectively deploy robots in working environments and assist humans, it is essential to develop and evaluate how visual grounding (VG) affects machine performance on occluded objects.
1 code implementation • 19 Jan 2021 • Chen-Hsi Chang, Hung-Ting Su, Jui-heng Hsu, Yu-Siang Wang, Yu-Cheng Chang, Zhe Yu Liu, Ya-Liang Chang, Wen-Feng Cheng, Ke-Jyun Wang, Winston H. Hsu
Experimental results demonstrate that modern models, including BERT contextual embeddings, movie tag prediction systems, and relational networks, achieve at most 37% of human performance (23.97 vs. 64.87) in terms of F1 score.
1 code implementation • 5 Jan 2021 • Hung-Ting Su, Chen-Hsi Chang, Po-Wei Shen, Yu-Siang Wang, Ya-Liang Chang, Yu-Cheng Chang, Pu-Jen Cheng, Winston H. Hsu
Furthermore, using only our generated QA pairs on the Video QA task, we surpass some supervised baselines.
1 code implementation • 11 Mar 2020 • Yu-Sheng Lin, Zhe-Yu Liu, Yu-An Chen, Yu-Siang Wang, Ya-Liang Chang, Winston H. Hsu
We study explainable AI (XAI) on the face recognition task, particularly face verification.
no code implementations • 8 Mar 2020 • Yu-Siang Wang, Yen-Ling Kuo, Boris Katz
We evaluate our look-ahead module on three datasets of varying difficulties: IM2LATEX-100k OCR image to LaTeX, WMT16 multimodal machine translation, and WMT14 machine translation.
no code implementations • 5 Jul 2019 • Yu-Siang Wang, Hung-Ting Su, Chen-Hsi Chang, Zhe-Yu Liu, Winston H. Hsu
We introduce a novel task, Video Question Generation (Video QG).
2 code implementations • NAACL 2018 • Yu-Siang Wang, Chenxi Liu, Xiaohui Zeng, Alan Yuille
The scene graphs generated by our learned neural dependency parser achieve an F-score similarity of 49.67% to ground-truth graphs on our evaluation set, surpassing the best previous approaches by 5%.
no code implementations • CVPR 2019 • Xiaohui Zeng, Chenxi Liu, Yu-Siang Wang, Weichao Qiu, Lingxi Xie, Yu-Wing Tai, Chi Keung Tang, Alan L. Yuille
Though image-space adversaries can be interpreted as per-pixel albedo changes, we verify that they cannot be well explained along these physically meaningful dimensions, which often have a non-local effect.
1 code implementation • EMNLP 2017 • Peng-Hsuan Li, Ruo-Ping Dong, Yu-Siang Wang, Ju-chieh Chou, Wei-Yun Ma
Motivated by the observation that named entities are highly related to linguistic constituents, we propose a constituent-based BRNN-CNN for named entity recognition.