no code implementations • 27 Nov 2023 • Yifei Chen, Dapeng Chen, Ruijin Liu, Sai Zhou, Wenyuan Xue, Wei Peng
With the aligned entities, we feed their text embeddings as queries to a transformer-based video adapter, which helps extract the semantics of the most important entities from a video into a vector.
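As a rough illustration of the query mechanism described in this snippet, the sketch below uses entity text embeddings as attention queries over per-frame video features and pools the results into one vector. It is a minimal single-head cross-attention in plain Python, not the authors' adapter: the function name `entity_query_pooling`, the absence of learned projections, and the mean pooling over entities are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def entity_query_pooling(entity_embeddings, frame_features):
    """Hypothetical helper: cross-attend entity queries over frame features.

    entity_embeddings: list of query vectors (one per aligned entity).
    frame_features:    list of key/value vectors (one per video frame).
    Returns a single vector: the mean of the per-entity attended features.
    Assumption: no learned query/key/value projections, single head.
    """
    dim = len(frame_features[0])
    scale = math.sqrt(dim)
    attended = []
    for query in entity_embeddings:
        # Scaled dot-product attention weights of this entity over all frames.
        weights = softmax([dot(query, frame) / scale for frame in frame_features])
        # Weighted sum of frame features for this entity.
        attended.append([
            sum(w * frame[i] for w, frame in zip(weights, frame_features))
            for i in range(dim)
        ])
    # Assumption: average the per-entity vectors to get one video vector.
    return [sum(vec[i] for vec in attended) / len(attended) for i in range(dim)]


# Toy inputs: two entity queries and three frames in a 2-dimensional space.
entities = [[1.0, 0.0], [0.0, 1.0]]
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
video_vector = entity_query_pooling(entities, frames)
```

A real adapter would typically add learned projections, multiple heads, and stacked layers; this sketch only shows how text-side queries can select which visual features contribute to the pooled vector.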
no code implementations • 15 Aug 2023 • Wenyuan Xue, Dapeng Chen, Baosheng Yu, Yifei Chen, Sai Zhou, Wei Peng
Visual chart recognition systems are gaining increasing attention due to the growing demand for automatically identifying table headers and values from chart images.