no code implementations • 24 Apr 2024 • Kaining Ying, Fanqing Meng, Jin Wang, Zhiqian Li, Han Lin, Yue Yang, Hao Zhang, Wenbo Zhang, Yuqi Lin, Shuo Liu, Jiayi Lei, Quanfeng Lu, Runjian Chen, Peng Xu, Renrui Zhang, Haozhe Zhang, Peng Gao, Yali Wang, Yu Qiao, Ping Luo, Kaipeng Zhang, Wenqi Shao
Large Vision-Language Models (LVLMs) have made significant strides in general-purpose multimodal applications such as visual dialogue and embodied navigation.
1 code implementation • 19 Apr 2024 • Zeyu Ling, Bo Han, Yongkang Wong, Han Lin, Mohan Kankanhalli, Weidong Geng
Conditional human motion synthesis (HMS) aims to generate human motion sequences that conform to specific conditions.
Ranked #2 on Motion Synthesis on HumanML3D
no code implementations • 15 Apr 2024 • Han Lin, Jaemin Cho, Abhay Zala, Mohit Bansal
Ctrl-Adapter provides diverse capabilities including image control, video control, video control with sparse frames, multi-condition control, compatibility with different backbones, adaptation to unseen control conditions, and video editing.
no code implementations • 18 Mar 2024 • Abhay Zala, Jaemin Cho, Han Lin, Jaehong Yoon, Mohit Bansal
Instead of directly employing LLMs as agents, can we use LLMs' reasoning capabilities to adaptively create training environments to help smaller embodied RL agents learn useful skills that they are weak at?
no code implementations • 18 Oct 2023 • Abhay Zala, Han Lin, Jaemin Cho, Mohit Bansal
In the first stage, we use LLMs to generate and iteratively refine 'diagram plans' (in a planner-auditor feedback loop) which describe all the entities (objects and text labels), their relationships (arrows or lines), and their bounding box layouts.
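The planner-auditor feedback loop described above can be sketched as a simple generate-check-revise cycle. The function names below (`planner`, `auditor`, `refine`) are hypothetical stand-ins for LLM calls, not the paper's actual API:

```python
def planner(prompt, feedback=None):
    # Stub standing in for an LLM call that drafts or revises a
    # 'diagram plan': entities, relationships, and bounding boxes.
    plan = {
        "entities": ["box", "label"],
        "relations": [("box", "label", "arrow")],
        "layouts": {"box": (0, 0, 50, 50), "label": (60, 0, 100, 20)},
    }
    if feedback:
        # Pretend the revision addresses the auditor's complaint.
        plan["entities"].append("legend")
    return plan

def auditor(plan):
    # Stub standing in for an LLM call that checks the plan.
    # Returns a feedback string, or None when the plan passes.
    return None if "legend" in plan["entities"] else "missing legend"

def refine(prompt, max_rounds=3):
    # Iteratively refine the plan until the auditor is satisfied
    # or the round budget is exhausted.
    feedback = None
    for _ in range(max_rounds):
        plan = planner(prompt, feedback)
        feedback = auditor(plan)
        if feedback is None:
            return plan
    return plan

final = refine("draw a labeled box")
```

In the real system both roles are played by LLM prompts; the loop structure, not the stub logic, is the point.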
no code implementations • 26 Sep 2023 • Han Lin, Abhay Zala, Jaemin Cho, Mohit Bansal
Our experiments demonstrate that the VideoDirectorGPT framework substantially improves layout and movement control in both single- and multi-scene video generation, and can generate multi-scene videos with visual consistency across scenes, while achieving performance competitive with SOTA methods in open-domain single-scene T2V generation.
1 code implementation • CVPR 2023 • Han Lin, Guangxing Han, Jiawei Ma, Shiyuan Huang, Xudong Lin, Shih-Fu Chang
Vision Transformers (ViTs) have achieved impressive performance on many data-abundant computer vision tasks by capturing long-range dependencies among local features.
1 code implementation • 2 Feb 2023 • Krzysztof Choromanski, Arijit Sehanobish, Han Lin, Yunfan Zhao, Eli Berger, Tetiana Parshakova, Alvin Pan, David Watkins, Tianyi Zhang, Valerii Likhosherstov, Somnath Basu Roy Chowdhury, Avinava Dubey, Deepali Jain, Tamas Sarlos, Snigdha Chaturvedi, Adrian Weller
We present two new classes of algorithms for efficient field integration on graphs encoding point clouds.
no code implementations • 11 Jan 2023 • Fatemeh Haghighi, Soumitra Ghosh, Hai Ngu, Sarah Chu, Han Lin, Mohsen Hejrati, Baris Bingol, Somaye Hashemifar
To this end, we propose an end-to-end deep learning framework based on self-supervised learning for the segmentation and quantification of dopaminergic neurons in PD animal models.
no code implementations • 19 Sep 2022 • Jingxi Xu, Han Lin, Shuran Song, Matei Ciocarlie
In this work, we propose TANDEM3D, a method that applies a co-training framework for exploration and decision making to 3D object recognition with tactile signals.
1 code implementation • ICLR 2022 • Krzysztof Choromanski, Haoxian Chen, Han Lin, Yuanzhe Ma, Arijit Sehanobish, Deepali Jain, Michael S Ryoo, Jake Varley, Andy Zeng, Valerii Likhosherstov, Dmitry Kalashnikov, Vikas Sindhwani, Adrian Weller
We propose a new class of random feature methods for linearizing softmax and Gaussian kernels, called hybrid random features (HRFs), that automatically adapt the quality of kernel estimation to provide the most accurate approximation in the defined regions of interest.
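For context, the baseline that HRFs improve on is the classical random-feature estimator of a kernel: replace the kernel with an inner product of randomized feature maps. A minimal sketch for the Gaussian kernel using trigonometric random features (standard random Fourier features, not the paper's hybrid mechanism):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 4, 20000  # input dimension, number of random features

W = rng.standard_normal((m, d))        # frequencies drawn from N(0, I)
b = rng.uniform(0, 2 * np.pi, size=m)  # random phases

def phi(x):
    # Trigonometric feature map: E[phi(x) . phi(y)] equals the
    # Gaussian kernel exp(-||x - y||^2 / 2).
    return np.sqrt(2.0 / m) * np.cos(W @ x + b)

x = rng.standard_normal(d) * 0.5
y = rng.standard_normal(d) * 0.5
exact = np.exp(-np.linalg.norm(x - y) ** 2 / 2)
approx = phi(x) @ phi(y)
```

The estimator's variance shrinks as O(1/m); HRFs instead mix feature families so the variance is small precisely where accurate estimation matters.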
1 code implementation • 16 Jul 2021 • Krzysztof Choromanski, Han Lin, Haoxian Chen, Tianyi Zhang, Arijit Sehanobish, Valerii Likhosherstov, Jack Parker-Holder, Tamas Sarlos, Adrian Weller, Thomas Weingarten
In this paper we provide, to the best of our knowledge, the first comprehensive approach for incorporating various masking mechanisms into Transformer architectures in a scalable way.
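As a point of reference, this is the standard (quadratic-cost) masked softmax attention that such scalable methods generalize; the paper's contribution is incorporating masks without materializing the full attention matrix, which this sketch does not show:

```python
import numpy as np

def masked_attention(Q, K, V, mask):
    # Softmax attention with a boolean mask: disallowed positions are
    # set to -inf before the softmax, so they receive zero weight.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

L, d = 4, 3
rng = np.random.default_rng(0)
Q = rng.standard_normal((L, d))
K = rng.standard_normal((L, d))
V = rng.standard_normal((L, d))

# One common masking mechanism: a causal (lower-triangular) mask.
causal = np.tril(np.ones((L, L), dtype=bool))
out = masked_attention(Q, K, V, causal)
```

Under the causal mask the first token attends only to itself, so its output equals its own value vector.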
no code implementations • 21 Jan 2021 • Xiangyun Zeng, XiaoFeng Wang, Ali Esamdin, Craig Pellegrino, WeiKang Zheng, Jujia Zhang, Jun Mo, Wenxiong Li, D. Andrew Howell, Alexei V. Filippenko, Han Lin, Thomas G. Brink, Edward A. Baron, Jamison Burke, James M. DerKacy, Curtis McCully, Daichi Hiramatsu, Griffin Hosseinzadeh, Benjamin T. Jeffers, Timothy W. Ross, Benjamin E. Stahl, Samantha Stegman, Stefano Valenti, Lifan Wang, Danfeng Xiang, Jicheng Zhang, Tianmeng Zhang
We present extensive, well-sampled optical and ultraviolet photometry and optical spectra of the Type Ia supernova (SN Ia) 2017hpa.
High Energy Astrophysical Phenomena Solar and Stellar Astrophysics
no code implementations • 21 Dec 2020 • Ji-Cheng Zhang, Xiao-Feng Wang, Jun Mo, Gao-Bo Xi, Jie Lin, Xiao-Jun Jiang, Xiao-Ming Zhang, Wen-Xiong Li, Sheng-Yu Yan, Zhi-Hao Chen, Lei Hu, Xue Li, Wei-Li Lin, Han Lin, Cheng Miao, Li-Ming Rui, Han-Na Sai, Dan-Feng Xiang, Xing-Han Zhang
The TMTS system can have a FoV of about 9 deg² when monitoring the sky with two bands (i.e., SDSS g and r filters) at the same time, and a maximum FoV of ~18 deg² when the four telescopes monitor different sky areas in monochromatic filter mode.
Instrumentation and Methods for Astrophysics
no code implementations • NeurIPS 2020 • Han Lin, Haoxian Chen, Tianyi Zhang, Clement Laroche, Krzysztof Choromanski
Orthogonal Monte Carlo (OMC) is a very effective sampling algorithm imposing structural geometric conditions (orthogonality) on samples for variance reduction.
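A common way to impose this orthogonality is to draw an iid Gaussian matrix, orthogonalize it (e.g., via QR decomposition), and rescale each row by an independent chi-distributed norm so each individual sample still looks Gaussian. A minimal sketch of that construction (illustrative, not the paper's exact algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # sample dimension; one block of d mutually orthogonal samples

# Draw a Gaussian matrix and orthogonalize its rows via QR.
G = rng.standard_normal((d, d))
Q, _ = np.linalg.qr(G)  # square Q is orthogonal, so rows are orthonormal

# Rescale each orthonormal row by the norm of an independent Gaussian
# vector, so every sample keeps the correct marginal distribution.
norms = np.linalg.norm(rng.standard_normal((d, d)), axis=1)
samples = Q * norms[:, None]

# Unlike iid draws, the samples are exactly orthogonal (up to
# floating point), which is what drives the variance reduction.
gram = samples @ samples.T
off_diag = gram - np.diag(np.diag(gram))
```

Each sample is marginally Gaussian, but pairwise orthogonality removes the redundancy between samples that inflates the variance of plain Monte Carlo estimators.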