no code implementations • 6 Feb 2024 • Liang-Hsuan Tseng, En-Pei Hu, Cheng-Han Chiang, Yuan Tseng, Hung-Yi Lee, Lin-shan Lee, Shao-Hua Sun
A word or phoneme in a speech signal is represented by a segment of variable length with unknown boundaries, and this segmental structure makes learning the mapping between speech and text challenging, especially without paired data.
Automatic Speech Recognition (ASR) +2
no code implementations • 27 Nov 2023 • Yu-an Lin, Chen-Tao Lee, Guan-Ting Liu, Pu-Jen Cheng, Shao-Hua Sun
On the other hand, representing RL policies using state machines (Inala et al., 2020) can inductively generalize to long-horizon tasks; however, it struggles to scale up to acquire diverse and complex behaviors.
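The appeal of state-machine policies is that the same small machine can run for arbitrarily many steps. A minimal sketch (the modes, transition conditions, and task below are hypothetical illustrations, not the method from Inala et al., 2020):

```python
# A hypothetical sketch of a policy represented as a state machine:
# discrete modes with transition conditions. The same machine solves
# short and long horizons (here: walk to a goal on a 1-D line).

def state_machine_policy(pos, goal, mode):
    """Return (action, next_mode). Modes: 'move' -> 'done'."""
    if mode == "move":
        if pos == goal:
            return 0, "done"          # transition: goal reached
        return (1 if goal > pos else -1), "move"
    return 0, "done"                  # absorbing terminal mode

def rollout(start, goal, max_steps=1000):
    pos, mode = start, "move"
    for _ in range(max_steps):
        action, mode = state_machine_policy(pos, goal, mode)
        if mode == "done":
            return pos
        pos += action
    return pos
```

Because behavior is driven by transition conditions rather than a fixed-depth computation, horizon length never enters the policy's parameters, which is what enables the inductive generalization the snippet describes.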
1 code implementation • 23 Oct 2023 • Nicholas Collin Suwono, Justin Chih-Yao Chen, Tun Min Hung, Ting-Hao Kenneth Huang, I-Bin Liao, Yung-Hui Li, Lun-Wei Ku, Shao-Hua Sun
This work introduces a novel task, location-aware visual question generation (LocaVQG), which aims to generate engaging questions from data relevant to a particular geographical location.
no code implementations • 16 Oct 2023 • Jesse Zhang, Jiahui Zhang, Karl Pertsch, Ziyi Liu, Xiang Ren, Minsuk Chang, Shao-Hua Sun, Joseph J. Lim
Instead, our approach BOSS (BOotStrapping your own Skills) learns to accomplish new tasks by performing "skill bootstrapping," where an agent with a set of primitive skills interacts with the environment to practice new skills without receiving reward feedback for tasks outside of the initial skill set.
no code implementations • 12 Oct 2023 • Po-Chen Ko, Jiayuan Mao, Yilun Du, Shao-Hua Sun, Joshua B. Tenenbaum
In this work, we present an approach to construct a video-based robot policy capable of reliably executing diverse tasks across different robots and environments from few video demonstrations without using any action annotations.
no code implementations • 9 Mar 2023 • Linghan Zhong, Ryan Lindeborg, Jesse Zhang, Joseph J. Lim, Shao-Hua Sun
Then, we train a high-level module to comprehend the task specification (e.g., input/output pairs or demonstrations) from long programs and produce a sequence of task embeddings, which are then decoded by the program decoder and composed to yield the synthesized program.
no code implementations • 26 Feb 2023 • Hsiang-Chun Wang, Shang-Fu Chen, Ming-Hao Hsu, Chun-Mao Lai, Shao-Hua Sun
Most existing imitation learning methods that do not require interacting with environments either model the expert distribution as the conditional probability p(a|s) (e.g., behavioral cloning, BC) or the joint probability p(s, a).
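The conditional case p(a|s) reduces to supervised regression from expert states to expert actions. A minimal sketch of behavioral cloning under that framing, with synthetic expert data (the linear expert and noiseless setup are assumptions for illustration):

```python
import numpy as np

# Behavioral cloning (BC) as supervised learning of p(a|s):
# regress expert actions on expert states, no environment interaction.
rng = np.random.default_rng(0)

# Synthetic expert: actions are a fixed linear function of states.
W_true = np.array([[2.0, -1.0], [0.5, 1.5]])
states = rng.normal(size=(100, 2))     # expert states s
actions = states @ W_true.T            # expert actions a = W s

# BC objective here is least squares: argmin_W ||S W^T - A||^2.
W_bc, *_ = np.linalg.lstsq(states, actions, rcond=None)
W_bc = W_bc.T
```

On this noiseless data the cloned policy recovers the expert mapping exactly; the well-known weakness of modeling only p(a|s) is compounding error once the learned policy drifts to states the expert never visited.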
no code implementations • 1 Feb 2023 • Grace Zhang, Ayush Jain, Injune Hwang, Shao-Hua Sun, Joseph J. Lim
The ability to leverage shared behaviors between tasks is critical for sample-efficient multi-task reinforcement learning (MTRL).
no code implementations • 30 Jan 2023 • Guan-Ting Liu, En-Pei Hu, Pu-Jen Cheng, Hung-Yi Lee, Shao-Hua Sun
Aiming to produce reinforcement learning (RL) policies that are human-interpretable and can generalize better to novel scenarios, Trivedi et al. (2021) present a method (LEAPS) that first learns a program embedding space to continuously parameterize diverse programs from a pre-generated program dataset, and then searches for a task-solving program in the learned program embedding space when given a task.
no code implementations • ICLR 2022 • Taewook Nam, Shao-Hua Sun, Karl Pertsch, Sung Ju Hwang, Joseph J Lim
While deep reinforcement learning methods have shown impressive results in robot learning, their sample inefficiency makes the learning of complex, long-horizon behaviors with real robot systems infeasible.
no code implementations • NeurIPS 2021 • Youngwoon Lee, Andrew Szot, Shao-Hua Sun, Joseph J. Lim
Task progress is intuitive and readily available task information that can guide an agent closer to the desired goal.
1 code implementation • NeurIPS 2021 • Dweep Trivedi, Jesse Zhang, Shao-Hua Sun, Joseph J. Lim
To alleviate the difficulty of learning to compose programs to induce the desired agent behavior from scratch, we propose to first learn a program embedding space that continuously parameterizes diverse behaviors in an unsupervised manner and then search over the learned program embedding space to yield a program that maximizes the return for a given task.
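Once the embedding space is learned, the search step can be any black-box optimizer over latent vectors. A minimal sketch using the cross-entropy method (CEM); the decoder and return function are hypothetical stand-ins (a quadratic score replaces decoding and executing a real program):

```python
import numpy as np

# Sketch: search a learned latent space for a task-solving program.
# task_return is a stand-in for "decode z into a program, execute it,
# measure the return"; here the best latent is simply `target`.
rng = np.random.default_rng(0)
target = np.array([1.0, -2.0, 0.5])

def task_return(z):
    return -np.sum((z - target) ** 2)

# Cross-entropy method: sample latents, keep the elites, refit.
mean, std = np.zeros(3), np.ones(3)
for _ in range(50):
    samples = mean + std * rng.normal(size=(64, 3))
    elites = samples[np.argsort([task_return(z) for z in samples])[-8:]]
    mean = elites.mean(axis=0)
    std = elites.std(axis=0) + 1e-3    # floor keeps exploration alive
```

The point of searching in the latent space rather than in program text is that nearby latents decode to similar behaviors, giving the optimizer a smooth landscape.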
no code implementations • 1 Jan 2021 • Andrew Szot, Youngwoon Lee, Shao-Hua Sun, Joseph J Lim
Humans can effectively learn to estimate how close they are to completing a desired task simply by watching others fulfill the task.
no code implementations • ICLR 2020 • Shao-Hua Sun, Te-Lin Wu, Joseph J. Lim
Developing agents that can learn to follow natural language instructions has been an emerging research direction.
2 code implementations • NeurIPS 2019 • Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim
Model-agnostic meta-learners aim to acquire meta-learned parameters from similar tasks to adapt to novel tasks from the same distribution with few gradient updates.
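The "few gradient updates" adaptation can be made concrete with a scalar toy problem. A minimal sketch of the MAML-style inner/outer loop, where the task family (1-D quadratics with optima near 3.0) and all hyperparameters are assumptions for illustration:

```python
import numpy as np

# MAML sketch: meta-learn an initialization theta such that one inner
# gradient step adapts it to each task. Tasks are 1-D quadratics
# f_t(x) = (x - c_t)^2 with task-specific optima c_t.
rng = np.random.default_rng(0)
task_optima = rng.normal(loc=3.0, scale=0.1, size=20)

theta = 0.0
inner_lr, outer_lr = 0.1, 0.05
for _ in range(500):
    grads = []
    for c in task_optima:
        # Inner loop: one adaptation step on task t, from theta.
        adapted = theta - inner_lr * 2.0 * (theta - c)
        # Outer loop: gradient of the post-adaptation loss w.r.t. theta,
        # differentiating through the inner update (factor 1 - 2*inner_lr).
        grads.append(2.0 * (adapted - c) * (1.0 - 2.0 * inner_lr))
    theta -= outer_lr * float(np.mean(grads))
```

The meta-learned initialization settles near the mean of the task optima, which is exactly the single shared initialization the next snippet identifies as a limitation when the task distribution is multimodal.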
no code implementations • 18 Dec 2018 • Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim
One important limitation of such frameworks is that they seek a common initialization shared across the entire task distribution, substantially limiting the diversity of the task distributions that they are able to learn from.
1 code implementation • Proceedings of the 15th European Conference on Computer Vision, 2018 • Shao-Hua Sun, Minyoung Huh, Yuan-Hong Liao, Ning Zhang, Joseph J. Lim
We address the task of multi-view novel view synthesis, where we are interested in synthesizing a target image with an arbitrary camera pose from given source images.
Ranked #1 on Novel View Synthesis on Synthia
no code implementations • 27 Sep 2018 • Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim
In this paper, we augment MAML with the capability to identify tasks sampled from a multimodal task distribution and adapt quickly through gradient updates.
1 code implementation • ICML 2018 • Shao-Hua Sun, Hyeonwoo Noh, Sriram Somasundaram, Joseph Lim
To empower machines with this ability, we propose a neural program synthesizer that is able to explicitly synthesize underlying programs from behaviorally diverse and visually complicated demonstration videos.