no code implementations • 19 Jan 2024 • Oliver Limoyo, Jimmy Li, Dmitriy Rivkin, Jonathan Kelly, Gregory Dudek
We leverage a visual language model (VLM) and an object detector to characterize the reference images via textual descriptions and then use a large language model (LLM) to retrieve relevant reference images based on a user's language query through text-based reasoning.
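As a rough sketch of this kind of retrieval pipeline (not the authors' implementation), the Python below uses hypothetical stand-ins caption_image, detect_objects, and rank_with_llm for the VLM, object detector, and LLM, and shows how textual descriptions can mediate retrieval of reference images from a language query:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the three models; any VLM, detector, and LLM
# could back these. The bodies are placeholders so the sketch actually runs.
def caption_image(path: str) -> str:
    return "A white plate resting on a wooden table."

def detect_objects(path: str) -> list[str]:
    return ["plate", "table"]

def rank_with_llm(query: str, descriptions: list[str]) -> int:
    # Placeholder for LLM text-based reasoning: pick the description
    # sharing the most words with the query.
    overlap = lambda d: len(set(query.lower().split()) & set(d.lower().split()))
    return max(range(len(descriptions)), key=lambda i: overlap(descriptions[i]))

@dataclass
class ImageRecord:
    path: str
    description: str  # VLM caption plus detected object labels

def index_references(paths: list[str]) -> list[ImageRecord]:
    """Characterize each reference image as text: caption + object list."""
    return [
        ImageRecord(p, f"{caption_image(p)} Objects: {', '.join(detect_objects(p))}.")
        for p in paths
    ]

def retrieve(query: str, records: list[ImageRecord]) -> str:
    """Return the reference image whose description best matches the query."""
    return records[rank_with_llm(query, [r.description for r in records])].path
```

Under these placeholders, retrieve("the plate on the table", index_references(["ref1.png"])) returns "ref1.png"; the point is only that the images are matched entirely through their textual descriptions.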
no code implementations • 4 Dec 2023 • Oliver Limoyo, Abhisek Konar, Trevor Ablett, Jonathan Kelly, Francois R. Hogan, Gregory Dudek
By doing so, the policy can generalize to object placement scenarios outside of the training environment without privileged information (e.g., placing a plate picked up from a table).
no code implementations • 2 Nov 2023 • Trevor Ablett, Oliver Limoyo, Adam Sigal, Affan Jilani, Jonathan Kelly, Kaleem Siddiqi, Francois Hogan, Gregory Dudek
An STS sensor can be switched between visual and tactile modes by leveraging a semi-transparent surface and controllable lighting, allowing for both pre-contact visual sensing and during-contact tactile sensing with a single sensor.
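A minimal sketch of this mode-switching idea (illustrative only; the class, interface, and names below are assumptions, not the sensor's real API) would toggle the controllable lighting behind the semi-transparent surface depending on whether contact has occurred:

```python
from enum import Enum, auto

class STSMode(Enum):
    VISUAL = auto()   # internal LEDs off: the semi-transparent skin acts as a window
    TACTILE = auto()  # internal LEDs on: the illuminated membrane reveals contact geometry

class STSController:
    """Hypothetical mode-switching logic for a see-through-skin (STS) sensor.

    The real sensor switches modes via controllable lighting behind a
    semi-transparent surface; here that is reduced to a single LED state.
    """
    def __init__(self, led):
        self.led = led            # assumed interface exposing .set(brightness)
        self.mode = STSMode.VISUAL

    def set_mode(self, mode: STSMode) -> None:
        self.led.set(1.0 if mode is STSMode.TACTILE else 0.0)
        self.mode = mode

    def update(self, contact_detected: bool) -> None:
        # Pre-contact: look through the sensor; on contact: sense deformation.
        self.set_mode(STSMode.TACTILE if contact_detected else STSMode.VISUAL)
```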
1 code implementation • 21 Apr 2022 • Oliver Limoyo, Trevor Ablett, Jonathan Kelly
In this work, we present a self-supervised generative modelling framework to jointly learn a probabilistic latent state representation of multimodal data and the respective dynamics.
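One generic way to set up such a model (a minimal sketch, not the paper's architecture; the dimensions, networks, and equal loss weighting are assumptions) is to pair a VAE-style probabilistic encoder/decoder with a learned latent transition network and train them jointly:

```python
import torch
import torch.nn as nn

class LatentDynamicsModel(nn.Module):
    """Minimal VAE-style latent state + transition model (illustrative only)."""
    def __init__(self, obs_dim: int, latent_dim: int, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, 2 * latent_dim))  # mean, log-var
        self.decoder = nn.Sequential(nn.Linear(latent_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, obs_dim))
        self.dynamics = nn.Sequential(nn.Linear(latent_dim, hidden), nn.ReLU(),
                                      nn.Linear(hidden, latent_dim))

    def encode(self, x):
        mu, log_var = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterization
        return z, mu, log_var

    def loss(self, x_t, x_next):
        z_t, mu, log_var = self.encode(x_t)
        z_next, _, _ = self.encode(x_next)
        recon = ((self.decoder(z_t) - x_t) ** 2).mean()             # reconstruction
        dyn = ((self.dynamics(z_t) - z_next.detach()) ** 2).mean()  # latent dynamics
        kl = -0.5 * (1 + log_var - mu ** 2 - log_var.exp()).mean()  # prior regularizer
        return recon + dyn + kl
```

The self-supervised signal comes entirely from reconstructing observations and predicting the next latent state, so no labels or rewards are required.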
1 code implementation • 18 Aug 2020 • Oliver Limoyo, Bryan Chan, Filip Marić, Brandon Wagstaff, Rupam Mahmood, Jonathan Kelly
Learning or identifying dynamics from a sequence of high-dimensional observations is a challenging problem in many domains, including reinforcement learning and control.