Search Results for author: Xiaowen Qiu

3D-VLA: A 3D Vision-Language-Action Generative World Model

Recent vision-language-action (VLA) models rely on 2D inputs, lacking integration with the broader realm of the 3D physical world.


The style removal network removes the original image styles, and the style restoration network recovers image styles in a supervised manner.

