Search Results for author: Fei Xia

Found 118 papers, 46 papers with code

VPAI_Lab at MedVidQA 2022: A Two-Stage Cross-modal Fusion Method for Medical Instructional Video Classification

1 code implementation • BioNLP (ACL) 2022 • Bin Li, Yixuan Weng, Fei Xia, Bin Sun, Shutao Li

Given an input video, the MedVidCL task aims to correctly classify it into one of three following categories: Medical Instructional, Medical Non-instructional, and Non-medical.

Video Classification

ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes

1 code implementation • ECCV 2020 • Panos Achlioptas, Ahmed Abdelreheem, Fei Xia, Mohamed Elhoseiny, Leonidas Guibas

Due to the scarcity and unsuitability of existent 3D-oriented linguistic resources for this task, we first develop two large-scale and complementary visio-linguistic datasets: i) Sr3D, which contains 83.5K template-based utterances leveraging spatial relations with other fine-grained object classes to localize a referred object in a given scene, and ii) Nr3D, which contains 41.5K natural, free-form utterances collected by deploying a 2-player object reference game in 3D scenes.

Object

Improving Relation Extraction through Syntax-induced Pre-training with Dependency Masking

1 code implementation • Findings (ACL) 2022 • Yuanhe Tian, Yan Song, Fei Xia

Relation extraction (RE) is an important natural language processing task that predicts the relation between two given entities, where a good understanding of the contextual information is essential to achieve an outstanding model performance.

Ranked #11 on Relation Extraction on SemEval-2010 Task-8

Relation Relation Extraction +1

ChiMST: A Chinese Medical Corpus for Word Segmentation and Medical Term Recognition

1 code implementation • LREC 2022 • Yuanhe Tian, Han Qin, Fei Xia, Yan Song

Chinese word segmentation (CWS) and named entity recognition (NER) are two important tasks in Chinese natural language processing.

Chinese Word Segmentation named-entity-recognition +2

A Knowledge storage and semantic space alignment Method for Multi-documents dialogue generation

no code implementations • dialdoc (ACL) 2022 • Minjun Zhu, Bin Li, Yixuan Weng, Fei Xia

Question Answering (QA) is a Natural Language Processing (NLP) task that can measure language and semantic understanding ability; it requires a system not only to retrieve relevant documents from a large number of articles but also to answer the corresponding questions according to those documents.

Dialogue Generation Language Modelling +3

LingJing at SemEval-2022 Task 1: Multi-task Self-supervised Pre-training for Multilingual Reverse Dictionary

2 code implementations • SemEval (NAACL) 2022 • Bin Li, Yixuan Weng, Fei Xia, Shizhu He, Bin Sun, Shutao Li

This paper introduces the approach of Team LingJing’s experiments on SemEval-2022 Task 1 Comparing Dictionaries and Word Embeddings (CODWOE).

Reverse Dictionary Word Embeddings

LingJing at SemEval-2022 Task 3: Applying DeBERTa to Lexical-level Presupposed Relation Taxonomy with Knowledge Transfer

1 code implementation • SemEval (NAACL) 2022 • Fei Xia, Bin Li, Yixuan Weng, Shizhu He, Bin Sun, Shutao Li, Kang Liu, Jun Zhao

For the classification sub-task, we adopt the DeBERTa-v3 pre-trained model for fine-tuning datasets of different languages.

Binary Classification Classification +2

Enhancing Structure-aware Encoder with Extremely Limited Data for Graph-based Dependency Parsing

1 code implementation • COLING 2022 • Yuanhe Tian, Yan Song, Fei Xia

Dependency parsing is an important fundamental natural language processing task which analyzes the syntactic structure of an input sentence by illustrating the syntactic relations between words.

Ranked #2 on Dependency Parsing on Penn Treebank

2k Dependency Parsing +1

Syntax-driven Approach for Semantic Role Labeling

1 code implementation • LREC 2022 • Yuanhe Tian, Han Qin, Fei Xia, Yan Song

To achieve a better performance in SRL, a model is always required to have a good understanding of the context information.

Ranked #2 on Semantic Role Labeling on CoNLL 2005

POS Semantic Role Labeling +1

Complementary Learning of Aspect Terms for Aspect-based Sentiment Analysis

1 code implementation • LREC 2022 • Han Qin, Yuanhe Tian, Fei Xia, Yan Song

Aspect-based sentiment analysis (ABSA) aims to predict the sentiment polarity towards a given aspect term in a sentence at a fine-grained level. Good performance usually requires a good understanding of contextual information, especially the ability to appropriately distinguish a given aspect from its contexts.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

GenCHiP: Generating Robot Policy Code for High-Precision and Contact-Rich Manipulation Tasks

no code implementations • 9 Apr 2024 • Kaylee Burns, Ajinkya Jain, Keegan Go, Fei Xia, Michael Stark, Stefan Schaal, Karol Hausman

Large Language Models (LLMs) have been successful at generating robot policy code, but so far these results have been limited to high-level tasks that do not require precise movement.

Extracting Social Determinants of Health from Pediatric Patient Notes Using Large Language Models: Novel Corpus and Methods

1 code implementation • 31 Mar 2024 • Yujuan Fu, Giridhar Kaushik Ramachandran, Nicholas J Dobbins, Namu Park, Michael Leu, Abby R. Rosenberg, Kevin Lybarger, Fei Xia, Ozlem Uzuner, Meliha Yetisgen

In this work, we present a novel annotated corpus, the Pediatric Social History Annotation Corpus (PedSHAC), and evaluate the automatic extraction of detailed SDoH representations using fine-tuned and in-context learning methods with Large Language Models (LLMs).

In-Context Learning

MicroDiffusion: Implicit Representation-Guided Diffusion for 3D Reconstruction from Limited 2D Microscopy Projections

1 code implementation • 16 Mar 2024 • Mude Hui, Zihao Wei, Hongru Zhu, Fei Xia, Yuyin Zhou

This strategy enriches the diffusion process with structured 3D information, enhancing detail and reducing noise in localized 2D images.

3D Reconstruction Denoising

BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation

no code implementations • 14 Mar 2024 • Chengshu Li, Ruohan Zhang, Josiah Wong, Cem Gokmen, Sanjana Srivastava, Roberto Martín-Martín, Chen Wang, Gabrael Levine, Wensi Ai, Benjamin Martinez, Hang Yin, Michael Lingelbach, Minjune Hwang, Ayano Hiranaka, Sujay Garlanka, Arman Aydin, Sharon Lee, Jiankai Sun, Mona Anvari, Manasi Sharma, Dhruva Bansal, Samuel Hunter, Kyu-Young Kim, Alan Lou, Caleb R Matthews, Ivan Villa-Renteria, Jerry Huayang Tang, Claire Tang, Fei Xia, Yunzhu Li, Silvio Savarese, Hyowon Gweon, C. Karen Liu, Jiajun Wu, Li Fei-Fei

We present BEHAVIOR-1K, a comprehensive simulation benchmark for human-centered robotics.

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

no code implementations • 8 Mar 2024 • Gemini Team, Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry, Lepikhin, Timothy Lillicrap, Jean-Baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser, Ioannis Antonoglou, Rohan Anil, Sebastian Borgeaud, Andrew Dai, Katie Millican, Ethan Dyer, Mia Glaese, Thibault Sottiaux, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, James Molloy, Jilin Chen, Michael Isard, Paul Barham, Tom Hennigan, Ross Mcilroy, Melvin Johnson, Johan Schalkwyk, Eli Collins, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, Clemens Meyer, Gregory Thornton, Zhen Yang, Henryk Michalewski, Zaheer Abbas, Nathan Schucher, Ankesh Anand, Richard Ives, James Keeling, Karel Lenc, Salem Haykal, Siamak Shakeri, Pranav Shyam, Aakanksha Chowdhery, Roman Ring, Stephen Spencer, Eren Sezener, Luke Vilnis, Oscar Chang, Nobuyuki Morioka, George Tucker, Ce Zheng, Oliver Woodman, Nithya Attaluri, Tomas Kocisky, Evgenii Eltyshev, Xi Chen, Timothy Chung, Vittorio Selo, Siddhartha Brahma, Petko Georgiev, Ambrose Slone, Zhenkai Zhu, James Lottes, Siyuan Qiao, Ben Caine, Sebastian Riedel, Alex Tomala, Martin Chadwick, Juliette Love, Peter Choy, Sid Mittal, Neil Houlsby, Yunhao Tang, Matthew Lamm, Libin Bai, Qiao Zhang, Luheng He, Yong Cheng, Peter Humphreys, Yujia Li, Sergey Brin, Albin Cassirer, Yingjie Miao, Lukas Zilka, Taylor Tobin, Kelvin Xu, Lev Proleev, Daniel Sohn, Alberto Magni, Lisa Anne Hendricks, Isabel Gao, Santiago Ontanon, Oskar Bunyan, Nathan Byrd, Abhanshu Sharma, Biao Zhang, Mario Pinto, Rishika Sinha, Harsh Mehta, Dawei Jia, Sergi Caelles, Albert Webson, Alex Morris, Becca Roelofs, Yifan Ding, Robin Strudel, Xuehan Xiong, Marvin Ritter, Mostafa Dehghani, Rahma Chaabouni, Abhijit Karmarkar, Guangda Lai, Fabian Mentzer, Bibo Xu, Yaguang Li, Yujing Zhang, Tom Le Paine, Alex Goldin, Behnam Neyshabur, Kate Baumli, Anselm Levskaya, Michael Laskin, Wenhao Jia, Jack W. 
Rae, Kefan Xiao, Antoine He, Skye Giordano, Lakshman Yagati, Jean-Baptiste Lespiau, Paul Natsev, Sanjay Ganapathy, Fangyu Liu, Danilo Martins, Nanxin Chen, Yunhan Xu, Megan Barnes, Rhys May, Arpi Vezer, Junhyuk Oh, Ken Franko, Sophie Bridgers, Ruizhe Zhao, Boxi Wu, Basil Mustafa, Sean Sechrist, Emilio Parisotto, Thanumalayan Sankaranarayana Pillai, Chris Larkin, Chenjie Gu, Christina Sorokin, Maxim Krikun, Alexey Guseynov, Jessica Landon, Romina Datta, Alexander Pritzel, Phoebe Thacker, Fan Yang, Kevin Hui, Anja Hauth, Chih-Kuan Yeh, David Barker, Justin Mao-Jones, Sophia Austin, Hannah Sheahan, Parker Schuh, James Svensson, Rohan Jain, Vinay Ramasesh, Anton Briukhov, Da-Woon Chung, Tamara von Glehn, Christina Butterfield, Priya Jhakra, Matthew Wiethoff, Justin Frye, Jordan Grimstad, Beer Changpinyo, Charline Le Lan, Anna Bortsova, Yonghui Wu, Paul Voigtlaender, Tara Sainath, Shane Gu, Charlotte Smith, Will Hawkins, Kris Cao, James Besley, Srivatsan Srinivasan, Mark Omernick, Colin Gaffney, Gabriela Surita, Ryan Burnell, Bogdan Damoc, Junwhan Ahn, Andrew Brock, Mantas Pajarskas, Anastasia Petrushkina, Seb Noury, Lorenzo Blanco, Kevin Swersky, Arun Ahuja, Thi Avrahami, Vedant Misra, Raoul de Liedekerke, Mariko Iinuma, Alex Polozov, Sarah York, George van den Driessche, Paul Michel, Justin Chiu, Rory Blevins, Zach Gleicher, Adrià Recasens, Alban Rrustemi, Elena Gribovskaya, Aurko Roy, Wiktor Gworek, Sébastien M. R. 
Arnold, Lisa Lee, James Lee-Thorp, Marcello Maggioni, Enrique Piqueras, Kartikeya Badola, Sharad Vikram, Lucas Gonzalez, Anirudh Baddepudi, Evan Senter, Jacob Devlin, James Qin, Michael Azzam, Maja Trebacz, Martin Polacek, Kashyap Krishnakumar, Shuo-Yiin Chang, Matthew Tung, Ivo Penchev, Rishabh Joshi, Kate Olszewska, Carrie Muir, Mateo Wirth, Ale Jakse Hartman, Josh Newlan, Sheleem Kashem, Vijay Bolina, Elahe Dabir, Joost van Amersfoort, Zafarali Ahmed, James Cobon-Kerr, Aishwarya Kamath, Arnar Mar Hrafnkelsson, Le Hou, Ian Mackinnon, Alexandre Frechette, Eric Noland, Xiance Si, Emanuel Taropa, Dong Li, Phil Crone, Anmol Gulati, Sébastien Cevey, Jonas Adler, Ada Ma, David Silver, Simon Tokumine, Richard Powell, Stephan Lee, Kiran Vodrahalli, Samer Hassan, Diana Mincu, Antoine Yang, Nir Levine, Jenny Brennan, Mingqiu Wang, Sarah Hodkinson, Jeffrey Zhao, Josh Lipschultz, Aedan Pope, Michael B. Chang, Cheng Li, Laurent El Shafey, Michela Paganini, Sholto Douglas, Bernd Bohnet, Fabio Pardo, Seth Odoom, Mihaela Rosca, Cicero Nogueira dos santos, Kedar Soparkar, Arthur Guez, Tom Hudson, Steven Hansen, Chulayuth Asawaroengchai, Ravi Addanki, Tianhe Yu, Wojciech Stokowiec, Mina Khan, Justin Gilmer, Jaehoon Lee, Carrie Grimes Bostock, Keran Rong, Jonathan Caton, Pedram Pejman, Filip Pavetic, Geoff Brown, Vivek Sharma, Mario Lučić, Rajkumar Samuel, Josip Djolonga, Amol Mandhane, Lars Lowe Sjösund, Elena Buchatskaya, Elspeth White, Natalie Clay, Jiepu Jiang, Hyeontaek Lim, Ross Hemsley, Zeyncep Cankara, Jane Labanowski, Nicola De Cao, David Steiner, Sayed Hadi Hashemi, Jacob Austin, Anita Gergely, Tim Blyth, Joe Stanton, Kaushik Shivakumar, Aditya Siddhant, Anders Andreassen, Carlos Araya, Nikhil Sethi, Rakesh Shivanna, Steven Hand, Ankur Bapna, Ali Khodaei, Antoine Miech, Garrett Tanzer, Andy Swing, Shantanu Thakoor, Lora Aroyo, Zhufeng Pan, Zachary Nado, Jakub Sygnowski, Stephanie Winkler, Dian Yu, Mohammad Saleh, Loren Maggiore, Yamini Bansal, Xavier Garcia, Mehran 
Kazemi, Piyush Patil, Ishita Dasgupta, Iain Barr, Minh Giang, Thais Kagohara, Ivo Danihelka, Amit Marathe, Vladimir Feinberg, Mohamed Elhawaty, Nimesh Ghelani, Dan Horgan, Helen Miller, Lexi Walker, Richard Tanburn, Mukarram Tariq, Disha Shrivastava, Fei Xia, Qingze Wang, Chung-Cheng Chiu, Zoe Ashwood, Khuslen Baatarsukh, Sina Samangooei, Raphaël Lopez Kaufman, Fred Alcober, Axel Stjerngren, Paul Komarek, Katerina Tsihlas, Anudhyan Boral, Ramona Comanescu, Jeremy Chen, Ruibo Liu, Chris Welty, Dawn Bloxwich, Charlie Chen, Yanhua Sun, Fangxiaoyu Feng, Matthew Mauger, Xerxes Dotiwalla, Vincent Hellendoorn, Michael Sharman, Ivy Zheng, Krishna Haridasan, Gabe Barth-Maron, Craig Swanson, Dominika Rogozińska, Alek Andreev, Paul Kishan Rubenstein, Ruoxin Sang, Dan Hurt, Gamaleldin Elsayed, Renshen Wang, Dave Lacey, Anastasija Ilić, Yao Zhao, Adam Iwanicki, Alejandro Lince, Alexander Chen, Christina Lyu, Carl Lebsack, Jordan Griffith, Meenu Gaba, Paramjit Sandhu, Phil Chen, Anna Koop, Ravi Rajwar, Soheil Hassas Yeganeh, Solomon Chang, Rui Zhu, Soroush Radpour, Elnaz Davoodi, Ving Ian Lei, Yang Xu, Daniel Toyama, Constant Segal, Martin Wicke, Hanzhao Lin, Anna Bulanova, Adrià Puigdomènech Badia, Nemanja Rakićević, Pablo Sprechmann, Angelos Filos, Shaobo Hou, Víctor Campos, Nora Kassner, Devendra Sachan, Meire Fortunato, Chimezie Iwuanyanwu, Vitaly Nikolaev, Balaji Lakshminarayanan, Sadegh Jazayeri, Mani Varadarajan, Chetan Tekur, Doug Fritz, Misha Khalman, David Reitter, Kingshuk Dasgupta, Shourya Sarcar, Tina Ornduff, Javier Snaider, Fantine Huot, Johnson Jia, Rupert Kemp, Nejc Trdin, Anitha Vijayakumar, Lucy Kim, Christof Angermueller, Li Lao, Tianqi Liu, Haibin Zhang, David Engel, Somer Greene, Anaïs White, Jessica Austin, Lilly Taylor, Shereen Ashraf, Dangyi Liu, Maria Georgaki, Irene Cai, Yana Kulizhskaya, Sonam Goenka, Brennan Saeta, Ying Xu, Christian Frank, Dario de Cesare, Brona Robenek, Harry Richardson, Mahmoud Alnahlawi, Christopher Yew, Priya Ponnapalli, Marco 
Tagliasacchi, Alex Korchemniy, Yelin Kim, Dinghua Li, Bill Rosgen, Kyle Levin, Jeremy Wiesner, Praseem Banzal, Praveen Srinivasan, Hongkun Yu, Çağlar Ünlü, David Reid, Zora Tung, Daniel Finchelstein, Ravin Kumar, Andre Elisseeff, Jin Huang, Ming Zhang, Ricardo Aguilar, Mai Giménez, Jiawei Xia, Olivier Dousse, Willi Gierke, Damion Yates, Komal Jalan, Lu Li, Eri Latorre-Chimoto, Duc Dung Nguyen, Ken Durden, Praveen Kallakuri, Yaxin Liu, Matthew Johnson, Tomy Tsai, Alice Talbert, Jasmine Liu, Alexander Neitz, Chen Elkind, Marco Selvi, Mimi Jasarevic, Livio Baldini Soares, Albert Cui, Pidong Wang, Alek Wenjiao Wang, Xinyu Ye, Krystal Kallarackal, Lucia Loher, Hoi Lam, Josef Broder, Dan Holtmann-Rice, Nina Martin, Bramandia Ramadhana, Mrinal Shukla, Sujoy Basu, Abhi Mohan, Nick Fernando, Noah Fiedel, Kim Paterson, Hui Li, Ankush Garg, Jane Park, DongHyun Choi, Diane Wu, Sankalp Singh, Zhishuai Zhang, Amir Globerson, Lily Yu, John Carpenter, Félix de Chaumont Quitry, Carey Radebaugh, Chu-Cheng Lin, Alex Tudor, Prakash Shroff, Drew Garmon, Dayou Du, Neera Vats, Han Lu, Shariq Iqbal, Alex Yakubovich, Nilesh Tripuraneni, James Manyika, Haroon Qureshi, Nan Hua, Christel Ngani, Maria Abi Raad, Hannah Forbes, Jeff Stanway, Mukund Sundararajan, Victor Ungureanu, Colton Bishop, Yunjie Li, Balaji Venkatraman, Bo Li, Chloe Thornton, Salvatore Scellato, Nishesh Gupta, Yicheng Wang, Ian Tenney, Xihui Wu, Ashish Shenoy, Gabriel Carvajal, Diana Gage Wright, Ben Bariach, Zhuyun Xiao, Peter Hawkins, Sid Dalmia, Clement Farabet, Pedro Valenzuela, Quan Yuan, Ananth Agarwal, Mia Chen, Wooyeol Kim, Brice Hulse, Nandita Dukkipati, Adam Paszke, Andrew Bolt, Kiam Choo, Jennifer Beattie, Jennifer Prendki, Harsha Vashisht, Rebeca Santamaria-Fernandez, Luis C. 
Cobo, Jarek Wilkiewicz, David Madras, Ali Elqursh, Grant Uy, Kevin Ramirez, Matt Harvey, Tyler Liechty, Heiga Zen, Jeff Seibert, Clara Huiyi Hu, Andrey Khorlin, Maigo Le, Asaf Aharoni, Megan Li, Lily Wang, Sandeep Kumar, Norman Casagrande, Jay Hoover, Dalia El Badawy, David Soergel, Denis Vnukov, Matt Miecnikowski, Jiri Simsa, Praveen Kumar, Thibault Sellam, Daniel Vlasic, Samira Daruki, Nir Shabat, John Zhang, Guolong Su, Jiageng Zhang, Jeremiah Liu, Yi Sun, Evan Palmer, Alireza Ghaffarkhah, Xi Xiong, Victor Cotruta, Michael Fink, Lucas Dixon, Ashwin Sreevatsa, Adrian Goedeckemeyer, Alek Dimitriev, Mohsen Jafari, Remi Crocker, Nicholas FitzGerald, Aviral Kumar, Sanjay Ghemawat, Ivan Philips, Frederick Liu, Yannie Liang, Rachel Sterneck, Alena Repina, Marcus Wu, Laura Knight, Marin Georgiev, Hyo Lee, Harry Askham, Abhishek Chakladar, Annie Louis, Carl Crous, Hardie Cate, Dessie Petrova, MICHAEL QUINN, Denese Owusu-Afriyie, Achintya Singhal, Nan Wei, Solomon Kim, Damien Vincent, Milad Nasr, Christopher A. Choquette-Choo, Reiko Tojo, Shawn Lu, Diego de Las Casas, Yuchung Cheng, Tolga Bolukbasi, Katherine Lee, Saaber Fatehi, Rajagopal Ananthanarayanan, Miteyan Patel, Charbel Kaed, Jing Li, Shreyas Rammohan Belle, Zhe Chen, Jaclyn Konzelmann, Siim Põder, Roopal Garg, Vinod Koverkathu, Adam Brown, Chris Dyer, Rosanne Liu, Azade Nova, Jun Xu, Alanna Walton, Alicia Parrish, Mark Epstein, Sara McCarthy, Slav Petrov, Demis Hassabis, Koray Kavukcuoglu, Jeffrey Dean, Oriol Vinyals

In this report, we present the latest model of the Gemini family, Gemini 1.5 Pro, a highly compute-efficient multimodal mixture-of-experts model capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio.

Ranked #20 on Code Generation on HumanEval

Code Generation Retrieval

PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs

no code implementations • 12 Feb 2024 • Soroush Nasiriany, Fei Xia, Wenhao Yu, Ted Xiao, Jacky Liang, Ishita Dasgupta, Annie Xie, Danny Driess, Ayzaan Wahid, Zhuo Xu, Quan Vuong, Tingnan Zhang, Tsang-Wei Edward Lee, Kuang-Huei Lee, Peng Xu, Sean Kirmani, Yuke Zhu, Andy Zeng, Karol Hausman, Nicolas Heess, Chelsea Finn, Sergey Levine, Brian Ichter

In each iteration, the image is annotated with a visual representation of proposals that the VLM can refer to (e.g., candidate robot actions, localizations, or trajectories).

Instruction Following Logical Reasoning +3

AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents

no code implementations • 23 Jan 2024 • Michael Ahn, Debidatta Dwibedi, Chelsea Finn, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Karol Hausman, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Sean Kirmani, Edward Lee, Sergey Levine, Yao Lu, Isabel Leal, Sharath Maddineni, Kanishka Rao, Dorsa Sadigh, Pannag Sanketi, Pierre Sermanet, Quan Vuong, Stefan Welker, Fei Xia, Ted Xiao, Peng Xu, Steve Xu, Zhuo Xu

We experimentally show that such "in-the-wild" data collected by AutoRT is significantly more diverse, and that AutoRT's use of LLMs allows for instruction following data collection robots that can align to human preferences.

Instruction Following Scene Understanding

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

no code implementations • 22 Jan 2024 • Boyuan Chen, Zhuo Xu, Sean Kirmani, Brian Ichter, Danny Driess, Pete Florence, Dorsa Sadigh, Leonidas Guibas, Fei Xia

By training a VLM on such data, we significantly enhance its ability on both qualitative and quantitative spatial VQA.

Question Answering Visual Question Answering

Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis

no code implementations • 14 Dec 2023 • Yafei Hu, Quanting Xie, Vidhi Jain, Jonathan Francis, Jay Patrikar, Nikhil Keetha, Seungchan Kim, Yaqi Xie, Tianyi Zhang, Shibo Zhao, Yu Quan Chong, Chen Wang, Katia Sycara, Matthew Johnson-Roberson, Dhruv Batra, Xiaolong Wang, Sebastian Scherer, Zsolt Kira, Fei Xia, Yonatan Bisk

Motivated by the impressive open-set performance and content generation capabilities of web-scale, large-capacity pre-trained models (i.e., foundation models) in research fields such as Natural Language Processing (NLP) and Computer Vision (CV), we devote this survey to exploring (i) how these existing foundation models from NLP and CV can be applied to the field of robotics, and also exploring (ii) what a robotics-specific foundation model would look like.

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

no code implementations • 7 Dec 2023 • Chengshu Li, Jacky Liang, Andy Zeng, Xinyun Chen, Karol Hausman, Dorsa Sadigh, Sergey Levine, Li Fei-Fei, Fei Xia, Brian Ichter

For example, consider prompting an LM to write code that counts the number of times it detects sarcasm in an essay: the LM may struggle to write an implementation for "detect_sarcasm(string)" that can be executed by the interpreter (handling the edge cases would be insurmountable).
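The sarcasm-counting program mentioned in this snippet might look roughly like the sketch below. This is only an illustration under stated assumptions, not the paper's method: `count_sarcasm`, `toy_detector`, and the sample `essay` are hypothetical names, and the keyword check merely stands in for the `detect_sarcasm` step that, per the snippet, is hard to implement as ordinary executable code.

```python
from typing import Callable, List


def count_sarcasm(sentences: List[str], detect_sarcasm: Callable[[str], bool]) -> int:
    """Count the sentences that the supplied detector flags as sarcastic."""
    return sum(1 for sentence in sentences if detect_sarcasm(sentence))


def toy_detector(sentence: str) -> bool:
    # Hypothetical stand-in: a keyword check used only so the sketch runs.
    # A real detector would need semantic judgment that plain code lacks.
    return "yeah, right" in sentence.lower()


# Hypothetical sample input, invented for this illustration.
essay = [
    "The plan worked.",
    "Yeah, right, that went well.",
    "We tried again.",
]

print(count_sarcasm(essay, toy_detector))
```

The counting loop is trivial for an interpreter; the difficulty the snippet points to lies entirely in the detector passed in as an argument, which is why it is kept as a replaceable parameter here.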

Language Modelling

Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future

2 code implementations • 6 Dec 2023 • Hongyang Li, Yang Li, Huijie Wang, Jia Zeng, Huilin Xu, Pinlong Cai, Li Chen, Junchi Yan, Feng Xu, Lu Xiong, Jingdong Wang, Futang Zhu, Chunjing Xu, Tiancai Wang, Fei Xia, Beipeng Mu, Zhihui Peng, Dahua Lin, Yu Qiao

With the continuous maturation and application of autonomous driving technology, a systematic examination of open-source autonomous driving datasets becomes instrumental in fostering the robust evolution of the industry ecosystem.

Autonomous Driving

Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections

1 code implementation • 17 Nov 2023 • Lihan Zha, Yuchen Cui, Li-Heng Lin, Minae Kwon, Montserrat Gonzalez Arenas, Andy Zeng, Fei Xia, Dorsa Sadigh

DROC is able to respond to a sequence of online language corrections that address failures in both high-level task plans and low-level skill primitives.

Language Modelling Large Language Model +1

Creative Robot Tool Use with Large Language Models

no code implementations • 19 Oct 2023 • Mengdi Xu, Peide Huang, Wenhao Yu, Shiqi Liu, Xilun Zhang, Yaru Niu, Tingnan Zhang, Fei Xia, Jie Tan, Ding Zhao

This paper investigates the feasibility of imbuing robots with the ability to creatively use tools in tasks that involve implicit physical constraints and long-term planning.

Motion Planning Task and Motion Planning

Video Language Planning

no code implementations • 16 Oct 2023 • Yilun Du, Mengjiao Yang, Pete Florence, Fei Xia, Ayzaan Wahid, Brian Ichter, Pierre Sermanet, Tianhe Yu, Pieter Abbeel, Joshua B. Tenenbaum, Leslie Kaelbling, Andy Zeng, Jonathan Tompson

We are interested in enabling visual planning for complex long-horizon tasks in the space of generated videos and language, leveraging recent advances in large generative models pretrained on Internet-scale data.

Navigation with Large Language Models: Semantic Guesswork as a Heuristic for Planning

no code implementations • 16 Oct 2023 • Dhruv Shah, Michael Equi, Blazej Osinski, Fei Xia, Brian Ichter, Sergey Levine

Navigation in unfamiliar environments presents a major challenge for robots: while mapping and planning techniques can be used to build up a representation of the world, quickly discovering a path to a desired goal in unfamiliar settings with such methods often requires lengthy mapping and exploration.

Language Modelling Navigate

Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

no code implementations • 18 Sep 2023 • Yevgen Chebotar, Quan Vuong, Alex Irpan, Karol Hausman, Fei Xia, Yao Lu, Aviral Kumar, Tianhe Yu, Alexander Herzog, Karl Pertsch, Keerthana Gopalakrishnan, Julian Ibarz, Ofir Nachum, Sumedh Sontakke, Grecia Salazar, Huong T Tran, Jodilyn Peralta, Clayton Tan, Deeksha Manjunath, Jaspiar Singh, Brianna Zitkovich, Tomas Jackson, Kanishka Rao, Chelsea Finn, Sergey Levine

In this work, we present a scalable reinforcement learning method for training multi-task policies from large offline datasets that can leverage both human demonstrations and autonomously collected data.

Imitation Learning Offline RL +2

Physically Grounded Vision-Language Models for Robotic Manipulation

no code implementations • 5 Sep 2023 • Jensen Gao, Bidipta Sarkar, Fei Xia, Ted Xiao, Jiajun Wu, Brian Ichter, Anirudha Majumdar, Dorsa Sadigh

We incorporate this physically grounded VLM in an interactive framework with a large language model-based robotic planner, and show improved planning performance on tasks that require reasoning about physical object concepts, compared to baselines that do not leverage physically grounded VLMs.

Image Captioning Language Modelling +4

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

1 code implementation • 28 Jul 2023 • Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, Pete Florence, Chuyuan Fu, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Kehang Han, Karol Hausman, Alexander Herzog, Jasmine Hsu, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Lisa Lee, Tsang-Wei Edward Lee, Sergey Levine, Yao Lu, Henryk Michalewski, Igor Mordatch, Karl Pertsch, Kanishka Rao, Krista Reymann, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Pierre Sermanet, Jaspiar Singh, Anikait Singh, Radu Soricut, Huong Tran, Vincent Vanhoucke, Quan Vuong, Ayzaan Wahid, Stefan Welker, Paul Wohlhart, Jialin Wu, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich

Our goal is to enable a single end-to-end trained model to both learn to map robot observations to actions and enjoy the benefits of large-scale pretraining on language and vision-language data from the web.

Object Question Answering +1

Deep Learning with Passive Optical Nonlinear Mapping

no code implementations • 17 Jul 2023 • Fei Xia, Kyungduk Kim, Yaniv Eliezer, Liam Shaughnessy, Sylvain Gigan, Hui Cao

Utilizing rapid optical information processing capabilities, our optical platforms could potentially offer more efficient and real-time processing solutions for a broad range of applications.

Data Compression Decoder +5

Large Language Models as General Pattern Machines

no code implementations • 10 Jul 2023 • Suvir Mirchandani, Fei Xia, Pete Florence, Brian Ichter, Danny Driess, Montserrat Gonzalez Arenas, Kanishka Rao, Dorsa Sadigh, Andy Zeng

We observe that pre-trained large language models (LLMs) are capable of autoregressively completing complex token sequences -- from arbitrary ones procedurally generated by probabilistic context-free grammars (PCFG), to richer spatial patterns found in the Abstraction and Reasoning Corpus (ARC), a general AI benchmark, prompted in the style of ASCII art.

In-Context Learning

Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners

no code implementations • 4 Jul 2023 • Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley, Zhenjia Xu, Dorsa Sadigh, Andy Zeng, Anirudha Majumdar

Large language models (LLMs) exhibit a wide range of promising capabilities -- from step-by-step planning to commonsense reasoning -- that may provide utility for robots, but remain prone to confidently hallucinated predictions.

Conformal Prediction Language Modelling +1

Principles and Guidelines for Evaluating Social Robot Navigation Algorithms

no code implementations • 29 Jun 2023 • Anthony Francis, Claudia Pérez-D'Arpino, Chengshu Li, Fei Xia, Alexandre Alahi, Rachid Alami, Aniket Bera, Abhijat Biswas, Joydeep Biswas, Rohan Chandra, Hao-Tien Lewis Chiang, Michael Everett, Sehoon Ha, Justin Hart, Jonathan P. How, Haresh Karnan, Tsang-Wei Edward Lee, Luis J. Manso, Reuth Mirksy, Sören Pirk, Phani Teja Singamaneni, Peter Stone, Ada V. Taylor, Peter Trautman, Nathan Tsoi, Marynel Vázquez, Xuesu Xiao, Peng Xu, Naoki Yokoyama, Alexander Toshev, Roberto Martín-Martín

A major challenge to deploying robots widely is navigation in human-populated environments, commonly referred to as social robot navigation.

Benchmarking Social Navigation

Language to Rewards for Robotic Skill Synthesis

no code implementations • 14 Jun 2023 • Wenhao Yu, Nimrod Gileadi, Chuyuan Fu, Sean Kirmani, Kuang-Huei Lee, Montse Gonzalez Arenas, Hao-Tien Lewis Chiang, Tom Erez, Leonard Hasenclever, Jan Humplik, Brian Ichter, Ted Xiao, Peng Xu, Andy Zeng, Tingnan Zhang, Nicolas Heess, Dorsa Sadigh, Jie Tan, Yuval Tassa, Fei Xia

However, since low-level robot actions are hardware-dependent and underrepresented in LLM training corpora, existing efforts in applying LLMs to robotics have largely treated LLMs as semantic planners or relied on human-engineered control primitives to interface with the robot.

In-Context Learning Logical Reasoning

Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear

1 code implementation • 1 Jun 2023 • Ruohan Gao, Hao Li, Gokul Dharan, Zhuzhu Wang, Chengshu Li, Fei Xia, Silvio Savarese, Li Fei-Fei, Jiajun Wu

We introduce Sonicverse, a multisensory simulation platform with integrated audio-visual simulation for training household agents that can both see and hear.

Multi-Task Learning Visual Navigation

IMBUE: In-Memory Boolean-to-CUrrent Inference ArchitecturE for Tsetlin Machines

no code implementations • 22 May 2023 • Omar Ghazal, Simranjeet Singh, Tousif Rahman, Shengqi Yu, Yujin Zheng, Domenico Balsamo, Sachin Patkar, Farhad Merchant, Fei Xia, Alex Yakovlev, Rishad Shafik

Non-volatile memory devices such as Resistive RAM (ReRAM) offer integrated switching and storage capabilities showing promising performance for ML applications.

Large Language Models Need Holistically Thought in Medical Conversational QA

1 code implementation • 9 May 2023 • Yixuan Weng, Bin Li, Fei Xia, Minjun Zhu, Bin Sun, Shizhu He, Kang Liu, Jun Zhao

The medical conversational question answering (CQA) system aims at providing a series of professional medical services to improve the efficiency of medical care.

Conversational Question Answering

Mastering Symbolic Operations: Augmenting Language Models with Compiled Neural Networks

3 code implementations • 4 Apr 2023 • Yixuan Weng, Minjun Zhu, Fei Xia, Bin Li, Shizhu He, Kang Liu, Jun Zhao

Our work highlights the potential of seamlessly unifying explicit rule learning via CoNNs and implicit pattern learning in LMs, paving the way for true symbolic comprehension capabilities.

Arithmetic Reasoning Language Modelling

PaLM-E: An Embodied Multimodal Language Model

2 code implementations • 6 Mar 2023 • Danny Driess, Fei Xia, Mehdi S. M. Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, Wenlong Huang, Yevgen Chebotar, Pierre Sermanet, Daniel Duckworth, Sergey Levine, Vincent Vanhoucke, Karol Hausman, Marc Toussaint, Klaus Greff, Andy Zeng, Igor Mordatch, Pete Florence

Large language models excel at a wide range of complex tasks.

Ranked #2 on Visual Question Answering (VQA) on OK-VQA

Language Modelling Large Language Model +2

204

Paper
Code

Open-World Object Manipulation using Pre-trained Vision-Language Models

no code implementations • 2 Mar 2023 • Austin Stone, Ted Xiao, Yao Lu, Keerthana Gopalakrishnan, Kuang-Huei Lee, Quan Vuong, Paul Wohlhart, Sean Kirmani, Brianna Zitkovich, Fei Xia, Chelsea Finn, Karol Hausman

This brings up a notably difficult challenge for robots: while robot learning approaches allow robots to learn many different behaviors from first-hand experience, it is impractical for robots to have first-hand experiences that span all of this semantic information.

Language Modelling Object

Paper
Add Code

Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents

no code implementations • NeurIPS 2023 • Wenlong Huang, Fei Xia, Dhruv Shah, Danny Driess, Andy Zeng, Yao Lu, Pete Florence, Igor Mordatch, Sergey Levine, Karol Hausman, Brian Ichter

Recent progress in large language models (LLMs) has demonstrated the ability to learn and leverage Internet-scale knowledge through pre-training with autoregressive models.

Language Modelling Text Generation

Paper
Add Code

Scaling Robot Learning with Semantically Imagined Experience

no code implementations • 22 Feb 2023 • Tianhe Yu, Ted Xiao, Austin Stone, Jonathan Tompson, Anthony Brohan, Su Wang, Jaspiar Singh, Clayton Tan, Dee M, Jodilyn Peralta, Brian Ichter, Karol Hausman, Fei Xia

Specifically, we make use of state-of-the-art text-to-image diffusion models and perform aggressive data augmentation on top of our existing robotic manipulation datasets via inpainting various unseen objects for manipulation, backgrounds, and distractors with text guidance.

Data Augmentation

Paper
Add Code

Large Language Models are Better Reasoners with Self-Verification

1 code implementation • 19 Dec 2022 • Yixuan Weng, Minjun Zhu, Fei Xia, Bin Li, Shizhu He, Shengping Liu, Bin Sun, Kang Liu, Jun Zhao

By performing a backward verification of the answers that LLM deduced for itself, we can obtain interpretable answer validation scores to select the candidate answer with the highest score.

Arithmetic Reasoning Common Sense Reasoning +3

Paper
Code

RT-1: Robotics Transformer for Real-World Control at Scale

1 code implementation • 13 Dec 2022 • Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Tomas Jackson, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Kuang-Huei Lee, Sergey Levine, Yao Lu, Utsav Malla, Deeksha Manjunath, Igor Mordatch, Ofir Nachum, Carolina Parada, Jodilyn Peralta, Emily Perez, Karl Pertsch, Jornell Quiambao, Kanishka Rao, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Kevin Sayed, Jaspiar Singh, Sumedh Sontakke, Austin Stone, Clayton Tan, Huong Tran, Vincent Vanhoucke, Steve Vega, Quan Vuong, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich

By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance.

1,196

Paper
Code

A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations

no code implementations • 29 Nov 2022 • Sohan Rudra, Saksham Goel, Anirban Santara, Claudio Gentile, Laurent Perron, Fei Xia, Vikas Sindhwani, Carolina Parada, Gaurav Aggarwal

Object-goal navigation (Object-nav) entails searching, recognizing and navigating to a target object.

Object

Paper
Add Code

Robotic Table Wiping via Reinforcement Learning and Whole-body Trajectory Optimization

no code implementations • 19 Oct 2022 • Thomas Lew, Sumeet Singh, Mario Prats, Jeffrey Bingham, Jonathan Weisz, Benjie Holson, Xiaohan Zhang, Vikas Sindhwani, Yao Lu, Fei Xia, Peng Xu, Tingnan Zhang, Jie Tan, Montserrat Gonzalez

This problem is challenging, as it requires planning wiping actions while reasoning over uncertain latent dynamics of crumbs and spills captured via high-dimensional visual observations.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds

no code implementations • 13 Oct 2022 • Pei Sun, Mingxing Tan, Weiyue Wang, Chenxi Liu, Fei Xia, Zhaoqi Leng, Dragomir Anguelov

3D object detection in point clouds is a core component for modern robotics and autonomous driving systems.

3D Object Detection Autonomous Driving +2

Paper
Add Code

Learning Model Predictive Controllers with Real-Time Attention for Real-World Navigation

no code implementations • 22 Sep 2022 • Xuesu Xiao, Tingnan Zhang, Krzysztof Choromanski, Edward Lee, Anthony Francis, Jake Varley, Stephen Tu, Sumeet Singh, Peng Xu, Fei Xia, Sven Mikael Persson, Dmitry Kalashnikov, Leila Takayama, Roy Frostig, Jie Tan, Carolina Parada, Vikas Sindhwani

Despite decades of research, existing navigation systems still face real-world challenges when deployed in the wild, e.g., in cluttered home environments or in human-occupied public spaces.

Imitation Learning Model Predictive Control

Paper
Add Code

Open-vocabulary Queryable Scene Representations for Real World Planning

no code implementations • 20 Sep 2022 • Boyuan Chen, Fei Xia, Brian Ichter, Kanishka Rao, Keerthana Gopalakrishnan, Michael S. Ryoo, Austin Stone, Daniel Kappler

Large language models (LLMs) have unlocked new capabilities of task planning from human instructions.

Paper
Add Code

6D Camera Relocalization in Visually Ambiguous Extreme Environments

no code implementations • 13 Jul 2022 • Yang Zheng, Tolga Birdal, Fei Xia, Yanchao Yang, Yueqi Duan, Leonidas J. Guibas

To this end, we propose: (i) a hierarchical localization system, where we leverage temporal information and (ii) a novel environment-aware image enhancement method to boost the robustness and accuracy.

Camera Relocalization Image Enhancement

Paper
Add Code

Inner Monologue: Embodied Reasoning through Planning with Language Models

no code implementations • 12 Jul 2022 • Wenlong Huang, Fei Xia, Ted Xiao, Harris Chan, Jacky Liang, Pete Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, Pierre Sermanet, Noah Brown, Tomas Jackson, Linda Luu, Sergey Levine, Karol Hausman, Brian Ichter

We investigate a variety of sources of feedback, such as success detection, scene description, and human interaction.

Paper
Add Code

BEHAVIOR in Habitat 2.0: Simulator-Independent Logical Task Description for Benchmarking Embodied AI Agents

no code implementations • 13 Jun 2022 • Ziang Liu, Roberto Martín-Martín, Fei Xia, Jiajun Wu, Li Fei-Fei

Robots excel in performing repetitive and precision-sensitive tasks in controlled environments such as warehouses and factories, but have not yet been extended to embodied AI agents providing assistance in household tasks.

Benchmarking

Paper
Add Code

LingYi: Medical Conversational Question Answering System based on Multi-modal Knowledge Graphs

1 code implementation • 20 Apr 2022 • Fei Xia, Bin Li, Yixuan Weng, Shizhu He, Kang Liu, Bin Sun, Shutao Li, Jun Zhao

The medical conversational system can relieve the burden of doctors and improve the efficiency of healthcare, especially during the pandemic.

Conversational Question Answering Dialogue Generation +3

Paper
Code

Towards Better Chinese-centric Neural Machine Translation for Low-resource Languages

1 code implementation • 9 Apr 2022 • Bin Li, Yixuan Weng, Fei Xia, Hanjun Deng

The last decade has witnessed enormous improvements in science and technology, stimulating the growing demand for economic and cultural exchanges in various countries.

Machine Translation NMT +3

Paper
Code

Do As I Can, Not As I Say: Grounding Language in Robotic Affordances

3 code implementations • 4 Apr 2022 • Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Chuyuan Fu, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan, Andy Zeng

We show how low-level skills can be combined with large language models so that the language model provides high-level knowledge about the procedures for performing complex and temporally-extended instructions, while value functions associated with these skills provide the grounding necessary to connect this knowledge to a particular physical environment.

Decision Making Language Modelling +1

163

Paper
Code

Multi-Robot Active Mapping via Neural Bipartite Graph Matching

no code implementations • CVPR 2022 • Kai Ye, Siyan Dong, Qingnan Fan, He Wang, Li Yi, Fei Xia, Jue Wang, Baoquan Chen

Previous approaches either choose the frontier as the goal position via a myopic solution that hinders time efficiency, or maximize the long-term value via reinforcement learning to directly regress the goal position, but do not guarantee complete map construction.

Graph Matching Position +2

Paper
Add Code

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

15 code implementations • 28 Jan 2022 • Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou

We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning.

Ranked #36 on Common Sense Reasoning on CommonsenseQA

Common Sense Reasoning GSM8K +2

17,416

Paper
Code

ADBCMM: Acronym Disambiguation by Building Counterfactuals and Multilingual Mixing

1 code implementation • 8 Dec 2021 • Yixuan Weng, Fei Xia, Bin Li, Xiusheng Huang, Shizhu He

To address the above issue, this paper proposes a new method for acronym disambiguation, named ADBCMM, which can significantly improve the performance of low-resource languages by building counterfactuals and multilingual mixing.

Task 2

Paper
Code

SimCLAD: A Simple Framework for Contrastive Learning of Acronym Disambiguation

no code implementations • 29 Nov 2021 • Bin Li, Fei Xia, Yixuan Weng, Xiusheng Huang, Bin Sun

In this paper, we propose a Simple framework for Contrastive Learning of Acronym Disambiguation (SimCLAD) method to better understand the acronym meanings.

Contrastive Learning document understanding +1

Paper
Add Code

PSG: Prompt-based Sequence Generation for Acronym Extraction

no code implementations • 29 Nov 2021 • Bin Li, Fei Xia, Yixuan Weng, Xiusheng Huang, Bin Sun, Shutao Li

In this paper, we propose a Prompt-based Sequence Generation (PSG) method for the acronym extraction task.

document understanding Language Modelling +1

Paper
Add Code

Extracting and Inferring Personal Attributes from Dialogue

1 code implementation • NLP4ConvAI (ACL) 2022 • Zhilin Wang, Xuhui Zhou, Rik Koncel-Kedziorski, Alex Marin, Fei Xia

Personal attributes represent structured information about a person, such as their hobbies, pets, family, likes and dislikes.

Attribute Language Modelling

Paper
Code

Auto-Split: A General Framework of Collaborative Edge-Cloud AI

1 code implementation • 30 Aug 2021 • Amin Banitalebi-Dehkordi, Naveen Vedula, Jian Pei, Fei Xia, Lanjun Wang, Yong Zhang

At the same time, large amounts of input data are collected at the edge of cloud.

Paper
Code

iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks

1 code implementation • 6 Aug 2021 • Chengshu Li, Fei Xia, Roberto Martín-Martín, Michael Lingelbach, Sanjana Srivastava, Bokui Shen, Kent Vainio, Cem Gokmen, Gokul Dharan, Tanish Jain, Andrey Kurenkov, C. Karen Liu, Hyowon Gweon, Jiajun Wu, Li Fei-Fei, Silvio Savarese

We evaluate the new capabilities of iGibson 2.0 to enable robot learning of novel tasks, in the hope of demonstrating the potential of this new simulator to support new research in embodied AI.

Imitation Learning

606

Paper
Code

BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments

no code implementations • 6 Aug 2021 • Sanjana Srivastava, Chengshu Li, Michael Lingelbach, Roberto Martín-Martín, Fei Xia, Kent Vainio, Zheng Lian, Cem Gokmen, Shyamal Buch, C. Karen Liu, Silvio Savarese, Hyowon Gweon, Jiajun Wu, Li Fei-Fei

We introduce BEHAVIOR, a benchmark for embodied AI with 100 activities in simulation, spanning a range of everyday household chores such as cleaning, maintenance, and food preparation.

Paper
Add Code

A Masked Segmental Language Model for Unsupervised Natural Language Segmentation

1 code implementation • NAACL (SIGMORPHON) 2022 • C. M. Downey, Fei Xia, Gina-Anne Levow, Shane Steinert-Threlkeld

Segmentation remains an important preprocessing step both in languages where "words" or other important syntactic/semantic units (like morphemes) are not clearly delineated by white space, as well as when dealing with continuous speech data, where there is often no meaningful pause between words.

Language Modelling Segmentation

Paper
Code

QoS-Aware Power Minimization of Distributed Many-Core Servers using Transfer Q-Learning

no code implementations • 2 Feb 2021 • Dainius Jenkus, Fei Xia, Rishad Shafik, Alex Yakovlev

Then, it is coupled with vertical scaling using transfer Q-learning, which further tunes power/performance based on workload profile using dynamic voltage/frequency scaling (DVFS).

Q-Learning

Paper
Add Code

Towards Accurate Active Camera Localization

1 code implementation • 8 Dec 2020 • Qihang Fang, Yingda Yin, Qingnan Fan, Fei Xia, Siyan Dong, Sheng Wang, Jue Wang, Leonidas Guibas, Baoquan Chen

These approaches localize the camera in the discrete pose space and are agnostic to the localization-driven scene property, which restricts the camera pose accuracy in the coarse scale.

Camera Localization Pose Estimation +1

Paper
Code

iGibson 1.0: a Simulation Environment for Interactive Tasks in Large Realistic Scenes

2 code implementations • 5 Dec 2020 • Bokui Shen, Fei Xia, Chengshu Li, Roberto Martín-Martín, Linxi Fan, Guanzhi Wang, Claudia Pérez-D'Arpino, Shyamal Buch, Sanjana Srivastava, Lyne P. Tchapmi, Micael E. Tchapmi, Kent Vainio, Josiah Wong, Li Fei-Fei, Silvio Savarese

We present iGibson 1.0, a novel simulation environment to develop robotic solutions for interactive tasks in large-scale realistic scenes.

Imitation Learning

606

Paper
Code

Joint Chinese Word Segmentation and Part-of-speech Tagging via Multi-channel Attention of Character N-grams

1 code implementation • COLING 2020 • Yuanhe Tian, Yan Song, Fei Xia

However, their work on modeling such contextual features is limited to concatenating the features or their embeddings directly with the input embeddings without distinguishing whether the contextual features are important for the joint task in the specific context.

Chinese Word Segmentation Part-Of-Speech Tagging +2

Paper
Code

Summarizing Medical Conversations via Identifying Important Utterances

1 code implementation • COLING 2020 • Yan Song, Yuanhe Tian, Nan Wang, Fei Xia

For the particular dataset used in this study, we show that high-quality summaries can be generated by extracting two types of utterances, namely, problem statements and treatment recommendations.

Paper
Code

NLPStatTest: A Toolkit for Comparing NLP System Performance

1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Haotian Zhu, Denise Mak, Jesse Gioannini, Fei Xia

The toolkit provides a convenient and systematic way to compare NLP system performance that goes beyond statistical significance testing.

Paper
Code

Improving Biomedical Named Entity Recognition with Syntactic Information

1 code implementation • BMC Bioinformatics 2020 • Yuanhe Tian, Wang Shen, Yan Song, Fei Xia, Min He, Kenli Li

The experimental results on six English benchmark datasets demonstrate that auto-processed syntactic information can be a useful resource for BioNER and our method with KVMN can appropriately leverage such information to improve model performance.

Ranked #1 on Named Entity Recognition (NER) on Species-800

named-entity-recognition Named Entity Recognition +2

Paper
Code

Improving Constituency Parsing with Span Attention

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Yuanhe Tian, Yan Song, Fei Xia, Tong Zhang

Constituency parsing is a fundamental and important task for natural language understanding, where a good representation of contextual information can help this task.

Ranked #1 on Constituency Parsing on ATB

Constituency Parsing Natural Language Understanding +1

Paper
Code

Supertagging Combinatory Categorial Grammar with Attentive Graph Convolutional Networks

1 code implementation • EMNLP 2020 • Yuanhe Tian, Yan Song, Fei Xia

Specifically, we build the graph from chunks (n-grams) extracted from a lexicon and apply attention over the graph, so that different word pairs from the contexts within and across chunks are weighted in the model and facilitate the supertagging accordingly.

Ranked #2 on CCG Supertagging on CCGbank

CCG Supertagging

Paper
Code

ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation

no code implementations • 18 Aug 2020 • Fei Xia, Chengshu Li, Roberto Martín-Martín, Or Litany, Alexander Toshev, Silvio Savarese

To validate our method, we apply ReLMoGen to two types of tasks: 1) Interactive Navigation tasks, navigation problems where interactions with the environment are required to reach the destination, and 2) Mobile Manipulation tasks, manipulation tasks that require moving the robot base.

Continuous Control Hierarchical Reinforcement Learning +2

Paper
Add Code

Studying Challenges in Medical Conversation with Structured Annotation

no code implementations • WS 2020 • Nan Wang, Yan Song, Fei Xia

Medical conversation is a central part of medical care.

Paper
Add Code

Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-way Attentions of Auto-analyzed Knowledge

1 code implementation • ACL 2020 • Yuanhe Tian, Yan Song, Xiang Ao, Fei Xia, Xiaojun Quan, Tong Zhang, Yonggang Wang

Chinese word segmentation (CWS) and part-of-speech (POS) tagging are important fundamental tasks for Chinese language processing, where joint learning of them is an effective one-step solution for both tasks.

Chinese Word Segmentation Part-Of-Speech Tagging +2

Paper
Code

Improving Chinese Word Segmentation with Wordhood Memory Networks

1 code implementation • ACL 2020 • Yuanhe Tian, Yan Song, Fei Xia, Tong Zhang, Yonggang Wang

Contextual features always play an important role in Chinese word segmentation (CWS).

Ranked #1 on Chinese Word Segmentation on CITYU

Chinese Word Segmentation Decoder

173

Paper
Code

Interactive Gibson Benchmark (iGibson 0.5): A Benchmark for Interactive Navigation in Cluttered Environments

1 code implementation • 30 Oct 2019 • Fei Xia, William B. Shen, Chengshu Li, Priya Kasimbeg, Micael Tchapmi, Alexander Toshev, Li Fei-Fei, Roberto Martín-Martín, Silvio Savarese

We present Interactive Gibson Benchmark, the first comprehensive benchmark for training and evaluating Interactive Navigation: robot navigation strategies where physical interaction with objects is allowed and even encouraged to accomplish a task.

Robot Navigation

606

Paper
Code

HRL4IN: Hierarchical Reinforcement Learning for Interactive Navigation with Mobile Manipulators

1 code implementation • 24 Oct 2019 • Chengshu Li, Fei Xia, Roberto Martin-Martin, Silvio Savarese

Different from other HRL solutions, HRL4IN handles the heterogeneous nature of the Interactive Navigation task by creating subgoals in different spaces in different phases of the task.

Hierarchical Reinforcement Learning reinforcement-learning +1

Paper
Code

Neural Network Design for Energy-Autonomous AI Applications using Temporal Encoding

no code implementations • 15 Oct 2019 • Sergey Mileiko, Thanasin Bunnam, Fei Xia, Rishad Shafik, Alex Yakovlev, Shidhartha Das

We design a PWM-based perceptron which can serve as the fundamental building block for NNs, by using an entirely new method of realising arithmetic in the PWM domain.

Paper
Add Code

WTMED at MEDIQA 2019: A Hybrid Approach to Biomedical Natural Language Inference

1 code implementation • WS 2019 • Zhaofeng Wu, Yan Song, Sicong Huang, Yuanhe Tian, Fei Xia

Natural language inference (NLI) is challenging, especially when it is applied to technical domains such as biomedical settings.

Natural Language Inference

Paper
Code

ChiMed: A Chinese Medical Corpus for Question Answering

1 code implementation • WS 2019 • Yuanhe Tian, Weicheng Ma, Fei Xia, Yan Song

Question answering (QA) is a challenging task in natural language processing (NLP), especially when it is applied to specific domains.

Question Answering

Paper
Code

PoseRBPF: A Rao-Blackwellized Particle Filter for 6D Object Pose Tracking

1 code implementation • 22 May 2019 • Xinke Deng, Arsalan Mousavian, Yu Xiang, Fei Xia, Timothy Bretl, Dieter Fox

In this work, we formulate the 6D object pose tracking problem in the Rao-Blackwellized particle filtering framework, where the 3D rotation and the 3D translation of an object are decoupled.

6D Pose Estimation 6D Pose Estimation using RGB +3

130

Paper
Code

A Behavioral Approach to Visual Navigation with Graph Localization Networks

no code implementations • 1 Mar 2019 • Kevin Chen, Juan Pablo de Vicente, Gabriel Sepulveda, Fei Xia, Alvaro Soto, Marynel Vazquez, Silvio Savarese

Inspired by research in psychology, we introduce a behavioral approach for visual navigation using topological maps.

Navigate Visual Navigation

Paper
Add Code

Composite Shape Modeling via Latent Space Factorization

no code implementations • ICCV 2019 • Anastasia Dubrovina, Fei Xia, Panos Achlioptas, Mira Shalah, Raphael Groscot, Leonidas Guibas

We present a novel neural network architecture, termed Decomposer-Composer, for semantic structure-aware 3D shape modeling.

3D Shape Modeling

Paper
Add Code

Gibson Env: Real-World Perception for Embodied Agents

5 code implementations • CVPR 2018 • Fei Xia, Amir Zamir, Zhi-Yang He, Alexander Sax, Jitendra Malik, Silvio Savarese

Developing visual perception models for active agents and sensorimotor control is cumbersome to do in the physical world, as existing algorithms are too slow to efficiently learn in real-time and robots are fragile and costly.

Domain Adaptation General Reinforcement Learning +1

825

Paper
Code

Coding Structures and Actions with the COSTA Scheme in Medical Conversations

no code implementations • WS 2018 • Nan Wang, Yan Song, Fei Xia

This paper describes the COSTA scheme for coding structures and actions in conversation.

Decision Making

Paper
Add Code

VUNet: Dynamic Scene View Synthesis for Traversability Estimation using an RGB Camera

no code implementations • 22 Jun 2018 • Noriaki Hirose, Amir Sadeghian, Fei Xia, Roberto Martin-Martin, Silvio Savarese

We present VUNet, a novel view (VU) synthesis method for mobile robots in dynamic environments, and its application to the estimation of future traversability.

Autonomous Vehicles

Paper
Add Code

PDF-to-Text Reanalysis for Linguistic Data Mining

no code implementations • LREC 2018 • Michael Wayne Goodman, Ryan Georgi, Fei Xia

Optical Character Recognition (OCR)

Paper
Add Code

Constructing a Chinese Medical Conversation Corpus Annotated with Conversational Structures and Actions

no code implementations • LREC 2018 • Nan Wang, Yan Song, Fei Xia

Paper
Add Code

NeuralFDR: Learning Discovery Thresholds from Hypothesis Features

1 code implementation • NeurIPS 2017 • Fei Xia, Martin J. Zhang, James Zou, David Tse

For example, in genetic association studies, each hypothesis tests the correlation between a variant and the trait.

Paper
Code

CROWD-IN-THE-LOOP: A Hybrid Approach for Annotating Semantic Roles

no code implementations • EMNLP 2017 • Chenguang Wang, Alan Akbik, Laura Chiticariu, Yunyao Li, Fei Xia, Anbang Xu

Crowdsourcing has proven to be an effective method for generating labeled data for a range of NLP tasks.

Machine Translation Question Answering +1

Paper
Add Code

Learning Word Representations with Regularization from Prior Knowledge

no code implementations • CONLL 2017 • Yan Song, Chia-Jung Lee, Fei Xia

This paper presents a unified framework that leverages pre-learned or external priors, in the form of a regularizer, for enhancing conventional language model-based embedding learning.

Language Modelling Learning Word Embeddings +3

Paper
Add Code

Computational Support for Finding Word Classes: A Case Study of Abui

no code implementations • WS 2017 • Olga Zamaraeva, František Kratochvíl, Emily M. Bender, Fei Xia, Kristen Howell

Paper
Add Code

Inferring Case Systems from IGT: Enriching the Enrichment

no code implementations • WS 2017 • Kristen Howell, Emily M. Bender, Michel Lockwood, Fei Xia, Olga Zamaraeva

Paper
Add Code

STREAMLInED Challenges: Aligning Research Interests with Shared Tasks

no code implementations • WS 2017 • Gina-Anne Levow, Emily M. Bender, Patrick Littell, Kristen Howell, Shobhana Chelliah, Joshua Crowgey, Dan Garrette, Jeff Good, Sharon Hargus, David Inman, Michael Maxwell, Michael Tjalve, Fei Xia

Paper
Add Code

A Web-framework for ODIN Annotation

no code implementations • ACL 2016 • Ryan Georgi, Michael Wayne Goodman, Fei Xia

Paper
Add Code

Capturing divergence in dependency trees to improve syntactic projection

no code implementations • 14 May 2016 • Ryan Georgi, Fei Xia, William D. Lewis

These patterns can then be used to improve structural projection algorithms, allowing for better performing NLP tools for resource-poor languages, in particular those that may not have large amounts of annotated data necessary for traditional, fully-supervised methods.

Word Alignment

Paper
Add Code

Annotating and Detecting Medical Events in Clinical Notes

no code implementations • LREC 2016 • Prescott Klassen, Fei Xia, Meliha Yetisgen

Early detection and treatment of diseases that onset after a patient is admitted to a hospital, such as pneumonia, is critical to improving and reducing costs in healthcare.

Negation

Paper
Add Code

Enriching Interlinear Text using Automatically Constructed Annotators

no code implementations • WS 2015 • Ryan Georgi, Fei Xia, William Lewis

Paper
Add Code

Learning Grammar Specifications from IGT: A Case Study of Chintang

no code implementations • WS 2014 • Emily M. Bender, Joshua Crowgey, Michael Wayne Goodman, Fei Xia

Paper
Add Code

Unsupervised Dependency Parsing with Transferring Distribution via Parallel Guidance and Entropy Regularization

no code implementations • ACL 2014 • Xuezhe Ma, Fei Xia

Machine Translation Relation Extraction +1

Paper
Add Code

Enriching ODIN

no code implementations • LREC 2014 • Fei Xia, William Lewis, Michael Wayne Goodman, Joshua Crowgey, Emily M. Bender

In this paper, we describe the expansion of the ODIN resource, a database containing many thousands of instances of Interlinear Glossed Text (IGT) for over a thousand languages harvested from scholarly linguistic papers posted to the Web.

Paper
Add Code

Modern Chinese Helps Archaic Chinese Processing: Finding and Exploiting the Shared Properties

no code implementations • LREC 2014 • Yan Song, Fei Xia

Languages change over time and ancient languages have been studied in linguistics and other related fields.

POS Sentiment Analysis

Paper
Add Code

Annotating Clinical Events in Text Snippets for Phenotype Detection

no code implementations • LREC 2014 • Prescott Klassen, Fei Xia, Lucy Vanderwende, Meliha Yetisgen

Early detection and treatment of diseases that onset after a patient is admitted to a hospital, such as pneumonia, is critical to improving and reducing costs in healthcare.

Pneumonia Detection

Paper
Add Code

Discriminative Relational Topic Models

no code implementations • 9 Oct 2013 • Ning Chen, Jun Zhu, Fei Xia, Bo Zhang

Many scientific and engineering fields involve analyzing network data.

Bayesian Inference Data Augmentation +1

Paper
Add Code

A Common Case of Jekyll and Hyde: The Synergistic Effect of Using Divided Source Training Data for Feature Augmentation

no code implementations • IJCNLP 2013 • Yan Song, Fei Xia

Chinese Word Segmentation Domain Adaptation +1

Paper
Add Code

Towards Creating Precision Grammars from Interlinear Glossed Text: Inferring Large-Scale Typological Properties

no code implementations • WS 2013 • Emily M. Bender, Michael Wayne Goodman, Joshua Crowgey, Fei Xia

Paper
Add Code

Dependency Parser Adaptation with Subtrees from Auto-Parsed Target Domain Data

no code implementations • ACL 2013 • Xuezhe Ma, Fei Xia

Constituency Parsing Dependency Parsing +1

Paper
Add Code

Enhanced and Portable Dependency Projection Algorithms Using Interlinear Glossed Text

no code implementations • ACL 2013 • Ryan Georgi, Fei Xia, William D. Lewis

Word Alignment

Paper
Add Code

Annotating Change of State for Clinical Events

no code implementations • WS 2013 • Lucy Vanderwende, Fei Xia, Meliha Yetisgen-Yildiz

Paper
Add Code

Improving Dependency Parsing with Interlinear Glossed Text and Syntactic Projection

no code implementations • COLING 2012 • Ryan Georgi, Fei Xia, William Lewis

Dependency Parsing Word Alignment

Paper
Add Code

Entropy-based Training Data Selection for Domain Adaptation

no code implementations • COLING 2012 • Yan Song, Prescott Klassen, Fei Xia, Chunyu Kit

Chinese Word Segmentation Domain Adaptation +2

Paper
Add Code

Creating a Tree Adjoining Grammar from a Multilayer Treebank

no code implementations • WS 2012 • Rajesh Bhatt, Owen Rambow, Fei Xia

Paper
Add Code

Effort of Genre Variation and Prediction of System Performance

no code implementations • LREC 2012 • Dong Wang, Fei Xia

Our experiments show that the predicted scores are close to the real scores when tested on the CTB data.

Domain Adaptation Language Modelling +3

Paper
Add Code

Using a Goodness Measurement for Domain Adaptation: A Case Study on Chinese Word Segmentation

no code implementations • LREC 2012 • Yan Song, Fei Xia

Domain adaptation is an important topic for natural language processing.

Chinese Word Segmentation Domain Adaptation +1

Paper
Add Code

Statistical Section Segmentation in Free-Text Clinical Records

no code implementations • LREC 2012 • Michael Tepper, Daniel Capurro, Fei Xia, Lucy Vanderwende, Meliha Yetisgen-Yildiz

Automatically segmenting and classifying clinical free text into sections is an important first step to automatic information retrieval, information extraction and data mining tasks, as it helps to ground the significance of the text within.

General Classification Information Retrieval +4

Paper
Add Code

Measuring the Divergence of Dependency Structures Cross-Linguistically to Improve Syntactic Projection Algorithms

no code implementations • LREC 2012 • Ryan Georgi, Fei Xia, William Lewis

Syntactic parses can provide valuable information for many NLP tasks, such as machine translation, semantic analysis, etc.

Machine Translation Translation

Paper
Add Code
