Search Results for author: Nan Tang

Found 18 papers, 8 papers with code

BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation

no code implementations • 27 May 2024 • Chengxing Jia, Pengyuan Wang, Ziniu Li, Yi-Chen Li, Zhilong Zhang, Nan Tang, Yang Yu

In a similar vein, our proposed system, the BWArea model, conceptualizes language generation as a decision-making task.

Evaluating Task-based Effectiveness of MLLMs on Charts

no code implementations • 11 May 2024 • Yifan Wu, Lutao Yan, Yuyu Luo, Yunhai Wang, Nan Tang

Thirdly, we propose a novel textual prompt strategy, named Chain-of-Charts, tailored for low-level analysis tasks, which boosts model performance by 24.36%, resulting in an accuracy of 80.49%.

Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts

no code implementations • 14 Apr 2024 • Jing-Cheng Pang, Si-Hang Yang, Kaiyuan Li, Jiaji Zhang, Xiong-Hui Chen, Nan Tang, Yang Yu

Furthermore, KALM effectively enables the LLM to comprehend environmental dynamics, resulting in the generation of meaningful imaginary rollouts that reflect novel skills and demonstrate the seamless integration of large language models and reinforcement learning.

Language Modelling • Large Language Model • +2

Empowering Language Models with Active Inquiry for Deeper Understanding

no code implementations • 6 Feb 2024 • Jing-Cheng Pang, Heng-Bo Fan, Pengyuan Wang, Jia-Hao Xiao, Nan Tang, Si-Hang Yang, Chengxing Jia, Sheng-Jun Huang, Yang Yu

The rise of large language models (LLMs) has revolutionized the way that we interact with artificial intelligence systems through natural language.

Active Learning • Language Modelling • +1

Cost-Effective In-Context Learning for Entity Resolution: A Design Space Exploration

1 code implementation • 7 Dec 2023 • Meihao Fan, Xiaoyue Han, Ju Fan, Chengliang Chai, Nan Tang, Guoliang Li, Xiaoyong Du

However, existing ICL approaches to ER typically necessitate providing a task description and a set of demonstrations for each entity pair and thus have limitations on the monetary cost of interfacing LLMs.

Entity Resolution • In-Context Learning

SEED: Domain-Specific Data Curation With Large Language Models

no code implementations • 1 Oct 2023 • Zui Chen, Lei Cao, Sam Madden, Tim Kraska, Zeyuan Shang, Ju Fan, Nan Tang, Zihui Gu, Chunwei Liu, Michael Cafarella

As a result, data scientists often have to develop domain-specific solutions tailored to both the dataset and the task, e.g., writing domain-specific code or training machine learning models on a sufficient number of annotated examples.

Code Generation • Imputation • +1

VerifAI: Verified Generative AI

no code implementations • 6 Jul 2023 • Nan Tang, Chenyu Yang, Ju Fan, Lei Cao, Yuyu Luo, Alon Halevy

We propose that verifying the outputs of generative AI from a data management perspective is an emerging issue for generative AI.

Decision Making • Knowledge Graphs • +2

Interleaving Pre-Trained Language Models and Large Language Models for Zero-Shot NL2SQL Generation

1 code implementation • 15 Jun 2023 • Zihui Gu, Ju Fan, Nan Tang, Songyue Zhang, Yuxin Zhang, Zui Chen, Lei Cao, Guoliang Li, Sam Madden, Xiaoyong Du

PLMs can perform well in schema alignment but struggle to achieve complex reasoning, while LLMs are superior in complex reasoning tasks but cannot achieve precise schema alignment.

Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration

1 code implementation • SIGMOD/PODS 2023 • Jianhong Tu, Ju Fan, Nan Tang, Peng Wang, Guoliang Li, Xiaoyong Du, Xiaofeng Jia, Song Gao

The widely used practice is to build task-specific or even dataset-specific solutions, which are hard to generalize and disable the opportunities of knowledge sharing that can be learned from different datasets and multiple tasks.

Entity Resolution • Zero-Shot Learning

ChatPipe: Orchestrating Data Preparation Program by Optimizing Human-ChatGPT Interactions

no code implementations • 7 Apr 2023 • Sibei Chen, Hanbing Liu, Weiting Jin, Xiangyu Sun, Xiaoyao Feng, Ju Fan, Xiaoyong Du, Nan Tang

Orchestrating a high-quality data preparation program is essential for successful machine learning (ML), but it is known to be time and effort consuming.

RetClean: Retrieval-Based Data Cleaning Using Foundation Models and Data Lakes

no code implementations • 29 Mar 2023 • Mohammad Shahmeer Ahmad, Zan Ahmad Naeem, Mohamed Eltabakh, Mourad Ouzzani, Nan Tang

To assist with this scenario, we developed a custom RoBERTa-based foundation model that can be locally deployed.

Retrieval

PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training

1 code implementation • 5 Nov 2022 • Zihui Gu, Ju Fan, Nan Tang, Preslav Nakov, Xiaoman Zhao, Xiaoyong Du

In particular, on the complex set of TabFact, which contains multiple operations, PASTA largely outperforms the previous state of the art by 4.7 points (85.6% vs. 80.9%), and the gap between PASTA and human performance on the small TabFact test set is narrowed to just 1.5 points (90.6% vs. 92.1%).

Ranked #2 on Table-based Fact Verification on TabFact

Fact Checking • Fact Verification • +5

Domain Adaptation for Deep Entity Resolution: A Design Space Exploration

1 code implementation • SIGMOD/PODS 2022 • Jianhong Tu, Ju Fan, Nan Tang, Peng Wang, Chengliang Chai, Guoliang Li, Ruixue Fan, Xiaoyong Du

Entity resolution (ER) is a core problem of data integration.

Ranked #2 on Entity Resolution on WDC Watches-small

Domain Adaptation • Entity Resolution

Deep learning for blocking in entity matching: a design space exploration

1 code implementation • Proceedings of the VLDB Endowment 2021 • Saravanan Thirumuruganathan, Han Li, Nan Tang, Mourad Ouzzani, Yash Govind, Derek Paulsen, Glenn Fung, AnHai Doan

In this paper, we develop the DeepBlocker framework that significantly advances the state of the art in applying DL to blocking for EM.

Ranked #5 on Blocking on Abt-Buy

Blocking

RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation

no code implementations • 4 Dec 2020 • Nan Tang, Ju Fan, Fangyi Li, Jianhong Tu, Xiaoyong Du, Guoliang Li, Sam Madden, Mourad Ouzzani

RPT is pre-trained for a tuple-to-tuple model by corrupting the input tuple and then learning a model to reconstruct the original tuple.

Decoder • Denoising • +5

Deductive Optimization of Relational Data Storage

1 code implementation • 8 Mar 2019 • John K. Feser, Samuel Madden, Nan Tang, Armando Solar-Lezama

Optimizing the physical data storage and retrieval of data are two key database management problems.

Programming Languages • Databases

Reuse and Adaptation for Entity Resolution through Transfer Learning

no code implementations • 28 Sep 2018 • Saravanan Thirumuruganathan, Shameem A Puthiya Parambath, Mourad Ouzzani, Nan Tang, Shafiq Joty

Entity resolution (ER) is one of the fundamental problems in data integration, where machine learning (ML) based classifiers often provide the state-of-the-art results.

Entity Resolution • Feature Engineering • +1

DeepER -- Deep Entity Resolution

3 code implementations • 2 Oct 2017 • Muhammad Ebraheem, Saravanan Thirumuruganathan, Shafiq Joty, Mourad Ouzzani, Nan Tang

word embeddings), we present a novel ER system, called DeepER, that achieves good accuracy, high efficiency, as well as ease-of-use (i.e., much less human efforts).

Databases

