Search Results for author: Taro Watanabe

Found 61 papers, 18 papers with code

What Works and Doesn’t Work, A Deep Decoder for Neural Machine Translation

no code implementations Findings (ACL) 2022 Zuchao Li, Yiran Wang, Masao Utiyama, Eiichiro Sumita, Hai Zhao, Taro Watanabe

Inspired by this discovery, we then propose approaches to improving it, with respect to model structure and model training, to make the deep decoder practical in NMT.

Language Modelling Machine Translation +2

Universal Dependencies Treebank for Tatar: Incorporating Intra-Word Code-Switching Information

no code implementations EURALI (LREC) 2022 Chihiro Taguchi, Sei Iwata, Taro Watanabe

Experimenting on NMCTT and the Turkish-German CS treebank (SAGT), we demonstrate that the annotation scheme introduced in NMCTT can improve the performance of subword-level language identification.

Language Identification POS +1

Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair

no code implementations18 Apr 2024 Yusuke Sakai, Mana Makinae, Hidetaka Kamigaito, Taro Watanabe

In Simultaneous Machine Translation (SiMT) systems, training with a simultaneous interpretation (SI) corpus is an effective method for achieving high-quality yet low-latency systems.

Machine Translation Translation

JDocQA: Japanese Document Question Answering Dataset for Generative Language Models

no code implementations28 Mar 2024 Eri Onami, Shuhei Kurita, Taiki Miyanishi, Taro Watanabe

Document question answering is the task of answering questions about given documents such as reports, slides, pamphlets, and websites; it is a truly demanding task because both paper and electronic documents are ubiquitous in our society.

Hallucination Question Answering +1

Cross-lingual Contextualized Phrase Retrieval

1 code implementation25 Mar 2024 Huayang Li, Deng Cai, Zhi Qu, Qu Cui, Hidetaka Kamigaito, Lemao Liu, Taro Watanabe

In our work, we propose a new task formulation of dense retrieval, cross-lingual contextualized phrase retrieval, which aims to augment cross-lingual applications by addressing polysemy using context information.

Contrastive Learning Language Modelling +4
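
Since the snippet above describes dense retrieval of phrases in context, here is a minimal sketch of that idea: phrase vectors are pooled from contextualized token embeddings, and cross-lingual candidates are ranked by cosine similarity. The mean-pooling choice and the random toy data are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

def phrase_embedding(token_embs, span):
    """Mean-pool contextualized token vectors inside a phrase span and
    L2-normalize (the pooling choice is an assumption, not the paper's)."""
    start, end = span
    vec = token_embs[start:end].mean(axis=0)
    return vec / np.linalg.norm(vec)

def retrieve(query_vec, index_vecs, top_k=5):
    """Rank indexed target-language phrase vectors by cosine similarity
    (dot product of pre-normalized vectors) and return the top-k indices."""
    return np.argsort(-(index_vecs @ query_vec))[:top_k]

# Toy usage: 10 candidate phrases with 768-dim contextual embeddings.
rng = np.random.default_rng(0)
index = rng.normal(size=(10, 768))
index /= np.linalg.norm(index, axis=1, keepdims=True)
query = phrase_embedding(rng.normal(size=(12, 768)), span=(3, 6))
print(retrieve(query, index))
```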

Distilling Named Entity Recognition Models for Endangered Species from Large Language Models

no code implementations13 Mar 2024 Jesse Atuhurra, Seiveright Cargill Dujohn, Hidetaka Kamigaito, Hiroyuki Shindo, Taro Watanabe

Natural language processing (NLP) practitioners are leveraging large language models (LLM) to create structured datasets from semi-structured and unstructured data sources such as patents, papers, and theses, without having domain-specific knowledge.

In-Context Learning Knowledge Distillation +5

Artwork Explanation in Large-scale Vision Language Models

no code implementations29 Feb 2024 Kazuki Hayashi, Yusuke Sakai, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

To address this issue, we propose a new task: the artwork explanation generation task, along with its evaluation dataset and metric for quantitatively assessing the understanding and utilization of knowledge about artworks.

Explanation Generation Text Generation

Do LLMs Implicitly Determine the Suitable Text Difficulty for Users?

1 code implementation22 Feb 2024 Seiji Gobara, Hidetaka Kamigaito, Taro Watanabe

Experimental results on the Stack-Overflow dataset and the TSCC dataset, which includes multi-turn conversations, show that LLMs can implicitly handle text difficulty between the user's input and the generated response.

Question Answering

Evaluating Image Review Ability of Vision Language Models

no code implementations19 Feb 2024 Shigeki Saito, Kazuki Hayashi, Yusuke Ide, Yusuke Sakai, Kazuma Onishi, Toma Suzuki, Seiji Gobara, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

Large-scale vision language models (LVLMs) are language models capable of processing image and text inputs within a single model.

Image Captioning

Centroid-Based Efficient Minimum Bayes Risk Decoding

no code implementations17 Feb 2024 Hiroyuki Deguchi, Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe, Hideki Tanaka, Masao Utiyama

Minimum Bayes risk (MBR) decoding achieved state-of-the-art translation performance by using COMET, a neural metric that has a high correlation with human evaluation.

Translation
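
For reference, a minimal sketch of standard sampling-based MBR decoding, the O(n²) baseline that the centroid-based method accelerates; `utility` stands in for a neural metric such as COMET.

```python
def mbr_decode(candidates, utility):
    """Select the candidate with the highest expected utility against all
    candidates used as pseudo-references. This needs O(n^2) utility calls,
    which is the cost centroid-based MBR reduces."""
    best, best_score = None, float("-inf")
    for hyp in candidates:
        score = sum(utility(hyp, ref) for ref in candidates) / len(candidates)
        if score > best_score:
            best, best_score = hyp, score
    return best

# Toy usage with token overlap standing in for COMET.
def overlap(hyp, ref):
    h, r = set(hyp.split()), set(ref.split())
    return len(h & r) / max(len(h | r), 1)

print(mbr_decode(["the cat sat", "a cat sat", "dogs run fast"], overlap))
```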

Generating Diverse Translation with Perturbed kNN-MT

no code implementations14 Feb 2024 Yuto Nishida, Makoto Morishita, Hidetaka Kamigaito, Taro Watanabe

Generating multiple translation candidates would enable users to choose the one that satisfies their needs.

Machine Translation Translation
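
As background, a sketch of the kNN-MT interpolation this work builds on; the `noise` argument is a hedged stand-in for the paper's perturbation, added on the assumption that perturbing the retrieval query changes which neighbors are retrieved and hence diversifies candidates.

```python
import numpy as np

def knn_mt_probs(p_nmt, query, keys, values, vocab_size,
                 k=4, temp=10.0, lam=0.5, noise=0.0, rng=None):
    """Mix the NMT output distribution with a distribution induced from the
    k nearest (hidden-state key -> target-token value) datastore entries.
    A nonzero `noise` perturbs the query vector (hypothetical stand-in for
    the paper's perturbation), changing which neighbors are retrieved."""
    if noise > 0.0:
        rng = rng or np.random.default_rng()
        query = query + rng.normal(scale=noise, size=query.shape)
    dists = np.linalg.norm(keys - query, axis=1)   # L2 distance to all keys
    nn = np.argsort(dists)[:k]                     # k nearest neighbors
    w = np.exp(-dists[nn] / temp)
    w /= w.sum()
    p_knn = np.zeros(vocab_size)
    for weight, i in zip(w, nn):
        p_knn[values[i]] += weight                 # mass on neighbor tokens
    return lam * p_knn + (1.0 - lam) * p_nmt
```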

knn-seq: Efficient, Extensible kNN-MT Framework

1 code implementation18 Oct 2023 Hiroyuki Deguchi, Hayate Hirano, Tomoki Hoshino, Yuto Nishida, Justin Vasselli, Taro Watanabe

We publish our knn-seq as an MIT-licensed open-source project and the code is available at https://github.com/naist-nlp/knn-seq.

Machine Translation NMT +1

Model-based Subsampling for Knowledge Graph Completion

1 code implementation17 Sep 2023 Xincan Feng, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

Subsampling is effective in Knowledge Graph Embedding (KGE) for reducing overfitting caused by the sparsity in Knowledge Graph (KG) datasets.

Knowledge Graph Completion Knowledge Graph Embedding
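
As background, a sketch of count-based subsampling for KGE training, which down-weights frequent triples. The inverse-power weighting and the (h, r)/(r, t) count approximation are assumptions in the spirit of word2vec-style subsampling; the paper's model-based variant would replace these raw counts with frequencies estimated by a trained KGE model.

```python
from collections import Counter

def subsampling_weights(triples, alpha=0.5):
    """Down-weight frequent triples with an inverse-power weight.
    Counting (h, r) and (r, t) pairs approximates triple frequency, since
    exact (h, r, t) counts in a KG are almost always 1; a model-based
    variant would substitute model-estimated frequencies for these counts."""
    counts = Counter()
    for h, r, t in triples:
        counts[("hr", h, r)] += 1
        counts[("rt", r, t)] += 1
    weights = {
        (h, r, t): (counts[("hr", h, r)] + counts[("rt", r, t)]) ** -alpha
        for h, r, t in triples
    }
    z = sum(weights.values())
    return {triple: w / z for triple, w in weights.items()}
```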

Japanese Lexical Complexity for Non-Native Readers: A New Dataset

2 code implementations30 Jun 2023 Yusuke Ide, Masato Mita, Adam Nohejl, Hiroki Ouchi, Taro Watanabe

Lexical complexity prediction (LCP) is the task of predicting the complexity of words in a text on a continuous scale.

Lexical Complexity Prediction

Second Language Acquisition of Neural Language Models

1 code implementation5 Jun 2023 Miyu Oba, Tatsuki Kuribayashi, Hiroki Ouchi, Taro Watanabe

With the success of neural language models (LMs), how they acquire language has gained much attention.

Cross-Lingual Transfer Language Acquisition

Table and Image Generation for Investigating Knowledge of Entities in Pre-trained Vision and Language Models

1 code implementation3 Jun 2023 Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

This task consists of two parts: the first is to generate a table containing knowledge about an entity and its related image, and the second is to generate an image from an entity with a caption and a table containing related knowledge of the entity.

Image Generation

Arukikata Travelogue Dataset

no code implementations19 May 2023 Hiroki Ouchi, Hiroyuki Shindo, Shoko Wakamiya, Yuki Matsuda, Naoya Inoue, Shohei Higashiyama, Satoshi Nakamura, Taro Watanabe

We have constructed the Arukikata Travelogue Dataset and released it free of charge for academic research.

Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning

1 code implementation6 Dec 2022 Ukyo Honda, Taro Watanabe, Yuji Matsumoto

Discriminativeness is a desirable feature of image captions: captions should describe the characteristic details of input images.

Image Captioning reinforcement-learning +1

$N$-gram Is Back: Residual Learning of Neural Text Generation with $n$-gram Language Model

1 code implementation26 Oct 2022 Huayang Li, Deng Cai, Jin Xu, Taro Watanabe

The combination of $n$-gram and neural LMs not only allows the neural part to focus on the deeper understanding of language but also provides a flexible way to customize an LM by switching the underlying $n$-gram model without changing the neural model.

Domain Adaptation Language Modelling +2
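
A hedged sketch of one way to realize the combination the snippet describes: the n-gram LM contributes a base log-probability and the neural model a residual correction, so swapping the n-gram model re-targets the LM without retraining the neural part. The log-linear form shown is an illustrative assumption, not necessarily the paper's exact residual formulation.

```python
import numpy as np

def combined_log_probs(log_p_ngram, neural_logits):
    """Log-linear combination over the vocabulary: the neural logits act
    as a residual correction on the n-gram base distribution. Swapping
    `log_p_ngram` (e.g., for a domain-specific n-gram LM) customizes the
    output without retraining the neural model."""
    scores = log_p_ngram + neural_logits
    return scores - np.logaddexp.reduce(scores)  # renormalize in log space
```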

Adapting to Non-Centered Languages for Zero-shot Multilingual Translation

1 code implementation COLING 2022 Zhi Qu, Taro Watanabe

Multilingual neural machine translation can translate unseen language pairs during training, i.e., zero-shot translation.

Machine Translation Translation

Improved Decomposition Strategy for Joint Entity and Relation Extraction

no code implementations Journal of Natural Language Processing 2021 Van-Hien Tran, Van-Thuy Phi, Akihiko Kato, Hiroyuki Shindo, Taro Watanabe, Yuji Matsumoto

A recent study (Yu et al. 2020) proposed a novel decomposition strategy that splits the task into two interrelated subtasks: detection of the head-entity (HE) and identification of the corresponding tail-entity and relation (TER) for each extracted head-entity.

Joint Entity and Relation Extraction Relation +1

Transductive Data Augmentation with Relational Path Rule Mining for Knowledge Graph Embedding

no code implementations1 Nov 2021 Yushi Hirose, Masashi Shimbo, Taro Watanabe

For knowledge graph completion, two major types of prediction models exist: one based on graph embeddings, and the other based on relation path rule induction.

Data Augmentation Knowledge Graph Completion +2
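
A toy sketch of the relation-path-rule idea behind this kind of augmentation: if a two-hop path (r1, r2) reliably co-occurs with a direct relation r, new triples can be synthesized along that path. The rule shape and confidence threshold here are illustrative assumptions, not the paper's exact mining procedure.

```python
from collections import defaultdict

def augment_with_path_rules(triples, min_conf=0.8):
    """Mine rules (r1, r2) => r from two-hop paths and add the predicted
    triples whose rule confidence clears `min_conf`."""
    out = defaultdict(list)                 # head -> [(relation, tail)]
    direct = set(triples)
    for h, r, t in triples:
        out[h].append((r, t))

    support, hits = defaultdict(int), defaultdict(int)
    for h, r1, x in triples:                # path: h -r1-> x -r2-> t
        for r2, t in out[x]:
            support[(r1, r2)] += 1
            for r, t2 in out[h]:            # does a direct edge agree?
                if t2 == t:
                    hits[(r1, r2, r)] += 1

    augmented = set()
    for (r1, r2, r), n in hits.items():
        if n / support[(r1, r2)] < min_conf:
            continue
        for h, ra, x in triples:            # fire the rule everywhere
            if ra != r1:
                continue
            for rb, t in out[x]:
                if rb == r2 and (h, r, t) not in direct:
                    augmented.add((h, r, t))
    return augmented
```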

Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning

1 code implementation EACL 2021 Ukyo Honda, Yoshitaka Ushiku, Atsushi Hashimoto, Taro Watanabe, Yuji Matsumoto

Unsupervised image captioning is a challenging task that aims at generating captions without the supervision of image-sentence pairs, but only with images and sentences drawn from different sources and object labels detected from the images.

Image Captioning image-sentence alignment +2

Denoising Neural Machine Translation Training with Trusted Data and Online Data Selection

no code implementations WS 2018 Wei Wang, Taro Watanabe, Macduff Hughes, Tetsuji Nakagawa, Ciprian Chelba

Measuring the domain relevance of data and identifying or selecting well-fit domain data for machine translation (MT) is a well-studied topic, but denoising is not yet well studied.

Denoising Machine Translation +2

Phrase-based Machine Translation using Multiple Preordering Candidates

no code implementations COLING 2016 Yusuke Oda, Taku Kudo, Tetsuji Nakagawa, Taro Watanabe

In this paper, we propose a new decoding method for phrase-based statistical machine translation which directly uses multiple preordering candidates as a graph structure.

Machine Translation Translation
