Search Results for author: Naoki Yoshinaga

Found 36 papers, 6 papers with code

Building Large-Scale Japanese Pronunciation-Annotated Corpora for Reading Heteronymous Logograms

no code implementations • LREC 2022 • Fumikazu Sato, Naoki Yoshinaga, Masaru Kitsuregawa

In this study, to improve the accuracy of pronunciation prediction, we construct two large-scale Japanese corpora that annotate kanji characters with their pronunciations.

Sentence

Paper
Add Code

Exploratory Model Analysis Using Data-Driven Neuron Representations

no code implementations • EMNLP (BlackboxNLP) 2021 • Daisuke Oba, Naoki Yoshinaga, Masashi Toyoda

Probing classifiers have been extensively used to inspect whether a model component captures specific linguistic phenomena.

Paper
Add Code

Speculative Sampling in Variational Autoencoders for Dialogue Response Generation

1 code implementation • Findings (EMNLP) 2021 • Shoetsu Sato, Naoki Yoshinaga, Masashi Toyoda, Masaru Kitsuregawa

Our method chooses the most probable one from redundantly sampled latent variables for tying up the variable with a given response.

Response Generation

Paper
Code

Fine-grained Typing of Emerging Entities in Microblogs

no code implementations • Findings (EMNLP) 2021 • Satoshi Akasaki, Naoki Yoshinaga, Masashi Toyoda

Experiments on the Twitter datasets confirm the effectiveness of our typing model and the context selector.

Entity Typing

Paper
Add Code

Tracing the Roots of Facts in Multilingual Language Models: Independent, Shared, and Transferred Knowledge

1 code implementation • 8 Mar 2024 • Xin Zhao, Naoki Yoshinaga, Daisuke Oba

Acquiring factual knowledge for language models (LMs) in low-resource languages poses a serious challenge, thus resorting to cross-lingual transfer in multilingual LMs (ML-LMs).

Cross-Lingual Transfer Knowledge Probing +1

Paper
Code

Rethinking Response Evaluation from Interlocutor's Eye for Open-Domain Dialogue Systems

no code implementations • 4 Jan 2024 • Yuma Tsuta, Naoki Yoshinaga, Shoetsu Sato, Masashi Toyoda

Open-domain dialogue systems have started to engage in continuous conversations with humans.

Paper
Add Code

Summarization-based Data Augmentation for Document Classification

1 code implementation • 1 Dec 2023 • Yueguan Wang, Naoki Yoshinaga

Despite the prevalence of pretrained language models in natural language understanding tasks, understanding lengthy text such as document is still challenging due to the data sparseness problem.

Classification Data Augmentation +2

Paper
Code

PerPLM: Personalized Fine-tuning of Pretrained Language Models via Writer-specific Intermediate Learning and Prompts

no code implementations • 14 Sep 2023 • Daisuke Oba, Naoki Yoshinaga, Masashi Toyoda

The meanings of words and phrases depend not only on where they are used (contexts) but also on who use them (writers).

Language Modelling Masked Language Modeling

Paper
Add Code

A Unified Generative Approach to Product Attribute-Value Identification

no code implementations • 9 Jun 2023 • Keiji Shinzato, Naoki Yoshinaga, Yandi Xia, Wei-Te Chen

We finetune a pre-trained generative model, T5, to decode a set of attribute-value pairs as a target sequence from the given product text.

Attribute

Paper
Add Code

Back to Patterns: Efficient Japanese Morphological Analysis with Feature-Sequence Trie

no code implementations • 30 May 2023 • Naoki Yoshinaga

Accurate neural models are much less efficient than non-neural models and are useless for processing billions of social media posts or handling user queries in real time with a limited budget.

Morphological Analysis

Paper
Add Code

Commentary Generation from Data Records of Multiplayer Strategy Esports Game

no code implementations • 21 Dec 2022 • Zihan Wang, Naoki Yoshinaga

In this study, we therefore introduce the task of generating game commentaries from esports' data records.

Data-to-Text Generation Decoder

Paper
Add Code

Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge

no code implementations • 14 Oct 2022 • Kosuke Nishida, Naoki Yoshinaga, Kyosuke Nishida

Although named entity recognition (NER) helps us to extract domain-specific entities from text (e. g., artists in the music domain), it is costly to create a large amount of training data or a structured knowledge base to perform accurate NER in the target domain.

named-entity-recognition Named Entity Recognition +1

Paper
Add Code

Early Discovery of Disappearing Entities in Microblogs

no code implementations • 13 Oct 2022 • Satoshi Akasaki, Naoki Yoshinaga, Masashi Toyoda

The major challenge is detecting uncertain contexts of disappearing entities from noisy microblog posts.

Time Series Time Series Analysis +1

Paper
Add Code

Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product Attribute Extraction

no code implementations • ACL 2022 • Keiji Shinzato, Naoki Yoshinaga, Yandi Xia, Wei-Te Chen

A key challenge in attribute value extraction (AVE) from e-commerce sites is how to handle a large number of attributes for diverse products.

Attribute Attribute Extraction +2

Paper
Add Code

Vocabulary Adaptation for Domain Adaptation in Neural Machine Translation

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Shoetsu Sato, Jin Sakuma, Naoki Yoshinaga, Masashi Toyoda, Masaru Kitsuregawa

Prior to fine-tuning, our method replaces the embedding layers of the NMT model by projecting general word embeddings induced from monolingual data in a target domain onto a source-domain embedding space.

Domain Adaptation Machine Translation +3

Paper
Code

Robust Backed-off Estimation of Out-of-Vocabulary Embeddings

no code implementations • Findings of the Association for Computational Linguistics 2020 • Nobukazu Fukuda, Naoki Yoshinaga, Masaru Kitsuregawa

In this study, inspired by the processes for creating words from known words, we propose a robust method of estimating oov word embeddings by referring to pre-trained word embeddings for known words with similar surfaces to target oov words.

Word Embeddings Word Similarity

Paper
Add Code

Context-aware Decoder for Neural Machine Translation using a Target-side Document-Level Language Model

no code implementations • NAACL 2021 • Amane Sugiyama, Naoki Yoshinaga

Although many context-aware neural machine translation models have been proposed to incorporate contexts in translation, most of those models are trained end-to-end on parallel documents aligned in sentence-level.

Decoder Language Modelling +3

Paper
Add Code

uBLEU: Uncertainty-Aware Automatic Evaluation Method for Open-Domain Dialogue Systems

no code implementations • ACL 2020 • Tsuta Yuma, Naoki Yoshinaga, Masashi Toyoda

Experimental results on massive Twitter data confirmed that υBLEU is comparable to ΔBLEU in terms of its correlation with human judgment and that the state of the art automatic evaluation method, RUBER, is improved by integrating υBLEU.

Paper
Add Code

Vocabulary Adaptation for Distant Domain Adaptation in Neural Machine Translation

no code implementations • 30 Apr 2020 • Shoetsu Sato, Jin Sakuma, Naoki Yoshinaga, Masashi Toyoda, Masaru Kitsuregawa

Domain Adaptation Machine Translation +3

Paper
Add Code

Data augmentation using back-translation for context-aware neural machine translation

no code implementations • WS 2019 • Amane Sugiyama, Naoki Yoshinaga

A single sentence does not always convey information that is enough to translate it into other languages.

Data Augmentation Machine Translation +3

Paper
Add Code

On the Relation between Position Information and Sentence Length in Neural Machine Translation

no code implementations • CONLL 2019 • Masato Neishi, Naoki Yoshinaga

Although some approaches such as the attention mechanism have partially remedied the problem, we found that the current standard NMT model, Transformer, has difficulty in translating long sentences compared to the former standard, Recurrent Neural Network (RNN)-based model.

Machine Translation NMT +4

Paper
Add Code

Multilingual Model Using Cross-Task Embedding Projection

no code implementations • CONLL 2019 • Jin Sakuma, Naoki Yoshinaga

We present a method for applying a neural network trained on one (resource-rich) language for a given task to other (resource-poor) languages.

Cross-Lingual Word Embeddings Sentiment Analysis +2

Paper
Add Code

Early Discovery of Emerging Entities in Microblogs

no code implementations • 8 Jul 2019 • Satoshi Akasaki, Naoki Yoshinaga, Masashi Toyoda

Keeping up to date on emerging entities that appear every day is indispensable for various applications, such as social-trend analysis and marketing research.

Marketing

Paper
Add Code

Modeling Personal Biases in Language Use by Inducing Personalized Word Embeddings

no code implementations • NAACL 2019 • Daisuke Oba, Naoki Yoshinaga, Shoetsu Sato, Satoshi Akasaki, Masashi Toyoda

In this study, we propose a method of modeling such personal biases in word meanings (hereafter, semantic variations) with personalized word embeddings obtained by solving a task on subjective text while regarding words used by different individuals as different words.

Multi-class Classification Multi-Task Learning +2

Paper
Add Code

Learning to Describe Unknown Phrases with Local and Global Contexts

no code implementations • NAACL 2019 • Shonosuke Ishiwatari, Hiroaki Hayashi, Naoki Yoshinaga, Graham Neubig, Shoetsu Sato, Masashi Toyoda, Masaru Kitsuregawa

When reading a text, it is common to become stuck on unfamiliar words and phrases, such as polysemous words with novel senses, rarely used idioms, internet slang, or emerging entities.

Decoder

Paper
Add Code

Learning to Describe Phrases with Local and Global Contexts

1 code implementation • 1 Nov 2018 • Shonosuke Ishiwatari, Hiroaki Hayashi, Naoki Yoshinaga, Graham Neubig, Shoetsu Sato, Masashi Toyoda, Masaru Kitsuregawa

When reading a text, it is common to become stuck on unfamiliar words and phrases, such as polysemous words with novel senses, rarely used idioms, internet slang, or emerging entities.

Decoder Reading Comprehension