Search Results for author: Atsuki Yamaguchi

Found 11 papers, 7 papers with code

An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient Generative LLM Inference

1 code implementation • 16 Feb 2024 • Atsuki Yamaguchi, Aline Villavicencio, Nikolaos Aletras

We also show that adapting LLMs that have been pre-trained on more balanced multilingual data results in downstream performance comparable to the original models.

Natural Language Understanding
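
To make the adaptation setting concrete, here is a minimal sketch of the vocabulary-swap step that cross-lingual vocabulary adaptation builds on, using the Hugging Face transformers API. The "gpt2" base model and the tokenizer path are placeholders, and the paper's actual adaptation recipe is more involved than this setup step.

```python
# Hypothetical sketch: swap a pretrained LM's vocabulary for a
# target-language tokenizer, then resize the embedding matrix before
# continued pretraining. Model and tokenizer names are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

source_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base LM
target_tokenizer = AutoTokenizer.from_pretrained("path/to/target-language-tokenizer")

# Grow or shrink the input (and tied output) embedding matrix to the new
# vocabulary size; newly added rows are randomly initialized by default.
source_model.resize_token_embeddings(len(target_tokenizer))

# From here, the adapted model would be further trained on
# target-language text so the new embeddings become meaningful.
```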

appjsonify: An Academic Paper PDF-to-JSON Conversion Toolkit

1 code implementation • 2 Oct 2023 • Atsuki Yamaguchi, Terufumi Morishita

We present appjsonify, a Python-based PDF-to-JSON conversion toolkit for academic papers.

Document Layout Analysis
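
For readers unfamiliar with the task, the sketch below shows the kind of PDF-to-JSON conversion appjsonify automates, using pypdf as a stand-in. It is not appjsonify's own API, and "paper.pdf" is a placeholder input; the toolkit itself additionally recovers layout elements such as titles, figures, and tables.

```python
# Not appjsonify's API -- a minimal stand-in using pypdf to show the
# kind of PDF-to-JSON conversion the toolkit automates.
import json
from pypdf import PdfReader

reader = PdfReader("paper.pdf")  # placeholder input file
doc = {
    "num_pages": len(reader.pages),
    "pages": [{"page": i + 1, "text": page.extract_text()}
              for i, page in enumerate(reader.pages)],
}
print(json.dumps(doc, ensure_ascii=False, indent=2))
```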

Learning Deductive Reasoning from Synthetic Corpus based on Formal Logic

1 code implementation • 11 Aug 2023 • Terufumi Morishita, Gaku Morio, Atsuki Yamaguchi, Yasuhiro Sogawa

We rethink this and adopt a well-grounded set of deduction rules based on formal logic theory, which can derive any other deduction rules when combined in a multistep way.

Formal Logic • Logical Reasoning
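
To illustrate the multistep idea, this toy sketch (not the paper's corpus generator) repeatedly applies a single primitive rule, modus ponens, and thereby derives a conclusion that a composite rule such as hypothetical syllogism would give in one step:

```python
# Toy forward chaining: chaining modus ponens steps derives C from A,
# A -> B, B -> C, the same conclusion a composite rule would yield.
facts = {"A"}
implications = {("A", "B"), ("B", "C")}  # A -> B, B -> C

derived = True
while derived:
    derived = False
    for premise, conclusion in implications:
        if premise in facts and conclusion not in facts:
            facts.add(conclusion)  # one modus ponens step
            derived = True

print(facts)  # {'A', 'B', 'C'}: C follows via two chained steps
```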

How do different tokenizers perform on downstream tasks in scriptio continua languages?: A case study in Japanese

1 code implementation • 16 Jun 2023 • Takuro Fujii, Koki Shibata, Atsuki Yamaguchi, Terufumi Morishita, Yasuhiro Sogawa

This paper investigates the effect of tokenizers on the downstream performance of pretrained language models (PLMs) in scriptio continua languages where no explicit spaces exist between words, using Japanese as a case study.
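
A minimal illustration of why the question matters: scriptio continua text carries no whitespace, so segmentation must come entirely from the tokenizer. The example sentence is illustrative, and the paper's actual comparisons contrast morphological analyzers and subword methods rather than the two naive splits shown.

```python
# Japanese has no spaces between words, so whitespace splitting fails
# and tokenizer choice determines the segmentation entirely.
text = "吾輩は猫である"  # "I am a cat"

print(text.split())  # ['吾輩は猫である'] -- whitespace yields one "word"
print(list(text))    # ['吾', '輩', 'は', '猫', 'で', 'あ', 'る'] -- characters

# Real experiments instead compare morphological analyzers (e.g., MeCab)
# and subword methods (e.g., BPE, unigram LM), which segment differently.
```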

How does the task complexity of masked pretraining objectives affect downstream performance?

1 code implementation • 18 May 2023 • Atsuki Yamaguchi, Hiroaki Ozaki, Terufumi Morishita, Gaku Morio, Yasuhiro Sogawa

Masked language modeling (MLM) is a widely used self-supervised pretraining objective in which a model must predict the original token that has been replaced with a mask, given the surrounding context.

Language Modelling • Masked Language Modeling
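
A hedged sketch of the masking step the abstract describes, in plain Python; the 15% probability and "[MASK]" symbol follow common MLM practice and are illustrative rather than the paper's exact setup.

```python
# Each token is replaced with a mask symbol with some probability, and
# the model is trained to recover the originals at masked positions.
import random

random.seed(0)
MASK, MASK_PROB = "[MASK]", 0.15

tokens = "the cat sat on the mat".split()
masked, targets = [], []
for tok in tokens:
    if random.random() < MASK_PROB:
        masked.append(MASK)
        targets.append(tok)   # the model must predict this token
    else:
        masked.append(tok)
        targets.append(None)  # no loss at unmasked positions

print(masked, targets)
```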

Hitachi at SemEval-2023 Task 3: Exploring Cross-lingual Multi-task Strategies for Genre and Framing Detection in Online News

no code implementations • 3 Mar 2023 • Yuta Koreeda, Ken-ichi Yokote, Hiroaki Ozaki, Atsuki Yamaguchi, Masaya Tsunokake, Yasuhiro Sogawa

Based on the multilingual, multi-task nature of the task and the low-resource setting, we investigated different cross-lingual and multi-task strategies for training the pretrained language models.
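
One such multi-task strategy can be sketched as a shared encoder with per-task classification heads. The PyTorch sketch below uses placeholder dimensions and a plain linear layer standing in for a pretrained multilingual encoder, so it is an assumption-laden illustration rather than the Hitachi system's actual architecture.

```python
# Hedged sketch of multi-task learning: one shared encoder feeds
# separate heads for genre and framing detection. All sizes are
# placeholders, not the submitted system's configuration.
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self, hidden=768, n_genres=3, n_framings=14):
        super().__init__()
        # stand-in for a pretrained multilingual encoder (e.g., XLM-R)
        self.encoder = nn.Linear(100, hidden)
        self.heads = nn.ModuleDict({
            "genre": nn.Linear(hidden, n_genres),
            "framing": nn.Linear(hidden, n_framings),
        })

    def forward(self, features, task):
        return self.heads[task](self.encoder(features))

model = MultiTaskModel()
x = torch.randn(2, 100)          # placeholder input features
print(model(x, "genre").shape)   # torch.Size([2, 3])
```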

Frustratingly Simple Pretraining Alternatives to Masked Language Modeling

1 code implementation • EMNLP 2021 • Atsuki Yamaguchi, George Chrysostomou, Katerina Margatina, Nikolaos Aletras

Masked language modeling (MLM), a self-supervised pretraining objective, is widely used in natural language processing for learning text representations.

Language Modelling • Masked Language Modeling • +1
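
As a flavor of what a simple alternative objective can look like, here is a hedged sketch of token-shuffling detection, where the model flags which positions were permuted. The function name, shuffle probability, and labeling scheme are illustrative, not the paper's exact recipe.

```python
# Data preparation for a shuffling-detection objective: permute a random
# subset of tokens and label the positions whose token changed.
import random

random.seed(1)

def make_shuffle_example(tokens, shuffle_prob=0.15):
    """Shuffle a random subset of tokens; label 1 where the token moved."""
    idx = [i for i in range(len(tokens)) if random.random() < shuffle_prob]
    permuted = idx[:]
    random.shuffle(permuted)
    out = tokens[:]
    for src, dst in zip(idx, permuted):
        out[dst] = tokens[src]
    labels = [int(a != b) for a, b in zip(tokens, out)]
    return out, labels

toks = "the quick brown fox jumps over the lazy dog".split()
print(make_shuffle_example(toks))
```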

Dialogue Act-based Breakdown Detection in Negotiation Dialogues

1 code implementation • EACL 2021 • Atsuki Yamaguchi, Kosui Iwasa, Katsuhide Fujita

Thanks to the success of goal-oriented negotiation dialogue systems, studies of negotiation dialogue have gained momentum, both for human-human negotiation support and for dialogue systems.
