Search Results for author: Shijie Wu

Found 23 papers, 14 papers with code

MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies

1 code implementation • 26 May 2023 • Shiyue Zhang, Shijie Wu, Ozan Irsoy, Steven Lu, Mohit Bansal, Mark Dredze, David Rosenberg

Autoregressive language models are trained by minimizing the cross-entropy of the model distribution Q relative to the data distribution P -- that is, minimizing the forward cross-entropy, which is equivalent to maximum likelihood estimation (MLE).
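To make the MixCE excerpt above concrete, the two objectives named in the title can be written out as below. This is a minimal sketch in LaTeX notation; the mixing weight \eta and the exact way the two terms are combined are assumptions for illustration, not the paper's stated formulation.

    % Forward cross-entropy (minimized by standard MLE training):
    \mathrm{CE}(P, Q) = -\mathbb{E}_{x \sim P}\,[\log Q(x)]
    % Reverse cross-entropy:
    \mathrm{CE}(Q, P) = -\mathbb{E}_{x \sim Q}\,[\log P(x)]
    % A mixed objective with an assumed mixing weight \eta \in [0, 1]:
    \mathcal{L}_{\mathrm{MixCE}} = \eta\,\mathrm{CE}(P, Q) + (1 - \eta)\,\mathrm{CE}(Q, P)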

Overcoming Catastrophic Forgetting in Massively Multilingual Continual Learning

no code implementations • 25 May 2023 • Genta Indra Winata, Lingjue Xie, Karthik Radhakrishnan, Shijie Wu, Xisen Jin, Pengxiang Cheng, Mayank Kulkarni, Daniel Preotiuc-Pietro

Real-life multilingual systems should be able to efficiently incorporate new languages as data distributions fed to the system evolve and shift over time.

Continual Learning • Scheduling

BloombergGPT: A Large Language Model for Finance

no code implementations • 30 Mar 2023 • Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, Mark Dredze, Sebastian Gehrmann, Prabhanjan Kambadur, David Rosenberg, Gideon Mann

The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering.

Causal Judgment • Date Understanding +21

BoundaryFace: A mining framework with noise label self-correction for Face Recognition

1 code implementation • 10 Oct 2022 • Shijie Wu, Xun Gong

Specifically, a closed-set noise-label self-correction module is proposed, enabling the framework to perform well on datasets containing substantial label noise.

Face Recognition

How Do Multilingual Encoders Learn Cross-lingual Representation?

no code implementations • 12 Jul 2022 • Shijie Wu

We also look at how to inject different cross-lingual signals into multilingual encoders, and the optimization behavior of cross-lingual transfer with these models.

Cross-Lingual Transfer • Multilingual NLP +1

Zero-shot Cross-lingual Transfer is Under-specified Optimization

1 code implementation • RepL4NLP (ACL) 2022 • Shijie Wu, Benjamin Van Durme, Mark Dredze

Pretrained multilingual encoders enable zero-shot cross-lingual transfer, but often produce unreliable models that exhibit high performance variance on the target language.

Zero-Shot Cross-Lingual Transfer
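As a rough illustration of the zero-shot setup described above: a multilingual encoder is fine-tuned on labeled data in a source language (typically English) and then evaluated directly on a target language with no target-language labels. The sketch below assumes Hugging Face Transformers and XLM-R purely for illustration; it is not the paper's released code, and the fine-tuning loop is omitted.

    # Minimal sketch of zero-shot cross-lingual transfer (assumed setup, not the paper's code).
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
    model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=3)

    # Step 1: fine-tune `model` on English task data only (training loop omitted).
    # Step 2: evaluate directly on target-language text, without target-language labels.
    batch = tokenizer(["Das Essen war ausgezeichnet."], return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**batch).logits
    prediction = logits.argmax(dim=-1)
    # The paper's point: repeating this with different fine-tuning seeds can yield
    # very different target-language accuracy, i.e. high performance variance.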

Differentiable Generative Phonology

1 code implementation • 10 Feb 2021 • Shijie Wu, Edoardo Maria Ponti, Ryan Cotterell

As the main contribution of our work, we implement the phonological generative system as an end-to-end differentiable neural model, rather than as a set of rules or constraints.

Do Explicit Alignments Robustly Improve Multilingual Encoders?

1 code implementation • EMNLP 2020 • Shijie Wu, Mark Dredze

Multilingual BERT (mBERT), XLM-RoBERTa (XLMR) and other unsupervised multilingual encoders can effectively learn cross-lingual representation.

The SIGMORPHON 2020 Shared Task on Multilingual Grapheme-to-Phoneme Conversion

no code implementations • WS 2020 • Kyle Gorman, Lucas F.E. Ashby, Aaron Goyzueta, Arya McCarthy, Shijie Wu, Daniel You

We describe the design and findings of the SIGMORPHON 2020 shared task on multilingual grapheme-to-phoneme conversion.

Applying the Transformer to Character-level Transduction

2 code implementations • EACL 2021 • Shijie Wu, Ryan Cotterell, Mans Hulden

The transformer has been shown to outperform recurrent neural network-based sequence-to-sequence models in various word-level NLP tasks.

Morphological Inflection • Transliteration
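For context on the task framing above: character-level transduction casts word-level problems such as morphological inflection or transliteration as sequence-to-sequence prediction over characters. A toy input/output pair is sketched below; the tag scheme is an assumption for illustration, not the paper's data format.

    # Toy example of morphological inflection as character-level transduction
    # (hypothetical tag scheme, for illustration only).
    src = ["g", "i", "v", "e", "<V>", "<PST>"]  # lemma characters plus morphological tags
    tgt = ["g", "a", "v", "e"]                  # inflected form, generated character by character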

Are All Languages Created Equal in Multilingual BERT?

1 code implementation • WS 2020 • Shijie Wu, Mark Dredze

Multilingual BERT (mBERT) trained on 104 languages has shown surprisingly good cross-lingual performance on several NLP tasks, even without explicit cross-lingual signals.

Cross-Lingual Transfer • Dependency Parsing +4

The Paradigm Discovery Problem

1 code implementation • ACL 2020 • Alexander Erdmann, Micha Elsner, Shijie Wu, Ryan Cotterell, Nizar Habash

Our benchmark system first makes use of word embeddings and string similarity to cluster forms by cell and by paradigm.

Clustering • Word Embeddings
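The excerpt above combines two signals, surface string similarity and word-embedding similarity, to group forms by cell and by paradigm. The sketch below is a toy illustration of that combination; the specific similarity functions, the blending weight, and the downstream clustering step are assumptions, not the benchmark system's actual implementation.

    # Toy blend of string similarity and embedding similarity between word forms
    # (assumed functions and weighting, for illustration only).
    from difflib import SequenceMatcher
    import numpy as np

    def string_sim(a: str, b: str) -> float:
        # Ratio of matching characters between the two surface forms.
        return SequenceMatcher(None, a, b).ratio()

    def embedding_sim(u: np.ndarray, v: np.ndarray) -> float:
        # Cosine similarity between the forms' word embeddings.
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

    def form_affinity(a: str, b: str, emb: dict, alpha: float = 0.5) -> float:
        # Forms with high affinity would be grouped into the same paradigm
        # by a downstream clustering step (e.g. agglomerative clustering).
        return alpha * string_sim(a, b) + (1 - alpha) * embedding_sim(emb[a], emb[b])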

Emerging Cross-lingual Structure in Pretrained Language Models

no code implementations • ACL 2020 • Shijie Wu, Alexis Conneau, Haoran Li, Luke Zettlemoyer, Veselin Stoyanov

We study the problem of multilingual masked language modeling, i.e., the training of a single model on concatenated text from multiple languages, and present a detailed study of several factors that influence why these models are so effective for cross-lingual transfer.

Cross-Lingual Transfer • Language Modelling +2
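The study above concerns multilingual masked language modeling: a single model with a shared vocabulary trained on text drawn from many languages. The sketch below shows a simplified masking step; the model/tokenizer choice, the 15% masking rate, and the mask-only corruption (BERT-style training also uses random and unchanged replacements, and skips special tokens) are simplifications and assumptions for illustration.

    # Simplified multilingual masked language modeling: one tokenizer, one model,
    # text concatenated across languages (assumed setup, for illustration only).
    import random
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

    corpus = [
        "The cat sat on the mat.",         # English
        "Le chat est assis sur le tapis.", # French
        "Die Katze sitzt auf der Matte.",  # German
    ]

    def mask_tokens(ids, mask_prob=0.15):
        # Replace a random subset of tokens with [MASK]; the MLM loss is computed
        # only at masked positions (label -100 is ignored by the loss).
        labels = list(ids)
        for i in range(len(ids)):
            if random.random() < mask_prob:
                ids[i] = tokenizer.mask_token_id
            else:
                labels[i] = -100
        return ids, labels

    input_ids = tokenizer(" ".join(corpus))["input_ids"]
    masked_ids, labels = mask_tokens(input_ids)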

The SIGMORPHON 2019 Shared Task: Morphological Analysis in Context and Cross-Lingual Transfer for Inflection

no code implementations • WS 2019 • Arya D. McCarthy, Ekaterina Vylomova, Shijie Wu, Chaitanya Malaviya, Lawrence Wolf-Sonkin, Garrett Nicolai, Christo Kirov, Miikka Silfverberg, Sabrina J. Mielke, Jeffrey Heinz, Ryan Cotterell, Mans Hulden

The SIGMORPHON 2019 shared task on cross-lingual transfer and contextual analysis in morphology examined transfer learning of inflection between 100 language pairs, as well as contextual lemmatization and morphosyntactic description in 66 languages.

Cross-Lingual Transfer • Lemmatization +3

Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT

2 code implementations • IJCNLP 2019 • Shijie Wu, Mark Dredze

Pretrained contextual representation models (Peters et al., 2018; Devlin et al., 2018) have pushed forward the state-of-the-art on many NLP tasks.

Cross-Lingual NER • Dependency Parsing +6

Hard Non-Monotonic Attention for Character-Level Transduction

2 code implementations • EMNLP 2018 • Shijie Wu, Pamela Shapiro, Ryan Cotterell

We compare soft and hard non-monotonic attention experimentally and find that the exact algorithm significantly improves performance over the stochastic approximation and outperforms soft attention.

Hard Attention • Image Captioning
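As a rough guide to the contrast drawn in the excerpt above (the notation below is simplified and mine, not the paper's exact factorization): soft attention feeds the decoder a deterministic weighted average of encoder states, whereas hard attention treats the alignment as a latent variable whose sum can be computed exactly, e.g. by dynamic programming, rather than approximated by sampling.

    % Soft attention: deterministic context vector at output step t,
    % built from encoder states h_i and attention weights \alpha_{t,i}.
    c_t = \sum_i \alpha_{t,i} \, h_i
    % Hard attention: the alignment a is latent; the likelihood marginalizes over it.
    % The "exact algorithm" in the abstract computes this sum exactly rather than
    % estimating it stochastically.
    p(y \mid x) = \sum_{a} p(a \mid x)\, p(y \mid x, a)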
