Synapse at CAp 2017 NER challenge: Fasttext CRF

14 Sep 2017 · Damien Sileo, Camille Pradel, Philippe Muller, Tim Van De Cruys

We present our system for the CAp 2017 NER challenge, which addresses named entity recognition on French tweets. Our system uses unsupervised learning on a larger dataset of French tweets to learn features that feed a CRF model. It ranked first without using any gazetteer or structured external data, with an F-measure of 58.89%. To the best of our knowledge, it is the first system to use fastText embeddings (which include subword representations) and an embedding-based sentence representation for NER.
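
The abstract does not detail the feature pipeline, so the following is only a minimal sketch of the general idea, under stated assumptions: a word vector is built by averaging vectors of its character n-grams (so unseen tweet tokens still get a representation), a sentence vector is the mean of its word vectors, and both are exposed as per-token features of the kind a CRF could consume. The deterministic pseudo-random vectors stand in for embeddings that would really be learned on unlabeled tweets; `DIM`, `BUCKETS`, the n-gram range 3 to 6, and the feature names `w0…`/`s0…` are all hypothetical choices for illustration, not the authors' settings.

```python
import hashlib
import random

DIM = 8         # toy embedding size (assumption, not from the paper)
BUCKETS = 1000  # toy number of hash buckets for n-grams (assumption)


def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    return [wrapped[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(wrapped) - n + 1)]


def bucket_vector(ngram):
    """Deterministic pseudo-random vector standing in for a learned n-gram vector."""
    digest = hashlib.md5(ngram.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % BUCKETS
    rng = random.Random(bucket)  # same bucket always yields the same vector
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def word_vector(word):
    """Average of the word's n-gram vectors; works for out-of-vocabulary words."""
    units = char_ngrams(word) or [f"<{word}>"]  # fallback for very short input
    return mean([bucket_vector(u) for u in units])


def sentence_vector(tokens):
    """Embedding-based sentence representation: mean of the word vectors."""
    return mean([word_vector(t) for t in tokens])


def token_features(tokens):
    """One feature dict per token: its own vector plus the shared sentence vector."""
    sent = sentence_vector(tokens)
    features = []
    for token in tokens:
        vec = word_vector(token)
        feats = {f"w{i}": value for i, value in enumerate(vec)}
        feats.update({f"s{i}": value for i, value in enumerate(sent)})
        features.append(feats)
    return features


if __name__ == "__main__":
    for token, feats in zip(["je", "suis", "à", "Paris"],
                            token_features(["je", "suis", "à", "Paris"])):
        print(token, len(feats), "features")
```

In a real system the list of feature dicts for each tweet, paired with its label sequence, would be handed to a CRF trainer; which toolkit and which additional features the authors used is not stated in the abstract.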
