“Where is My Parcel?” Fast and Efficient Classifiers to Detect User Intent in Natural Language

We study the performance of customer intent classifiers designed to predict the most popular intent received through the ASOS.com Customer Care Department, namely “Where is my order?”. These queries are characterised by colloquial language, label noise and short message length. We conduct extensive experiments with two well-established classification models: logistic regression over n-grams, to account for sequences in the data, and recurrent neural networks, which extract these sequential patterns automatically. With the embedding layer kept fixed to GloVe coordinates, a Mann-Whitney U test indicated that the F1 score on a held-out set of messages was lower for recurrent neural network classifiers than for linear n-gram classifiers (M1=0.828, M2=0.815; U=1,196, P=1.46e-20), unless all network layers were trained jointly (M1=0.831, M2=0.828; U=4,280, P=8.24e-4). This plain neural network produced top performance on a denoised set of labels (0.887 F1), matching human annotators (0.889 F1) and outperforming linear classifiers (0.865 F1). When these models are calibrated to achieve precision above human performance (0.93 precision), our results indicate a small difference in recall of 0.05 for the plain neural networks (training in under 1 hr) and 0.07 for the linear n-grams (training in under 10 min), revealing the latter as a judicious choice of model architecture in modern AI production systems.
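The calibration step mentioned in the abstract, choosing a decision threshold so that precision reaches a target level and then reading off the resulting recall, can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `calibrate_threshold`, the toy scores and labels, and the 0.8 target are all assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def calibrate_threshold(
    scores: List[float], labels: List[int], target_precision: float
) -> Optional[Tuple[float, float, float]]:
    """Hypothetical helper: find the lowest threshold whose precision
    reaches target_precision.

    A message is predicted positive when its score >= threshold.
    Returns (threshold, precision, recall), or None if no threshold works.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return None  # recall is undefined without positive examples

    # Walk through the examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    best = None
    true_pos = 0
    for i, (score, label) in enumerate(ranked):
        true_pos += label
        # Tied scores must fall on the same side of the threshold,
        # so only evaluate at the last item of a run of ties.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        precision = true_pos / (i + 1)
        recall = true_pos / total_pos
        if precision >= target_precision:
            # Later candidates have lower thresholds and recall that is
            # at least as high, so keep overwriting.
            best = (score, precision, recall)
    return best


if __name__ == "__main__":
    # Toy classifier scores and binary labels (illustrative, not ASOS data).
    toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
    toy_labels = [1, 1, 1, 0, 1, 1, 0, 0]
    print(calibrate_threshold(toy_scores, toy_labels, target_precision=0.8))
```

In practice the threshold would be selected on a validation set and the recall reported on a separate held-out set, so that the comparison between models at a fixed precision level is not optimistic.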

Results from the Paper


Task              Dataset               Model          Metric Name  Metric Value  Global Rank
Intent Detection  ASOS.com user intent  plain-LSTM     F1           0.887         #1
Intent Detection  ASOS.com user intent  glove-LSTM     F1           0.856         #3
Intent Detection  ASOS.com user intent  linear-Ngrams  F1           0.865         #2

Methods