Reproducibility and Automation of the Appraisal Taxonomy

Results from experiments that apply the Appraisal taxonomy lack reproducibility. Appraisal is widely used by linguists to study how people judge things or people. Automating Appraisal classification could benefit use cases such as moderating online comments. Past work on Appraisal annotation has been descriptive in nature, and the lack of publicly available data sets hinders progress on automation. In this work, we pursue two goals. First, we measure the performance of automated approaches to Appraisal classification on the publicly available Australasian Language Technology Association (ALTA) Shared Task Challenge data set. Second, we attempt to reproduce the annotation of the ALTA data set: four additional annotators, each with a different linguistics background, were employed to re-annotate it. Our results show a poor level of agreement at the more detailed Appraisal categories (Fleiss' kappa = 0.059) and a fair level of agreement (Fleiss' kappa = 0.372) at the coarse-level categories. We find similar results when using publicly available automated approaches. Our empirical evidence suggests that, at present, automating classification is practical only for the coarse-level categories of the taxonomy.
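For readers unfamiliar with the agreement statistic quoted above, the following is a minimal sketch of how Fleiss' kappa is computed from multi-annotator labels. It is not the authors' code; the function name `fleiss_kappa`, the toy ratings, and the label names are illustrative assumptions, and the sketch assumes every item is labelled by the same number of annotators.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of labels (one per annotator).

    Assumes every item has the same number of annotators (at least two).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Categories are inferred from the data (an assumption of this sketch).
    categories = sorted({label for item in ratings for label in item})

    # counts[i][j] = number of annotators who gave item i the category j
    counts = []
    for item in ratings:
        tally = Counter(item)
        counts.append([tally[c] for c in categories])

    # Observed agreement: mean over items of the per-item agreement P_i
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement: sum of squared overall category proportions
    total = n_items * n_raters
    p_cats = [sum(row[j] for row in counts) / total for j in range(len(categories))]
    p_e = sum(p * p for p in p_cats)

    if p_e == 1.0:
        # Only one category was ever used, so kappa is undefined.
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)


# Toy example: three items, four hypothetical annotators, illustrative labels.
toy_ratings = [
    ["affect", "affect", "judgement", "affect"],
    ["judgement", "judgement", "judgement", "judgement"],
    ["appreciation", "affect", "judgement", "appreciation"],
]
print(round(fleiss_kappa(toy_ratings), 3))
```

A value near 1 indicates agreement well above chance, a value near 0 indicates agreement no better than chance, and negative values indicate agreement below chance. Collapsing fine-grained labels into coarser ones before calling the function is one way to compare agreement at the two levels of the taxonomy.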
