TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK	REMOVE
Aspect-Based Sentiment Analysis (ABSA)	SemEval 2014 Task 4 Subtask 1+2	gpt-3.5 finetuned	F1	83.76	# 1

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/large-language-models-for-aspect-based/aspect-based-sentiment-analysis-on-semeval-6)](https://paperswithcode.com/sota/aspect-based-sentiment-analysis-on-semeval-6?p=large-language-models-for-aspect-based)`

Large language models for aspect-based sentiment analysis

27 Oct 2023 · Paul F. Simmering, Paavo Huoviala ·

Large language models (LLMs) offer unprecedented text completion capabilities. As general models, they can fulfill a wide range of roles, including those of more specialized models. We assess the performance of GPT-4 and GPT-3.5 in zero shot, few shot and fine-tuned settings on the aspect-based sentiment analysis (ABSA) task. Fine-tuned GPT-3.5 achieves a state-of-the-art F1 score of 83.8 on the joint aspect term extraction and polarity classification task of the SemEval-2014 Task 4, improving upon InstructABSA [@scaria_instructabsa_2023] by 5.7%. However, this comes at the price of 1000 times more model parameters and thus increased inference cost. We discuss the the cost-performance trade-offs of different models, and analyze the typical errors that they make. Our results also indicate that detailed prompts improve performance in zero-shot and few-shot settings but are not necessary for fine-tuned models. This evidence is relevant for practioners that are faced with the choice of prompt engineering versus fine-tuning when using LLMs for ABSA.

PDF Abstract