TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Text Clustering	Urdu News Headlines Dataset	Vector Space Model	Related Headlines	85	#1

Clustering Urdu News Using Headlines

23 2015 · Samia Khaliq, Waheed Iqbal, Faisal Bukhari, Kamran Malik

This paper proposes and evaluates a new algorithm to automatically cluster Urdu news from different news agencies. The task is challenging because there are no language processing libraries for the Urdu language. The authors' experimental dataset consists of news from famous Pakistani media houses, including Jang, BBC Urdu, Express, UrduPoint, and Voice of America Urdu (VOA). The proposed algorithm uses only headlines to cluster the news. The authors argue that news headlines provide a concise summary of the news, which motivates them to use headlines instead of the entire news story. Their experimental evaluation shows micro- and macro-averaged precision of 0.45 and 0.48, respectively, for identifying similar news using headlines.
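To make the idea concrete, here is a minimal sketch of headline clustering with a vector space model, together with micro- and macro-averaged precision. It is not the authors' algorithm: the whitespace tokenizer, the greedy single-pass grouping, the `threshold` value, the function names, and the toy English headlines are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def vectorize(headline):
    """Bag-of-words term-frequency vector built from whitespace tokens (assumed tokenizer)."""
    return Counter(headline.split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(headlines, threshold=0.5):
    """Greedy single pass: a headline joins the first cluster whose seed
    headline is at least `threshold` similar, otherwise it starts a new one.
    Returns clusters as lists of headline indices."""
    vectors = [vectorize(h) for h in headlines]
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def micro_macro_precision(counts):
    """counts holds one (true_positives, false_positives) pair per cluster.
    Micro pools the counts before dividing; macro averages per-cluster precision."""
    tp = sum(t for t, _ in counts)
    fp = sum(f for _, f in counts)
    micro = tp / (tp + fp) if tp + fp else 0.0
    per_cluster = [t / (t + f) for t, f in counts if t + f]
    macro = sum(per_cluster) / len(per_cluster) if per_cluster else 0.0
    return micro, macro


# Made-up headlines, standing in for headlines from different agencies.
headlines = [
    "prime minister visits lahore today",
    "prime minister visits lahore",
    "cricket team wins final match",
    "team wins final cricket match today",
]
groups = cluster(headlines)
micro, macro = micro_macro_precision([(8, 2), (1, 1)])
```

With these toy inputs the two political headlines and the two cricket headlines end up in separate groups. The precision helper shows why the two averages can differ: micro-averaging weights every retrieved headline equally, while macro-averaging weights every cluster equally.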
