Large Scale Purchase Prediction with Historical User Actions on B2C Online Retail Platform

27 Aug 2014  ·  Yuyu Zhang, Liang Pang, Lei Shi, Bin Wang ·

This paper describes the solution of Bazinga Team for Tmall Recommendation Prize 2014. With real-world user action data provided by Tmall, one of the largest B2C online retail platforms in China, this competition requires to predict future user purchases on Tmall website. Predictions are judged on F1Score, which considers both precision and recall for fair evaluation. The data set provided by Tmall contains more than half billion action records from over ten million distinct users. Such massive data volume poses a big challenge, and drives competitors to write every single program in MapReduce fashion and run it on distributed cluster. We model the purchase prediction problem as standard machine learning problem, and mainly employ regression and classification methods as single models. Individual models are then aggregated in a two-stage approach, using linear regression for blending, and finally a linear ensemble of blended models. The competition is approaching the end but still in running during writing this paper. In the end, our team achieves F1Score 6.11 and ranks 7th (out of 7,276 teams in total).

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods