Collecting and Analyzing Data from Smart Device Users with Local Differential Privacy

16 Jun 2016  ·  Thông T. Nguyên, Xiaokui Xiao, Yin Yang, Siu Cheung Hui, Hyejin Shin, Junbum Shin ·

Organizations with a large user base, such as Samsung and Google, can potentially benefit from collecting and mining users' data. However, doing so raises privacy concerns, and risks accidental privacy breaches with serious consequences. Local differential privacy (LDP) techniques address this problem by only collecting randomized answers from each user, with guarantees of plausible deniability; meanwhile, the aggregator can still build accurate models and predictors by analyzing large amounts of such randomized data. So far, existing LDP solutions either have severely restricted functionality, or focus mainly on theoretical aspects such as asymptotical bounds rather than practical usability and performance. Motivated by this, we propose Harmony, a practical, accurate and efficient system for collecting and analyzing data from smart device users, while satisfying LDP. Harmony applies to multi-dimensional data containing both numerical and categorical attributes, and supports both basic statistics (e.g., mean and frequency estimates), and complex machine learning tasks (e.g., linear regression, logistic regression and SVM classification). Experiments using real data confirm Harmony's effectiveness.

PDF Abstract
No code implementations yet. Submit your code now

Categories


Databases

Datasets


  Add Datasets introduced or used in this paper