Mirsky, A Shabtai, L Rokach, B Shapira, Y Elovici
Proceedings of the 2016 ACM Workshop on Artificial Intelligence and Security
In this paper we describe and share with the research community a significant smartphone dataset obtained from an ongoing long-term data collection experiment. The dataset currently contains 10 billion data records from 30 users collected over a period of 1.6 years and an additional 20 users for 6 months (totaling 50 active users currently participating in the experiment). The experiment involves two smartphone agents: SherLock and Moriarty. SherLock collects a wide variety of software and sensor data at a high sample rate. Moriarty perpetrates various attacks on the user and logs its activities, thus providing labels for the SherLock dataset. The primary purpose of the dataset is to help security professionals and academic researchers develop innovative methods of implicitly detecting malicious behavior in smartphones, specifically from data obtainable without superuser (root) privileges. To demonstrate possible uses of the dataset, we perform a basic malware analysis and evaluate a method of continuous user authentication.