Automated static code analysis for classifying Android applications using machine learning

Asaf Shabtai, Yuval Fledel, Yuval Elovici

2010 international conference on computational intelligence and security …, 2010

In this paper we apply Machine Learning (ML) techniques to static features extracted from Android’s application files in order to classify those files. Features are extracted from Android’s Java byte-code (i.e., .dex files) and from other file types such as XML files. Our evaluation focused on classifying two types of Android applications: tools and games. Successful differentiation between games and tools is expected to provide a positive indication of the ability of such methods to learn and model benign Android applications and, potentially, to detect malware files. The results of an evaluation performed on a test collection of 2,285 Android .apk files indicate that features extracted statically from .apk files, coupled with ML classification algorithms, can provide a good indication of the nature of an Android application without running it, and may assist in detecting malicious applications. This …
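
The general idea of the abstract, static features taken from an .apk file and fed to a classifier, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not list the features or the ML algorithms used, so the per-extension file counts and sizes, the nearest-centroid classifier, and the function names (`extract_static_features`, `centroid`, `classify`, `make_fake_apk`) are hypothetical stand-ins, not the authors' method. The sketch relies only on the fact that an .apk file is a ZIP archive and never executes the application.

```python
import io
import zipfile
from collections import Counter


def extract_static_features(apk_bytes):
    """Hypothetical feature set: per-extension file counts and byte totals.

    The archive is only listed, never executed or unpacked to disk.
    """
    features = Counter()
    total_files = 0
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        for info in apk.infolist():
            name = info.filename
            if name.endswith("/"):  # skip directory entries
                continue
            ext = name.rsplit(".", 1)[-1].lower() if "." in name else "none"
            features["count_" + ext] += 1
            features["bytes_" + ext] += info.file_size
            total_files += 1
    features["total_files"] = total_files
    return dict(features)


def centroid(vectors):
    """Mean feature vector of a non-empty list of feature dicts."""
    keys = set().union(*vectors)
    return {k: sum(v.get(k, 0) for v in vectors) / len(vectors) for k in keys}


def distance(a, b):
    """Euclidean distance between two sparse feature dicts."""
    keys = set(a) | set(b)
    return sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in keys) ** 0.5


def classify(features, centroids):
    """Nearest-centroid classifier (an assumed stand-in, not the paper's)."""
    return min(centroids, key=lambda label: distance(features, centroids[label]))


def make_fake_apk(files):
    """Build an in-memory ZIP that mimics an .apk layout, for demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buf.getvalue()


if __name__ == "__main__":
    # Synthetic archives only; no real applications or results are involved.
    game = make_fake_apk({"classes.dex": b"x" * 100, "res/a.png": b"p" * 500,
                          "res/b.png": b"p" * 500, "AndroidManifest.xml": b"<m/>"})
    tool = make_fake_apk({"classes.dex": b"x" * 400, "AndroidManifest.xml": b"<m/>",
                          "res/layout/main.xml": b"<l/>"})
    centroids = {"game": centroid([extract_static_features(game)]),
                 "tool": centroid([extract_static_features(tool)])}
    unknown = make_fake_apk({"classes.dex": b"x" * 120, "res/c.png": b"p" * 800,
                             "AndroidManifest.xml": b"<m/>"})
    print(classify(extract_static_features(unknown), centroids))
```

In practice the raw counts would likely be normalized before computing distances, and a real feature set would also look inside the .dex and XML contents; the sketch omits both to stay short.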