Dynamic and Static Analysis for APKs

Leveraging static and dynamic analysis feature sets for comprehensive APK analysis.



Table of Contents

  • Overview
  • Dataset
  • Models
  • Results
  • Future Work
  • References

Overview

This research presents a comprehensive approach to analyzing Android Package Kits (APKs) using both dynamic and static analysis methods. The primary objective is to identify and categorize APKs into various types, including adware, ransomware, scareware, and benign applications. The study utilizes a dataset derived from the CIC-AndMal2017 and CICMalDroid 2020 datasets, enriched with dynamic and static features. Dynamic analysis is performed based on system calls at runtime, while static analysis involves extracting features using Androguard. The paper discusses the process of dataset creation, challenges faced, and the reasons for selecting a subset of the available samples. Various machine learning models such as XGBoost, Decision Tree, and SVM Classifier have been trained and optimized to classify the APKs, and their performance is evaluated and compared. The research also outlines future directions for enhancing the dataset and refining the analysis methods.

Dataset

The dataset was constructed for this research because comprehensive datasets covering both dynamic and static features of APKs proved difficult to find. Raw APKs were sourced from “CIC-AndMal2017” [1] and “CICMalDroid 2020” [2]. The dynamic dataset is built from system calls recorded at runtime, gathered using “Automated APK Tracing” [3]; the static dataset comprises features extracted with Androguard [4]. The final dataset is a subset of the samples available from these sources. It is provided in CSV format and contains 1,357 samples: 220 malicious (16.2%) and 1,137 benign (83.8%). Class balance was a consideration when selecting the subset, although the result remains skewed toward benign samples.
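As a rough illustration of how runtime system calls can become tabular features, the sketch below counts syscall names in an strace-style trace and lays them out as one row per APK. The trace line format, the `syscall_counts` and `to_feature_row` helpers, and the use of raw counts as features are all assumptions made for illustration; the exact output of Automated APK Tracing [3] and the columns of the published CSVs are not described here.

```python
import re
from collections import Counter

# Assumed strace-style format: an optional "[pid N]" or bare PID prefix,
# then "name(args) = result". Real traces may differ.
SYSCALL_RE = re.compile(r"^\s*(?:\[pid\s+\d+\]\s+)?(?:\d+\s+)?([a-z_][a-z0-9_]*)\(")


def syscall_counts(trace_lines):
    """Count how often each syscall name appears in a trace."""
    counts = Counter()
    for line in trace_lines:
        match = SYSCALL_RE.match(line)
        if match:  # signal and exit lines do not match and are skipped
            counts[match.group(1)] += 1
    return counts


def to_feature_row(counts, vocabulary):
    """Project the counts onto a fixed, ordered list of syscall names."""
    return [counts.get(name, 0) for name in vocabulary]


# Hypothetical trace for a single APK run.
sample_trace = [
    'openat(AT_FDCWD, "/data/app/base.apk", O_RDONLY) = 3',
    'read(3, "PK", 4096) = 4096',
    'read(3, "", 4096) = 0',
    '[pid  1234] write(1, "ok", 2) = 2',
    '1234 close(3) = 0',
    '--- SIGCHLD {si_signo=SIGCHLD} ---',
    '+++ exited with 0 +++',
]
vocabulary = ["close", "mmap", "openat", "read", "write"]

counts = syscall_counts(sample_trace)
row = to_feature_row(counts, vocabulary)
print(dict(zip(vocabulary, row)))
```

Using one fixed vocabulary for every sample keeps the columns aligned across APKs, which is what a CSV-based dataset needs.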

Note: The raw samples used to build the CSVs will not be uploaded. They can be obtained from the original sources, CIC-AndMal2017 [1] and CICMalDroid 2020 [2].

The dataset is summarized below:

Datapoint                  Result
APK Types                  Adware, Ransomware, Scareware, Benign
Dataset Size               1,357
Malicious Samples          220 (16.2%)
Benign Samples             1,137 (83.8%)
Total Available Samples    22,832
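The percentages in the summary follow directly from the sample counts. The short calculation below reproduces them and also derives two figures that are not stated above: the benign-to-malicious ratio and the fraction of the 22,832 available samples that the subset covers. Variable names are illustrative.

```python
# Counts taken from the dataset summary.
malicious = 220
benign = 1137
total_available = 22832

dataset_size = malicious + benign
malicious_share = malicious / dataset_size
benign_share = benign / dataset_size

# Derived figures (not stated in the summary itself).
imbalance_ratio = benign / malicious        # benign samples per malicious sample
coverage = dataset_size / total_available   # share of available samples used

print(f"dataset size:     {dataset_size}")
print(f"malicious share:  {malicious_share:.1%}")
print(f"benign share:     {benign_share:.1%}")
print(f"benign:malicious: {imbalance_ratio:.2f}:1")
print(f"coverage:         {coverage:.1%}")
```

With roughly five benign samples for every malicious one, per-class precision and recall are more informative than overall accuracy when reading the results.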

Models

The study employs three machine learning models: an XGBoost classifier, a Decision Tree classifier boosted with AdaBoost, and a Support Vector Classifier (SVC). The hyperparameters of each model were tuned with grid search, keeping the combination that gave the best classification accuracy.

Best parameters for the XGBoost were:

  • Column Sample By Tree - 0.6
  • Gamma - 1
  • Max Depth - 5
  • Min Child Weight - 1
  • Sub-sample - 0.8

Best parameters for the Decision Tree were:

  • Criterion - Gini
  • Max Depth - 6
  • Min Samples Leaf - 5
  • Min Samples Split - 90

Best parameters for the SVC were:

  • C - 10
  • Gamma - 0.0001
  • Kernel - Radial Basis Function
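The sketch below shows the mechanics of grid search in plain Python and records the best parameters listed above under the keyword names used by the XGBoost and scikit-learn Python APIs. It makes several assumptions: the write-up does not say which library performed the search, the candidate values in `SVC_GRID` other than the reported best ones are invented for illustration, and `expand_grid`, `grid_search`, and `toy_score` are hypothetical helpers. In particular, `toy_score` is only a stand-in for a real cross-validated metric.

```python
from itertools import product

# Best parameters reported above, using library keyword names.
BEST_XGBOOST = {"colsample_bytree": 0.6, "gamma": 1, "max_depth": 5,
                "min_child_weight": 1, "subsample": 0.8}
BEST_DECISION_TREE = {"criterion": "gini", "max_depth": 6,
                      "min_samples_leaf": 5, "min_samples_split": 90}
BEST_SVC = {"C": 10, "gamma": 0.0001, "kernel": "rbf"}

# Hypothetical search grid; only the reported best values come from the text.
SVC_GRID = {
    "C": [0.1, 1, 10, 100],
    "gamma": [0.01, 0.001, 0.0001],
    "kernel": ["rbf", "linear"],
}


def expand_grid(grid):
    """Yield one parameter dict per combination of candidate values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(grid, score_fn):
    """Score every combination and return (best_params, best_score)."""
    best_params, best_score = None, float("-inf")
    for params in expand_grid(grid):
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Stand-in for cross-validated accuracy: 0 at the reported SVC values,
    minus one for every parameter that differs from them."""
    return -sum(params[name] != BEST_SVC[name] for name in BEST_SVC)


candidates = list(expand_grid(SVC_GRID))
best_params, best_score = grid_search(SVC_GRID, toy_score)
print(len(candidates), "combinations; best:", best_params)
```

In practice the scoring function would train the model with each combination and return a cross-validated score, which is the loop that tools such as scikit-learn's `GridSearchCV` automate.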

Results

XGBoost

                Precision   Recall      F1-Score
Benign          0.978632    0.991341    0.984946
Malicious       0.947368    0.878048    0.911392
Accuracy        0.974264    0.974264    0.974264
Macro Avg       0.963000    0.934695    0.948169
Weighted Avg    0.973919    0.974264    0.973859

[Figure: Top 15 Most Important Features (XGBoost)]

Decision Tree

                Precision   Recall      F1-Score
Benign          0.966386    0.995670    0.980810
Malicious       0.970588    0.804878    0.880000
Accuracy        0.966911    0.966911    0.966911
Macro Avg       0.968487    0.900274    0.930405
Weighted Avg    0.967019    0.966911    0.965614

[Figure: Top 15 Most Important Features (Decision Tree)]

SVM Classifier

                Precision   Recall      F1-Score
Benign          0.875000    1.000000    0.933333
Malicious       1.000000    0.195121    0.326530
Accuracy        0.878676    0.878676    0.878676
Macro Avg       0.937500    0.597560    0.629931
Weighted Avg    0.893841    0.878676    0.841866
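The three result tables can be cross-checked from confusion counts. The reported recall and accuracy values are consistent with a held-out test set of 272 samples (231 benign, 41 malicious); these counts are inferred from the tables and are not stated in the write-up, so treat them as an assumption. The sketch below recomputes precision, recall, F1, and the averages from those inferred counts. The `prf` and `report` helpers are illustrative names.

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 for one class from its confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def report(benign_correct, benign_total, malicious_correct, malicious_total):
    """Rebuild a two-class report from per-class correct counts."""
    benign_missed = benign_total - benign_correct            # benign flagged as malicious
    malicious_missed = malicious_total - malicious_correct   # malicious passed as benign
    benign = prf(benign_correct, malicious_missed, benign_missed)
    malicious = prf(malicious_correct, benign_missed, malicious_missed)
    total = benign_total + malicious_total
    return {
        "benign": benign,
        "malicious": malicious,
        "accuracy": (benign_correct + malicious_correct) / total,
        # Macro average: unweighted mean of the two classes.
        "macro": tuple((b + m) / 2 for b, m in zip(benign, malicious)),
        # Weighted average: mean weighted by each class's sample count.
        "weighted": tuple((b * benign_total + m * malicious_total) / total
                          for b, m in zip(benign, malicious)),
    }


# Inferred counts: (benign correct, benign total, malicious correct, malicious total).
INFERRED_COUNTS = {
    "XGBoost": (229, 231, 36, 41),
    "Decision Tree": (230, 231, 33, 41),
    "SVC": (231, 231, 8, 41),
}

reports = {name: report(*counts) for name, counts in INFERRED_COUNTS.items()}
for name, result in reports.items():
    precision, recall, f1 = result["malicious"]
    print(f"{name}: accuracy={result['accuracy']:.6f} "
          f"malicious P={precision:.6f} R={recall:.6f} F1={f1:.6f}")
```

Under these inferred counts the SVC detects only 8 of 41 malicious samples despite its perfect malicious precision, and its accuracy is only slightly above the roughly 0.849 that labelling every test sample benign would give.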

Summary

[Figure: Model Summary for Benign Samples]

[Figure: Model Summary for Malicious Samples]

Future Work

  • Capture network traffic and derive features from it.
  • Review opportunities for unsupervised learning, which may enable lineage inference.
  • Improve the dataset by using all 22,832 available samples in order to reduce over-fitting.
  • Improve the dynamic analysis features that are collected, aiming for more balanced feature weights and potential model improvements.
  • Improve the static analysis features that are extracted, aiming for potential model improvements.

References

[1] CIC, "CICAndMal2017 Dataset," University of New Brunswick's Canadian Institute for Cybersecurity. [Online]. Available: https://www.unb.ca/cic/datasets/andmal2017.html.

[2] CIC, "CICMalDroid 2020 Dataset," University of New Brunswick's Canadian Institute for Cybersecurity. [Online]. Available: https://www.unb.ca/cic/datasets/maldroid-2020.html.

[3] R. Khoury, "Automated APK Tracing," GitHub Repository. [Online]. Available: https://github.com/RaphaelKhoury/automatedapk-tracing.

[4] Androguard, "Androguard: Reverse engineering and pentesting for Android applications," GitHub Repository. [Online]. Available: https://github.com/androguard/androguard.