1. Introduction
Mobile cloud computing (MCC) refers to the concept of providing users flexible and reliable, anywhere and anytime access to data stored in the cloud. It combines the powerful resources of cloud computing (CC) and wireless network technologies [
1]. Mobile devices (MD) are particularly popular with MCC users because of their portability, small size, robust connectivity, and capability to house and operate different third-party applications. In their normal course of activities MD users may connect to a number of cloud sources. However, some of the connected networks may not be secure and may expose the applications (apps) that reside on the MD to security threats such as breaches of data integrity, confidentiality, and service availability [
2]. In addition, when personal MDs are used at the workplace there is a possibility of introducing security threats to corporate networks [
3].
Furthermore, many of the apps installed on MDs are inherently risky due to the high number of permissions they request to operate, e.g., access to user location data, user contacts, or photo gallery. Apps downloaded from reputable app markets such as Apple’s App Store and Google Play may not be malicious but if excessive permissions are granted, personal information, sensitive data and user privacy can be compromised [
4]. For example, when an end user’s MCC devices are part of an Internet of Things (IoT) ecosystem, a malicious app that resides on a single device may negatively affect other connected devices. The threats to the IoT environment related to mobile app vulnerabilities have been mostly associated with devices using the Android mobile operating system [
5] due in part to the open publication policy that allows users to download apps from both official and non-official market stores. In turn, malware developers have shifted their attention to targeting apps that can be deployed in such open platforms [
6].
Android app developers do not always consider the potentially harmful effect of requesting multiple permissions for effective app operation and more specifically, how the requested permissions can be manipulated and misused to breach user privacy [
7]. In more recent versions, the Android permission system affords users a degree of control over granting permissions as this can be done at runtime rather than during installation. Nevertheless, granting access privileges at runtime does not solve the problem of a malicious app gaining access to sensitive personal information; users may not have the knowledge required to identify the permissions necessary for a particular app. For example, a game app may request permission to obtain access to user location data and to read the user’s contact list. If granted, this may lead to privacy leakage [
8].
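The game-app example above can be made concrete with a minimal sketch that compares the permissions an app requests against a baseline for its category. The category baselines and the helper name `unexpected_permissions` are illustrative assumptions and are not taken from any of the cited works; only the permission strings are standard Android identifiers.

```python
# Hypothetical baselines: permissions a typical app in each category plausibly needs.
# These sets are assumptions made for illustration only.
EXPECTED_PERMISSIONS = {
    "game": {"android.permission.INTERNET", "android.permission.VIBRATE"},
    "navigation": {
        "android.permission.INTERNET",
        "android.permission.ACCESS_FINE_LOCATION",
    },
}


def unexpected_permissions(category, requested):
    """Return the requested permissions that fall outside the category baseline."""
    expected = EXPECTED_PERMISSIONS.get(category, set())
    return sorted(set(requested) - expected)


# A game that asks for location data and the contact list, as in the example above.
game_request = [
    "android.permission.INTERNET",
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.READ_CONTACTS",
]
flagged = unexpected_permissions("game", game_request)
print(flagged)
```

A check of this kind only flags candidates for review; as noted above, a typical user is unlikely to know which baseline is appropriate for a given app.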
Inter-app communication channels may also pose a risk. For example, apps that appear benign on their own may be capable of performing a malicious task when working together (malware collusion). Such potentially dangerous apps may be hard to detect [
9].
While earlier research in the area of app security investigated how to distinguish between malicious and benign apps [10, 11, 12, 13, 14], more recent models and methods focus on evaluating the potential harmfulness of an app rather than classifying it as malicious or benign. For example, Feng et al. [
4] used app permissions and descriptions to determine the riskiness of an app. Similarly, Wang et al. [
15] proposed a framework that quantified app riskiness based on the permissions requested.
Alshehri et al. [
7] developed a model that measures the security risk of Android apps based on the permissions that the user approves. The model (named PUREDroid) estimates the magnitude of the damage that might occur as a result of excessive permission granting. For each app resident on the device, PUREDroid first creates two orthonormal state vectors representing the permissions the app has and has not requested. Then it determines the risk score of each app, considering the number of times known benign and malicious apps have requested each of the permissions requested by the app. Higher scoring apps are deemed potentially malicious. However, the accuracy of the model is not high; benign apps that request excessive permissions will also receive a high-risk score and will be deemed potentially malicious.
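The limitation noted above can be illustrated with a simplified frequency-based score. This sketch is not the PUREDroid formulation (it omits the state vectors entirely); the scoring rule, the function names, and the request counts are all assumptions chosen only to show why a benign app that requests many permissions accumulates a high score.

```python
def permission_risk(perm, malicious_counts, benign_counts):
    """Share of observed requests for `perm` that came from known malicious apps."""
    m = malicious_counts.get(perm, 0)
    b = benign_counts.get(perm, 0)
    return m / (m + b) if (m + b) else 0.0


def app_risk_score(requested, malicious_counts, benign_counts):
    """Sum of per-permission risks; every extra permission can only raise the score."""
    return sum(permission_risk(p, malicious_counts, benign_counts) for p in requested)


# Made-up request counts from hypothetical sets of known malicious and benign apps.
malicious_counts = {"SEND_SMS": 80, "READ_CONTACTS": 60, "INTERNET": 95}
benign_counts = {"SEND_SMS": 20, "READ_CONTACTS": 40, "INTERNET": 95}

lean_app = ["INTERNET"]
greedy_benign_app = ["INTERNET", "READ_CONTACTS", "SEND_SMS"]

lean_score = app_risk_score(lean_app, malicious_counts, benign_counts)
greedy_score = app_risk_score(greedy_benign_app, malicious_counts, benign_counts)
print(lean_score, greedy_score)
```

Under this additive rule the second app scores higher purely because it requests more permissions, regardless of whether it ever misuses them.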
Also, based on app permission analysis, Rashidi et al. [
16] proposed a risk assessment model named XDroid that monitors the resource usage of Android devices. Adopting a probabilistic approach (hidden Markov model), XDroid models app behavior and performs an adaptive assessment of the apps residing on the device. Users select the resources they want monitored; the system then alerts the user of suspicious activities related to the selected resources. As XDroid relies on user decisions, lack of user expertise may negatively affect the choice of resources to be monitored and, thus, the system’s effectiveness.
The model proposed by Jing et al. [
17] helps the user understand and mitigate the security risks that are associated with mobile apps and, in particular, with Android-based apps. Their model (RISKMON) computes a risk score baseline that is derived from the runtime behaviour of trusted apps and user expectations. The risk score baseline results are used to evaluate actual app behaviour and generate a cumulative risk score. RISKMON increases an app’s baseline risk score every time the app attempts to access a sensitive or critical device resource. RISKMON considers permission-protected resources, assuming that user assets are only reachable through the protection of permissions. However, the model reinforces resource protection by automatic permission revocation that does not require user consent; this may affect the effectiveness and efficiency of the MD user activities when using some of the services requested.
A comprehensive three-layer framework for assessing the risk posed by mobile apps was proposed by Li et al. [
18]. Using a Bayesian graphical model, the system conducts static, dynamic and behavioral analyses to assess the risk the app introduces to the mobile environment. The framework provides the user with information about apps that have lower risk profiles. However, the risk assessment is completed only after the app has been executed and the results of the analyses at the three layers have been combined to achieve the final app risk score. Similarly, the models proposed by Kim et al. [
5,
19] consider the app’s actual behaviour but the app needs to be executed to enable risk assessment. Such an approach leaves the MD dangerously vulnerable to possibilities of compromise.
Baek et al. [
20] proposed to measure the potential security event (e.g., financial loss or loss of private data) frequency as an indicator of the riskiness of an app. They used a set of known benign and malicious apps and applied an unsupervised learning approach to create an app risk map. The MD user can make a decision about using a particular app based on the plotted frequencies on the risk map. The model does not include preventative action when a risk is identified. A comprehensive approach to using information about the app from several sources is undertaken in the work of Kong et al. [
21]. However, the proposed model also relies on user judgment when determining the risk posed by an app.
More recent research investigates how to increase accuracy of the prediction. For example, both Urooj et al. [
22] and Boukhamla & Verma [
23] propose ensemble machine learning (ML) models that work with a wide range of static features (including app permissions and intents). Panigrahi et al. [
24] improve the feature selection process by adopting a high-performing, nature-inspired approach to select the most suitable static features for the ML classification model (HyDroid). The model named DroidDetectMW proposed by Taher et al. [
25] uses both static and dynamic features to classify apps (benign/malicious), and utilizes multi-class classification to determine the category a malicious app belongs to. However, the effective implementation of these complex proposed solutions may be hampered by the limitations of the computational environment of the MD.
The research reviewed has recognized the importance of assessing the risk to the MD posed by mobile apps residing on the device. Three major challenges to accurate app risk evaluation can be highlighted: (i) How to reduce or eliminate dependency on the inherently unreliable user input? (ii) How to bypass the need to execute an app that may be malicious in order to establish its riskiness? and (iii) How to increase the accuracy of the risk evaluation so that apps are not falsely categorized as risky?
In this research, we develop and evaluate a framework for app risk assessment that addresses these challenges. In addition to using app permissions as important app characteristics, the framework considers app intents in order to capture data about app-to-app communication behaviour. The framework includes an ensemble ML classification model and a probabilistic app risk assessment evaluator. It does not require user input or running the app that is being evaluated. Rather, an app is assigned a risk category based on the app’s classification (benign or malicious) by the ML classifier and the app’s probabilistically estimated riskiness. Using the two-pronged approach mitigates the risk of falsely classifying an app as malicious.
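As a rough illustration of how two such signals might be combined, the sketch below maps a classifier label and a probabilistic risk estimate to a category. The category names, the 0.5 threshold, and the combination rule are hypothetical placeholders and do not represent the mapping defined by the framework itself.

```python
def risk_category(classified_malicious: bool, risk_probability: float,
                  threshold: float = 0.5) -> str:
    """Combine a classifier label with a probabilistic risk estimate.

    The threshold and the three category names are illustrative assumptions.
    """
    if not 0.0 <= risk_probability <= 1.0:
        raise ValueError("risk_probability must lie in [0, 1]")
    high_estimate = risk_probability >= threshold
    if classified_malicious and high_estimate:
        return "high"    # both signals agree that the app is risky
    if classified_malicious or high_estimate:
        return "medium"  # the signals disagree, so avoid the strongest label
    return "low"         # both signals agree that the app is low risk


print(risk_category(True, 0.9))
print(risk_category(True, 0.2))
print(risk_category(False, 0.1))
```

Requiring agreement between the two signals before assigning the highest category is one simple way to limit false alarms from either signal alone.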
The rest of the paper is organized as follows:
Section 2 provides a description of the Android OS security mechanisms used in this research and the methods involved in data collection and analysis. The proposed risk assessment framework and evaluation results are presented and discussed in Sections 3 and 4. Directions for further research are also outlined.