Automatic Network Protocol Behaviour Analysis for Android Applications using WEKA

May 25, 2019


This is a student submitted essay. This is not an example of the work completed by our expert academics.


Academic Subject: Protocol Behaviour Analysis

Word Count: 3000

Submitted by: Student

1.1 Introduction

According to Kim et al. (2016a), Android applications are an important class of Internet applications, generating on average about 60 percent of mobile Internet traffic. Over 2 million applications are available in app markets such as Google Play, the Apple App Store, and the Microsoft Store, and many new mobile applications are added each month. Nevertheless, very little is known about the protocol behaviour of Android applications, since they mainly use proprietary protocols on top of HTTP(S). The problem is made worse by the common use of generic data formats such as JSON and XML. Because of this, examining the protocol behaviour of Android applications requires an in-depth understanding of the application-level payload of each specific application (Wang et al., 2015).

Holland et al. (2015) argue that understanding the behaviour of Android applications in a network is vital for providing value-added services such as application acceleration and dynamic caching. It is also important for several other tasks, including testing protocol implementations, analysing malware, and replaying application dialogues. Nonetheless, protocol behaviour analysis is still largely manual and demands a great deal of human effort in reverse engineering. For Android applications, the widespread use of, and default support for, code obfuscation tools such as ProGuard makes the process harder still (Karim et al., 2016).

According to Hu et al. (2015), the constant Internet connectivity of Android phones, together with their widespread use and the willingness of end-users to try new mobile applications, makes them a significant target for exploitation. Malware may easily pose as a harmless, must-have application while causing damage ranging from the loss of communication details to locking the user out of the smartphone. For example, Android malware has been used to steal and alter personal information. Furthermore, recent studies show a rising malware threat in Android environments even where anti-malware checks are in place. Google's Bouncer service has been used in the Google Play market to detect malicious applications, but attackers appear to have found ways to evade detection. Worse still, third-party markets that offer Android applications for download may readily distribute malicious applications (Flegel et al., 2013).

Another problem is that not every application reveals its malicious behaviour when end-users install or execute it on an Android phone. Instead, the malicious behaviour may be triggered only under particular circumstances (Szabó et al., 2011). For example, a piece of malware may stay dormant until it is activated by a certain event in the Android system. Some events, such as the presence of a network connection, are independent of end-user interaction with the application, whereas others depend on end-user input. A good deal of software testing is carried out to check the functionality of applications, and security testing of Android applications has lately been gaining acceptance and has demonstrated its significance. Nevertheless, testing Android applications is far from straightforward because of the diversity of input data and the heterogeneity of the technologies involved (Xiaojuan et al., 2010).

In this research, the researcher will present the first comprehensive protocol analysis framework that automatically extracts protocol behaviours, message formats, and message signatures by leveraging binary analysis. Using binary analysis for Android application protocol analysis, however, requires resolving two distinctive challenges that have not been addressed previously. First, while the application binary is openly available in the app market, only the client application is obtainable (Karim et al., 2016). Typically the server binary, the protocol documentation, and the source code are not available. Thus, the researcher must rely exclusively on the client application, unlike other methods that use both the server and client binaries or even the source code. Second, the analysis must achieve high coverage and accurately capture the message formats and the relationships between messages (Xiaojuan et al., 2010). This is vital because protocol messages frequently contain dependencies: for example, an authentication token used in subsequent requests may be embedded in a preceding login response, or a URI component such as an image ID may be taken from a previous response (Hu et al., 2015).
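The kind of dependency between messages described above can be illustrated with a minimal Python sketch. It assumes a hypothetical trace of already-decoded transactions, each flattened to key/value pairs; the function name `find_dependencies` and the field names (`auth_token`, `image_id`) are illustrative assumptions and not part of WEKA Extractocol. The sketch matches values in observed traffic rather than analysing the binary, so it only shows what a dependency looks like, not how the framework finds one.

```python
def find_dependencies(transactions, min_len=4):
    """Return (src_index, response_field, dst_index, request_field) tuples
    where a value from an earlier response reappears in a later request."""
    deps = []
    for i, earlier in enumerate(transactions):
        for field, value in earlier["response"].items():
            value = str(value)
            if len(value) < min_len:  # skip short values that match by chance
                continue
            for j in range(i + 1, len(transactions)):
                for req_field, req_value in transactions[j]["request"].items():
                    if value in str(req_value):
                        deps.append((i, field, j, req_field))
    return deps


# Hypothetical two-message trace: a login followed by an image fetch.
trace = [
    {"request": {"uri": "/login", "user": "alice"},
     "response": {"auth_token": "tok-8f3a21", "image_id": "img-7731"}},
    {"request": {"uri": "/images/img-7731",
                 "Authorization": "Bearer tok-8f3a21"},
     "response": {"status": "ok"}},
]

dependencies = find_dependencies(trace)
for dep in dependencies:
    print(dep)
```

Here the login response supplies both the token reused in the `Authorization` header and the image ID embedded in the next request URI.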

Thus, the researcher will present the WEKA Extractocol system, which will provide automatic and comprehensive analysis of application protocol behaviour (Kim et al., 2016a). WEKA Extractocol uses only the Android application binary as input, reconstructs HTTP transactions, and identifies their message formats and relationships using binary analysis (Karim et al., 2016). In particular, WEKA Extractocol extracts the parts of the application code, known as program slices, that either generate requests or parse response messages. It then reconstructs the dependencies between these slices. Lastly, it uses careful semantic analysis to extract message formats and signatures from the target application (Hu et al., 2015).

1.2 Aims and Objectives

1.2.1 General Aim and Objective

The main aim of this research project is to extract the application-layer network communication behaviour carried over HTTP(S) and to provide a complete analysis of each individual application, rather than a large-scale survey.

1.2.2 Specific objectives

  • To analyse network protocol behaviour in Android applications
  • To classify the parameters used to detect Android application behaviour
  • To detect Android application behaviour using these parameters

1.3 Research Questions

  • What are the network protocol behaviours of various Android applications?
  • What are the parameters used to detect the network protocol behaviours of various Android applications?
  • How can the network protocol behaviours of various Android applications be detected using these parameters?

Chapter 2: Literature Review

2.1 Introduction

In this chapter the researcher summarizes the related literature on automatic network protocol behaviour analysis for Android applications using WEKA. The chapter is divided into four sections: following this introduction, section 2.2 covers Android application security studies, section 2.3 covers WEKA analysis of the run-time behaviour of Android applications, and section 2.4 discusses the design of WEKA execution on Android application code, together with its limitations.

2.2 Android application security studies

Android has seen steady adoption since its release in 2008 by manufacturers, mobile end-users, and application developers. In 2016 there were more than 2 billion monthly Android end-users, and the Google Play store listed over 2 million Android applications. This adoption has come at the expense of other mobile platforms, since Android devices account for about 85 percent of mobile phone sales globally. Because of this shift towards Android-powered phones, Android applications have become a target of network attacks (Zhang and Yin, 2016).

2.3 WEKA Analysis

Understanding application behaviour has important implications for networks. Besides its intrinsic value, it allows networks to offer new application-aware services, usually in combination with other network technologies (Cagica, 2017). In this section, the researcher discusses how WEKA analysis works. First, it enables application-aware treatment in networks. Using request signatures and sufficiently detailed message dependencies, an analyser can intelligently pre-fetch content, which is one of the key building blocks of application acceleration (Khan, 2015).

Currently, the programming of dynamic caching proxies is done manually on a per-application basis, since it requires an understanding of application semantics, such as which request parameters are dynamically generated, to determine which content is cacheable.

Figure 2.1: Application acceleration, Source: Jeong-Min, 2016

2.4 Design

To achieve this objective, WEKA Extractocol will carry out three main tasks. First, it generates program slices that capture network communication. Second, it combines semantic analysis with data dependency analysis to reconstruct message formats and signatures. Finally, it identifies fine-grained dependencies amongst protocol messages by determining inter-slice dependencies. Figure 2.2 shows the three modules (Wang et al., 2013).

Figure 2.2: Overview of WEKA Extractocol, Source: Jeong-Min, 2016

2.4.1 Network-aware application slicing

A typical Android application contains far more code than just protocol handling. Therefore, WEKA Extractocol pre-processes the Android APK to extract only the program slices related to protocol handling (Wang et al., 2013). The objective of this stage is to produce the program slices that generate HTTP requests and process responses. WEKA Extractocol extracts every program slice that includes objects that either flow out to, or come in from, the communication network. The researcher calls the outbound data flow a request slice, since it captures the code and objects for building requests, and the inbound flow a response slice, since it captures the code and objects used for processing a response. To obtain these slices, WEKA Extractocol will use a novel bi-directional taint analysis (Xiaojuan et al., 2010).
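The idea of request and response slices can be sketched on a toy straight-line program in Python. This is a deliberately simplified illustration under stated assumptions: each statement is a hypothetical tuple of (defined variable, used variables, source text), and the helper names `backward_slice` and `forward_slice` are invented here. A real taint analysis works on application bytecode and must handle control flow, method calls, and aliasing, none of which is modelled.

```python
def backward_slice(statements, sink_var):
    """Collect the statements that the value of sink_var depends on
    (a request slice: everything that flows out to the network)."""
    needed = {sink_var}
    kept = []
    for defined, used, text in reversed(statements):
        if defined in needed:
            kept.append(text)
            needed.discard(defined)  # this definition satisfies the need
            needed.update(used)      # now its inputs are needed instead
    return list(reversed(kept))


def forward_slice(statements, source_var):
    """Collect the statements that consume data derived from source_var
    (a response slice: everything that flows in from the network)."""
    tainted = {source_var}
    kept = []
    for defined, used, text in statements:
        if tainted.intersection(used):
            kept.append(text)
            tainted.add(defined)
    return kept


# Hypothetical program: (defined variable, used variables, source text).
program = [
    ("base", [], 'base = "https://api.example.com"'),
    ("theme", [], "theme = load_theme()"),
    ("user_id", [], "user_id = read_input()"),
    ("url", ["base", "user_id"], 'url = base + "/users/" + user_id'),
    ("title", ["theme"], "title = render(theme)"),
    ("request", ["url"], "request = http_get(url)"),
    ("response", ["request"], "response = send(request)"),
    ("name", ["response"], "name = parse(response)"),
    ("label", ["title", "name"], "label = title + name"),
]

request_slice = backward_slice(program, "request")
response_slice = forward_slice(program, "response")
print(request_slice)
print(response_slice)
```

The request slice omits the theme-related statements because they never reach the network, while the response slice picks up `label` because it consumes a value parsed from the response.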

2.4.2 Signature extraction

The next stage takes request and response slices as input and produces message formats and signatures for both. Because a program slice captures all the objects and operations that generate a request or process a response, it encodes all the information needed to extract their signatures. For signature extraction, WEKA Extractocol performs analysis using semantic models of commonly used Android and Java APIs for HTTP processing. It then produces the request method and the signatures for request URIs and for request and response headers and bodies (Zhang and Yin, 2016).
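As a rough illustration of what a request URI signature might look like, the following Python sketch turns a sequence of constant and variable parts into a regular expression. The representation of parts, the helper `build_signature`, and the example URI are all assumptions made for this sketch; the source does not specify the signature format that WEKA Extractocol actually emits.

```python
import re


def build_signature(parts):
    """Build a regex from ordered ("const", text) and ("var", name) parts.

    Constant parts come from string literals in a request slice; variable
    parts stand for values only known at run time.
    """
    pattern = ""
    for kind, value in parts:
        if kind == "const":
            pattern += re.escape(value)
        else:
            # A variable matches one path or query component.
            pattern += "(?P<%s>[^/?&]+)" % value
    return re.compile("^" + pattern + "$")


# Hypothetical parts recovered from a request slice.
uri_parts = [
    ("const", "https://api.example.com/users/"),
    ("var", "user_id"),
    ("const", "/photos?size="),
    ("var", "size"),
]

signature = build_signature(uri_parts)
match = signature.match("https://api.example.com/users/42/photos?size=large")
print(match.groupdict())
```

Naming the variable parts makes it possible to read the run-time values back out of a matching request, which is what dependency analysis between messages would need.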


Chapter 3: Methodology and Design

3.1 Introduction

This chapter discusses the methodology that the researcher will use to automatically detect and analyse network protocol behaviour for Android applications using WEKA Extractocol. The researcher will carry out a deep analysis of the sensitive information that applications, in breach of the guidelines, transmit over the Internet or through the wireless network card, and will then categorise it into two classes. In sub-chapter 3.1 the researcher will discuss the WEKA Extractocol environment that will be used for this detection and analysis. In sub-chapter 3.2 the researcher will show how the dataset of taints will be collected. In sub-chapter 3.3 the researcher will discuss how the data will be processed for feature extraction, and in sub-chapter 3.4 the behaviour-based analysis module. In sub-chapter 3.5 the researcher will briefly describe some machine learning classifiers that will be tried on the dataset. Finally, in sub-chapter 3.6 the application designed for the problem will be presented as an algorithm and pseudo-code (Wei et al., 2012).

3.2 Database (Dataset)

The dataset used by the researcher will comprise different types of taints from up to 50 sample Android applications (Wei et al., 2012). Of these, 50 percent will be normal (benign) and the remainder infected (malware). The benign applications will be collected from the official Android website and other existing Android markets, using smartphones running the Android operating system, with WEKA 3.8.1 as the analysis toolkit (Wei et al., 2012).
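A labelled dataset of this kind is typically handed to WEKA as an ARFF file. The Python sketch below writes one under stated assumptions: the feature names (`http_requests`, `distinct_hosts`, `bytes_sent`), the sample values, and the helper `to_arff` are hypothetical placeholders, since the source does not list the features that will be extracted.

```python
def to_arff(relation, feature_names, rows):
    """Render rows of (feature_values, label) as ARFF text for WEKA.

    Labels are assumed to be either "benign" or "malware".
    """
    lines = ["@relation " + relation, ""]
    for name in feature_names:
        lines.append("@attribute %s numeric" % name)
    lines.append("@attribute class {benign,malware}")
    lines.append("")
    lines.append("@data")
    for values, label in rows:
        lines.append(",".join(str(v) for v in values) + "," + label)
    return "\n".join(lines) + "\n"


# Hypothetical per-application network features and labels.
features = ["http_requests", "distinct_hosts", "bytes_sent"]
samples = [
    ([12, 3, 4096], "benign"),
    ([87, 21, 150000], "malware"),
]

arff_text = to_arff("android_network_behaviour", features, samples)
print(arff_text)
```

Keeping the class attribute last follows the common WEKA convention, so the file can be loaded without manually selecting the class index.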

Figure 3.1 below shows the technique the researcher will use to build the training and test datasets from the application sources; the categorised applications have been checked by an authentication system (Kim et al., 2016).



Chapter 4: Primary Research

4.1 Introduction

Primary research is data gathered specifically for the purposes of a study. It involves collecting original information to answer the study questions (Somarriba et al., 2016). Primary research may take different forms, such as interviews, surveys, or focus groups. The most important thing in primary research is the production of reliable and consistent data. It is equally vital to carry out the study in an ethical manner (Sufatrio et al., 2015).

Chapter four concentrates on the experiments carried out and on the evaluation and results obtained in this research. In sub-chapter 4.2 the researcher discusses how the experiments run Android applications, study their behaviour at execution time, gather data, and build the datasets (Holland et al., 2015). In sub-chapter 4.3 the researcher illustrates how the learning data was obtained (Gascon et al., 2013). Sub-chapter 4.4 discusses application deployment and the design of the tool, known as WEKA Extractocol, that gathers features from the log. Sub-chapter 4.5 discusses the log parsing methodology and the algorithm tests made to select the best classifier, and finally the classification results and discussion are presented in section 4.6 (Gascon et al., 2013).

The researcher performed in-depth case studies using a number of Android applications and showed that WEKA Extractocol is able to provide detailed analysis of each individual application (Gascon et al., 2013). The primary research answers three key research questions:

  1. What are the network protocol behaviours of various Android applications?
  2. What are the parameters used to detect the network protocol behaviours of various Android applications?
  3. How can the network protocol behaviours of various Android applications be detected using these parameters?

4.2 Validation of protocol analysis

Network protocol behaviour analysis needs high coverage, identifying as many messages as possible (Suarez-Tangil et al., 2014). Likewise, the signatures need to be logically equivalent to the behaviour encoded in the target application and must produce valid matches on real network traces. The researcher tested both criteria (Suarez-Tangil et al., 2014).
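One simple way to check signatures against real traces is to measure what fraction of captured requests is matched by at least one signature, and which signatures never match anything. The Python sketch below does this under stated assumptions: the signatures are hypothetical regular expressions, the trace is a made-up list of URIs, and `coverage` is an illustrative helper rather than the evaluation procedure used in the study.

```python
import re


def coverage(signatures, trace_uris):
    """Return (fraction of trace URIs matched, signatures that never matched)."""
    matched_uris = 0
    used = set()
    for uri in trace_uris:
        hits = [name for name, rx in signatures.items() if rx.fullmatch(uri)]
        if hits:
            matched_uris += 1
            used.update(hits)
    unused = sorted(set(signatures) - used)
    ratio = matched_uris / len(trace_uris) if trace_uris else 0.0
    return ratio, unused


# Hypothetical signatures extracted from an application.
signatures = {
    "login": re.compile(r"https://api\.example\.com/login"),
    "photo": re.compile(r"https://api\.example\.com/photos/[^/]+"),
    "search": re.compile(r"https://api\.example\.com/search\?q=.+"),
}

# Hypothetical URIs captured while exercising the application.
trace = [
    "https://api.example.com/login",
    "https://api.example.com/photos/7731",
    "https://api.example.com/photos/7732",
    "https://api.example.com/feed",
]

ratio, unused = coverage(signatures, trace)
print(ratio, unused)
```

An unmatched trace entry points to a message the analysis missed, whereas an unused signature may simply mean that the manual interaction never triggered that request.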

4.2.2 Dataset

The researcher used a number of open-source applications, several of which are also available on Google Play, together with commercial applications with over a million downloads each, drawn from five categories in Google Play (Patel and Buddadev, 2015). For the open-source applications, the researcher obtained ground truth by carefully reviewing the source code. The researcher collected traffic traces of each application's HTTP(S) communication using manual UI fuzzing (Patel and Buddadev, 2015). To capture and decrypt HTTPS communication, the researcher used man-in-the-middle proxies. The researcher also obfuscated the open-source applications using ProGuard and verified that the results were the same as for the non-obfuscated Android applications (Patel and Buddadev, 2015).



  • Almin, S.B., Chatterjee, M., 2015. A Novel Approach to Detect Android Malware. Procedia Comput. Sci., International Conference on Advanced Computing Technologies and Applications (ICACTA) 45, 407–417. doi:10.1016/j.procs.2015.03.170
  • Antunes, J., Neves, N., 2011. Automatically Complementing Protocol Specifications from Network Traces, in: Proceedings of the 13th European Workshop on Dependable Computing, EWDC ’11. ACM, New York, NY, USA, pp. 87–92. doi:10.1145/1978582.1978601
  • Dixon, B., Mishra, S., 2010. On rootkit and malware detection in smartphones, in: 2010 International Conference on Dependable Systems and Networks Workshops (DSN-W). Presented at the 2010 International Conference on Dependable Systems and Networks Workshops (DSN-W), pp. 162–163. doi:10.1109/DSNW.2010.5542600
  • Faruki, P., Bharmal, A., Laxmi, V., Ganmoor, V., Gaur, M.S., Conti, M., Rajarajan, M., 2015. Android Security: A Survey of Issues, Malware Penetration, and Defenses. IEEE Commun. Surv. Tutor. 17, 998–1022. doi:10.1109/COMST.2014.2386139
  • nce on Computational Intelligence and Security. Presented at the 2010 International Conference on Computational Intelligence and Security, pp. 329–333. doi:10.1109/CIS.2010.77
  • Somarriba, O., Zurutuza, U., Uribeetxeberria, R., Delosières, L., Nadjm-Tehrani, S., 2016. Detection and Visualization of Android Malware Behavior [WWW Document]. J. Electr. Comput. Eng. doi:10.1155/2016/8034967
  • Suarez-Tangil, G., Tapiador, J.E., Peris-Lopez, P., Ribagorda, A., 2014. Evolution, Detection and Analysis of Malware for Smart Devices. IEEE Commun. Surv. Tutor. 16, 961–987. doi:10.1109/SURV.2013.101613.00077
  • Sufatrio, Tan, D.J.J., Chua, T.-W., Thing, V.L.L., 2015. Securing Android: A Survey, Taxonomy, and Challenges. ACM Comput Surv 47, 58:1–58:45. doi:10.1145/2733306
  • Tam, K., Feizollah, A., Anuar, N.B., Salleh, R., Cavallaro, L., 2017. The Evolution of Android Malware and Android Analysis Techniques. ACM Comput Surv 49, 76:1–76:41. doi:10.1145/3017427
  • Wang, J., Xue, Y., Liu, Y., Tan, T.H., 2015. JSDC: A Hybrid Approach for JavaScript Malware Detection and Classification, in: Proceedings of the 10th ACM Symposium on Information, Computer and Communications Security, ASIA CCS ’15. ACM, New York, NY, USA, pp. 109–120. doi:10.1145/2714576.2714620
  • Wei, T.E., Mao, C.H., Jeng, A.B., Lee, H.M., Wang, H.T., Wu, D.J., 2012. Android Malware Detection via a Latent Network Behavior Analysis, in: 2012 IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications. Presented at the 2012 IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications, pp. 1251–1258. doi:10.1109/TrustCom.2012.91
  • Xueming, L., 2013. Access Control Research Based on Trusted Computing Android Smartphone, in: 2013 Third International Conference on Intelligent System Design and Engineering Applications. Presented at the 2013 Third International Conference on Intelligent System Design and Engineering Applications, pp. 213–215. doi:10.1109/ISDEA.2012.56
  • Cagica, C., Luisa, 2017. Handbook of Research on Entrepreneurial Development and Innovation Within Smart Cities. IGI Global.
  • Chen, X., Zhu, S., 2015. DroidJust: Automated Functionality-aware Privacy Leakage Analysis for Android Applications, in: Proceedings of the 8th ACM Conference on Security & Privacy in Wireless and Mobile Networks, WiSec ’15. ACM, New York, NY, USA, p. 5:1–5:12. doi:10.1145/2766498.2766507


