Abstract—Automatic classification of virus samples into a concept hierarchy has attracted considerable attention from the malware research community. Such a hierarchy would give anti-virus experts a clear and systematic view of the landscape of virus samples, whose number has been increasing rapidly. However, this is not a trivial task, since malware usually comes in binary form, with behaviour that is complicated and obfuscated. Therefore, typical data mining approaches based on feature extraction are not easily applied.
In this paper, we introduce an approach that uses Formal Concept Analysis (FCA) to generate a malware hierarchy. Since virus behaviours are often described effectively by temporal logic, we extend the formal paradigm of FCA by using Logical Concept Analysis (LCA), where concepts are generalized by logic formulas. We also enhance the basic LCA into Viral Logical Concept Analysis (V-LCA), in which abstraction techniques are used to abstract the formal concepts representing virus samples. Our approach has been applied to a real virus dataset, and promising experimental results have been obtained.
Index Terms—Computer virus, malicious software, malware detection, formal concept analysis, logical concept analysis, viral logical concept analysis, conceptual clustering.
B. T. Nguyen, T. T. Quan, and H. M. Nguyen are with Ho Chi Minh City University of Technology, Vietnam (e-mail: email@example.com, firstname.lastname@example.org, email@example.com).
D. C. Tran is with Dong Nai University, Vietnam (e-mail: firstname.lastname@example.org).
Cite: Nguyen Thien Binh, Tran Cong Doi, Quan Thanh Tho, and Nguyen Minh Hai, "Viral Logical Concept Analysis for Malware Conceptual Hierarchy Generation," International Journal of Machine Learning and Computing, vol. 7, no. 4, pp. 49-54, 2017.