CS8395 — Homework 1

Virtual Machines, Stack Smashing, and Metasploit

This assignment is due Wednesday, February 7 at 11:59PM Central time.

Introduction

This class focuses on the discussion of cybersecurity research. This homework introduces foundational tools and concepts relevant to cybersecurity:

While you should read this spec in its entirety, you can find the specific items to turn in by searching for "Turn-in".

Turn-in for HW1

Read this assignment document and complete the tasks described. The final deliverable is a zip file containing your exploit.py, badfile, and badfile-shell files, as well as a written PDF report. There is no length requirement or limit for the report, but I expect this will take 3 pages or fewer (depending on the size of your screenshots where appropriate).

Ensure that your name, VUNetID, and email address appear at the top of each page.

Ensure that you have three sections to your submission labeled: Task 1, Task 2, and Task 3.

I strongly recommend the use of LaTeX to complete this (and all) assignments. LaTeX is the lingua franca for scientific communication in computer science — every peer-reviewed publication I have submitted and reviewed has been written in LaTeX. You can use overleaf.com to help you write LaTeX documents. I have also created a template that you can copy, available here.

Task 1: Virtual Machines and Linux Fundamentals

A Virtual Machine is an emulation of a computer system. Loosely, you can think of a VM as a program that can run a whole virtual computer system. Virtual machines are powerful software systems that enable running software for one operating system inside another.

For example, you can use your Windows host computer to run a Virtual Machine that contains a Linux operating system. Consider the image below:

This is a Windows 10 host computer running three different Virtual Machine guests. The guest instances are complete (virtual) environments that are isolated from the host. All of the guests share the host's hardware as they execute — each window in the screenshot above lets you interact with a separate emulated guest.

Thus, even though the host is a Windows computer, you can use one of the guests to execute Linux software inside the guest. Virtual Machines can be used in many combinations. You can have a Windows, Linux, or Mac host computer, and run arbitrary numbers and combinations of Linux and Windows guests. Finally, guests are stored as files in the host computer — this means you can move your VM guest from one host to another by transferring the file around.

Virtual Machines are a critical part of computer security research and practice. VM guest instances allow analysts to safely execute certain malicious code without damaging underlying system software or data. Moreover, VMs can be instrumented to analyze execution of malware samples to measure what damage the sample does. Note that the use of VMs in this manner is often referred to as sandboxing.

Kali Linux

VMs are useful in a variety of contexts for myriad purposes. For example, in cloud computing, a provider like Amazon or Microsoft can create and lease VM guests to paying customers. As a more specific example, a malware analysis pipeline may involve creating multiple VM environments in which to run and examine many thousands of malware samples in sequence.

However, VMs are also very useful for Penetration Testing ("pentesting," "ethical hacking," "white hat hacking"). A large variety of tools are available for pentesting, including debuggers (e.g., gdb), disassemblers (e.g., Ghidra), network monitoring (e.g., Wireshark), intrusion detection (e.g., Fireeye), and more. Because Penetration Testing is such an important part of cybersecurity, an organization called Offensive Security has packaged many of these tools together into a Linux distribution called Kali Linux.

Kali Linux out-of-the-box includes a set of tools used in pentesting. It is an invaluable asset in starting with cybersecurity research and practice. For example, Kali contains a dictionary of commonly-used passwords for conducting dictionary attacks. You can find this dictionary under /usr/share/wordlists/rockyou.txt.gz. We will use this to demonstrate some basic shell scripting.
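The dictionary is gzip-compressed, so it has to be decompressed before it can be searched. As a minimal sketch (the helper name first_words is hypothetical, and the latin-1 decoding is an assumption chosen so that unusual bytes in the list never raise a decoding error), Python's standard gzip module can stream the file without unpacking it to disk:

```python
import gzip
import os
from itertools import islice

def first_words(path, count=5):
    """Return the first `count` entries of a gzip-compressed wordlist."""
    # latin-1 maps every possible byte to a character, so decoding cannot fail.
    with gzip.open(path, mode="rt", encoding="latin-1") as handle:
        return [line.rstrip("\n") for line in islice(handle, count)]

if __name__ == "__main__":
    # Path quoted in the text above; it is only present on a Kali install.
    wordlist = "/usr/share/wordlists/rockyou.txt.gz"
    if os.path.exists(wordlist):
        print(first_words(wordlist))
```

Streaming line by line keeps memory use small even though the decompressed list is large.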

Warning: If you are using an ARM-based host like Apple M1 or M2, you may want to consider the following alternatives:

Turn-in for Task 1: VM Setup and Linux Fundamentals

Your first task is to set up a Virtual Machine that runs Kali Linux. There are several options, including VirtualBox, VMware, and QEMU. VirtualBox and QEMU are both freely available, and VMware is a commercial product that may require a paid license. For this assignment, I strongly recommend VirtualBox, but you are welcome to use any platform.

Once you have installed a Virtual Machine platform, use it to create a VM guest and install Kali Linux on it.

  • Document that you have downloaded and installed an appropriate Virtual Machine platform with a Kali Linux guest running. A screenshot will suffice.
  • Once your VM is set up and running, use it to search the password dictionary for all passwords that contain a contiguous sequence of 3 pairs of letters but do not end in a number (e.g., "bookkeeper" counts because it contains oo, kk, ee in sequence, but "bookkeeper1" would not count). Sort the list alphabetically, and return the first 10 such passwords. Include them in your writeup.

    As a hint, you can complete this in a single line on the terminal. Consider using zcat, egrep, sort, and head utilities, as well as pipes in the terminal.
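The filter, sort, and truncate stages of such a pipeline can be sketched in Python. This is an illustration only: the pattern below is deliberately simpler than the one the task asks for (it matches a single doubled letter), the word list is made up, and the helper name filter_sort_head is hypothetical. The point is the regular-expression backreference, where \1 matches whatever the first group just captured.

```python
import re

# A letter immediately followed by the same letter (backreference \1).
DOUBLED = re.compile(r"([A-Za-z])\1")

def filter_sort_head(words, pattern, limit):
    """Keep words matching `pattern`, sort them, and return the first `limit`."""
    matches = [word for word in words if pattern.search(word)]
    return sorted(matches)[:limit]

# Made-up sample input standing in for the real dictionary.
sample = ["letmein", "football", "abc123", "cheese", "hello", "apple"]
print(filter_sort_head(sample, DOUBLED, 3))  # ['apple', 'cheese', 'football']
```

Each stage corresponds to one utility in the hint: the list comprehension plays the role of egrep, sorted the role of sort, and the slice the role of head.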

Task 2: Smashing the Stack

For this task, you will work with a small C program that contains a stack overflow vulnerability. You will craft a malicious input that exploits the vulnerability to execute a payload that you will build. This consists of several steps:

After you complete this task, Task 3 will go through the use of Metasploit to automate the generation of exploits and payloads.

Runtime Organization Primer

On almost every modern computer system, programs are provided a common runtime environment consisting of a stack and a heap (and regions for executable code, libraries, file handles, etc.). When a user runs a program, the operating system allocates pages of memory for that program to use (e.g., to store variables). The stack is used by the program to pass parameters to functions and to allocate space for locally-scoped variables. The stack grows and shrinks as the program calls and returns from functions.

Runtime Stack and Activation Records

Whenever a program needs to call a function, it follows a calling convention, which is a set of rules the program must follow to ensure it can properly interface with the target function. This is especially important when importing library functions written by others: if your program does not follow the convention used by existing library code, the program cannot properly use the library's functions. Calling conventions are an important part of system integration, and often depend on the operating system, compiler, optimization levels, and architecture.

While many calling conventions exist, one of the most commonly-used is the C Calling Convention (sometimes called cdecl). In cdecl, a function call consists of several steps:

For an illustration, see the figure below.

Runtime stack overview
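As a rough model of how one activation record is built under cdecl on 32-bit x86, the sketch below pushes labeled entries onto a Python list, with the end of the list standing for the top of the stack. The function name call_cdecl and all addresses are hypothetical, and the model ignores details such as alignment and the return value.

```python
def call_cdecl(stack, args, return_address, frame_pointer, local_bytes):
    """Push one simplified cdecl activation record onto `stack`."""
    # The caller pushes the arguments from right to left.
    for arg in reversed(args):
        stack.append(("arg", arg))
    # The call instruction pushes the address to resume at after the return.
    stack.append(("return address", return_address))
    # The callee saves the caller's frame pointer, then reserves its locals.
    stack.append(("saved frame pointer", frame_pointer))
    stack.append(("locals", bytes(local_bytes)))
    return stack

# Hypothetical addresses, chosen only to look like 32-bit values.
frame = call_cdecl([], (1, 2, 3), 0x08048456, 0xBFFFF100, 16)
for label, value in frame:
    print(label, value)
```

Note that the locals sit right next to the saved frame pointer and return address, which is why the layout matters for the vulnerability discussed next.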

Stack Overflow Vulnerabilities

The runtime organization inherently requires mixing control flow with data: the address of the instruction to execute after a function returns (i.e., the return address) is saved on the stack. If the stack is controllable by a malicious user, they can smash the stack by overwriting the return address with a carefully-controlled address of their choosing.

Stack smashing vulnerabilities typically emerge when a program accepts user input (e.g., through a function like gets, which reads a line from standard input without checking its length). A program may allocate a fixed-size buffer in which to store the user's input. If the program does not carefully check or restrict the size of the input provided by the user, the buffer may be too small to store all the input. In the runtime environment described above, an input that is too large will extend beyond the end of the buffer, overwriting (or smashing) the return address saved on the stack.
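The effect can be simulated without touching real memory. The sketch below models a frame as a bytearray laid out as a 16-byte buffer, a 4-byte saved frame pointer, and a 4-byte return address, lowest address first. The sizes, addresses, and function names are all assumptions made for illustration; a real frame depends on the compiler and architecture.

```python
import struct

BUFFER_SIZE = 16  # hypothetical size of the local buffer

def make_frame(return_address):
    """Model a frame: buffer, saved frame pointer, return address."""
    return bytearray(BUFFER_SIZE) + struct.pack("<II", 0xBFFFF100, return_address)

def unchecked_copy(frame, data):
    """Copy `data` to the start of the buffer with no check against BUFFER_SIZE."""
    data = data[:len(frame)]  # stay inside the model; real memory has no such guard
    frame[:len(data)] = data

def saved_return_address(frame):
    """Read back the 4-byte little-endian return address stored after the buffer."""
    return struct.unpack_from("<I", frame, BUFFER_SIZE + 4)[0]

frame = make_frame(0x08048456)
unchecked_copy(frame, b"A" * 16)          # fits: return address untouched
print(hex(saved_return_address(frame)))   # 0x8048456
unchecked_copy(frame, b"A" * 24)          # too long: spills past the buffer
print(hex(saved_return_address(frame)))   # 0x41414141
```

The second copy overwrites the saved frame pointer and the return address with the byte 0x41 (the letter A), which is the kind of value commonly seen in a crash when a program is fed an over-long input.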

Example vulnerable code

Hijacking control

A key idea we have discussed so far is that an attacker can overwrite a stored return address, causing the program to begin executing instructions at a location specified by the attacker. Consider what that means in the example vulnerable code above: an attacker can place instructions on the stack, then overwrite the return address with the address of the stack itself, so that the attacker can cause the program to execute instructions they specify! A successful stack smashing attack results in the attacker executing instructions they provide.
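One way to picture such an input is as a fixed-size byte string with three regions: filler, a replacement return address, and the attacker's instructions. The sketch below assembles that layout for a 32-bit little-endian target. Every concrete value is a placeholder: the offset, the address, and the function name build_input are hypothetical, and the "instructions" are just breakpoint bytes (0xCC), not a working payload. Finding the real offset and address is the work of Task 2.

```python
import struct

def build_input(total_size, offset, new_return_address, instructions, filler=0x90):
    """Lay out filler, a 4-byte little-endian address at `offset`, and trailing instructions."""
    start_of_code = total_size - len(instructions)
    if offset + 4 > start_of_code:
        raise ValueError("address field would overlap the instructions")
    data = bytearray([filler]) * total_size
    data[offset:offset + 4] = struct.pack("<I", new_return_address)
    data[start_of_code:] = instructions
    return bytes(data)

# Placeholder values only; none of these come from the real target.
payload = build_input(512, 28, 0xBFFFF0C0, b"\xcc" * 8)
print(len(payload), payload[28:32].hex())  # 512 c0f0ffbf
```

The address bytes appear reversed in the output because x86 stores multi-byte integers least significant byte first.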

Notes and setup for Task 2

This task contains a package of starter code that you can download. Inside your Kali VM, download the following file: kjl.name/cs8395/hw1-baked.zip.

Turn-in for Task 2: Stack overflow proof of concept
  • As noted in the readme file, you must include a copy of your functioning exploit.py and badfile files in your final zip submission.
  • In your written report, describe your approach to determining the correct values that yielded a successful stack smashing attack. Include a screenshot of a successful run of the vulnerable program with your payload.
  • Note that exploit.py starts by creating a buffer of 512 bytes filled with value 0x90. What does 0x90 mean? Why do we use it when constructing stack smashing attacks? In addition, explain how your approach would have to change if you did not have access to the invoke.sh script.

Task 3: Metasploit

The Metasploit framework is a powerful suite of tools that can be used to automatically craft malicious payloads and exploits against a wide variety of existing software. For example, it is possible to use Metasploit to output a malicious input that can exploit well-known software in a few commands. Metasploit also contains a variety of useful payloads, including a TCP shell server; that is, you can use Metasploit to create a payload that, when executed, creates a server that an attacker can connect to, enabling remote access to a victim computer. In Task 3, you will revise the payload you were provided to create a remotely-accessible server within the context of the vulnerable program.

Metasploit is included with Kali out of the box. In a terminal, just type msfconsole to launch the Metasploit console. It takes a bit to load, but once finished, you can type help to list out the various features it supports.

Metasploit payloads

Metasploit has a number of payloads built in, targeting various use cases (e.g., dropping to a shell to run commands, opening a socket server, creating a remote desktop server, injecting arbitrary programs, and more for 32- and 64-bit Linux and Windows systems). To see a list of the available payloads, type show payloads in the Metasploit console.

For Task 3, we will use the payload/linux/x86/shell_bind_tcp payload. This payload, when executed, causes the vulnerable program to create a TCP server that allows an attacker to connect to the victim's computer remotely and control it. Doing so allows an attacker to exfiltrate valuable data from the victim system (by viewing file contents that would normally be inaccessible). To work with this payload, type use payload/linux/x86/shell_bind_tcp into the Metasploit console.

While I encourage you to work in the x86 environment, if you decide to or need to use aarch64 (e.g., Mac M1 or M2 silicon, or other ARM-based CPU hosts), then you may consider other payloads supported by Metasploit, such as payload/linux/aarch64/shell_reverse_tcp. (Thanks to Aadi Bajpai for completing the assignment for aarch64.)

Metasploit allows you to configure payloads for specific circumstances. For example, the shell_bind_tcp payload creates a shell server to which the attacker can connect. Under the hood, Metasploit lets you configure the TCP port that the payload binds to (so that the attacker knows which port to connect to), and (optionally) lets you specify the remote host from which a connection should come so that only the attacker can remotely access the host. You can type show options to see which options are configurable for the payload you selected. For the shell_bind_tcp payload, you need only specify the LPORT, which is 4444 by default.

You generate the payload by typing generate LPORT=4444 (i.e., you include whatever options you need to set). The resulting output is a string of binary instructions that represent the raw payload that must be placed on the stack during the attack.
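If the console prints the payload as text made of \xNN escapes (an assumption about the output format, which may differ between Metasploit versions and settings), the escapes need to be turned back into raw bytes before they can be written into an input file. A minimal sketch, using a hypothetical helper name and obviously fake placeholder bytes rather than a real payload:

```python
import re

def escaped_to_bytes(text):
    """Collect every \\xNN escape found in `text` into a bytes object, in order."""
    hex_pairs = re.findall(r"\\x([0-9a-fA-F]{2})", text)
    return bytes(int(pair, 16) for pair in hex_pairs)

# Placeholder text shaped like escaped output; these are not real payload bytes.
printed = r'buf = "\x01\x02" + "\x03\x04"'
print(escaped_to_bytes(printed))  # b'\x01\x02\x03\x04'
```

Anything that is not an escape (variable names, quotes, plus signs, line breaks) is ignored, so the surrounding text can be pasted in as-is.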

When this payload is executed by the vulnerable program, it opens a shell server that allows remote connection. You can connect to the shell that is created by using telnet localhost 4444 in a separate terminal in the Kali VM. In practice, attackers embed other properties like callbacks to help track which victim IP addresses are ready to connect to. For this assignment, you can contain everything within your Kali VM, so working with localhost or 127.0.0.1 is fine. Once connected via telnet, you can issue arbitrary commands. For this assignment, use cat /etc/passwd to display the accounts created on your Kali VM.
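The telnet step amounts to opening a TCP connection, sending a command line, and reading whatever comes back. The sketch below does the same with Python's standard socket module. The function name run_remote_command is hypothetical, and the timeout-based stop condition is an assumption: a shell server keeps the connection open after answering, so the client simply stops reading once no more data arrives within the timeout.

```python
import socket

def run_remote_command(host, port, command, timeout=2.0):
    """Send one command line over TCP and return everything received in reply."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(command.encode() + b"\n")
        try:
            while True:
                chunk = conn.recv(4096)
                if not chunk:        # peer closed the connection
                    break
                chunks.append(chunk)
        except socket.timeout:       # no more data within the timeout
            pass
    return b"".join(chunks).decode(errors="replace")

# Usage inside the Kali VM, once the payload is listening:
#   print(run_remote_command("127.0.0.1", 4444, "cat /etc/passwd"))
```

The turn-in below still asks for a screenshot of the telnet session, so treat this only as a way to understand what telnet is doing on the wire.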

Turn-in for Task 3: Metasploit

Generate a shell_bind_tcp payload using Metasploit in your Kali VM.

  • In your report, include a copy of the payload output from Metasploit (a screenshot will suffice).
  • Adapt this payload to work with the bof.c program provided in Task 2. Demonstrate that you can start a shell by exploiting the stack overflow vulnerability. Attach a copy of the input file in your submitted zip called badfile-shell.
  • Provide a screenshot showing two terminals: one in which you run the vulnerable program and execute your payload, and another in which you connect to the shell that is created using telnet. As indicated above, connect to the shell and run cat /etc/passwd to show the accounts on your Kali VM.

What to turn in for HW1

You must submit a single .zip file called vunetid.zip. While you can work with others in the class conceptually, please submit your own copy of the assignment. Your zip file must contain:

Use the submission system (VU login required) to submit your zip file.