Thursday, December 29, 2011

Binary Auditing part 2, Identifying For Loops

In this exercise, we are concentrating on identifying for loops. In assembly, they can look considerably different from while and do loops.

We start at the address 00401004, where the author of the program is setting up a variable to count how many times we've iterated through the loop:

mov [ebp+var_4], 0

The next instruction is a jump to 00401016, where we compare the value in var_4 with 0Ah (which is 10, written in hexadecimal and denoted as such by the 'h'). If var_4 has reached 10, the program jumps to 0040102C and leaves the loop; however, var_4 is currently 0, so the jump is not taken and we move on to the next instructions:

push offset Format
call ds:printf

The two instructions above push the address of Format (which holds the text "The for loop\n") onto the stack and then call printf. For a refresher on this, take a look at the previous exercise.

The program then adds 4 to the stack pointer (esp), which removes printf's argument from the stack, and jumps to location 0040100D. This is the increment step of the for loop, which runs at the end of every iteration. At this location, the program keeps track of how many times we've iterated through the loop by adding 1 to the counter, which is var_4:

mov eax, [ebp+var_4]
add eax, 1
mov [ebp+var_4], eax

In the next instruction, we are back at the comparison at the top of the for loop, which checks whether var_4 (our loop counter variable) has reached 10. At this point in the tutorial, it's equal to 1, so the for loop continues.


In C, this program looks like:

for (i = 0; i < 10; ++i)
  printf("The for loop\n");
return 0;
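
To make the jump layout easier to see, here is a rough sketch of the same loop written with gotos, so each C statement lines up with a step from the walkthrough. The function name, the label names and the printed counter are made up for illustration; they are not from the exercise or from IDA.

```c
#include <stdio.h>

/* Sketch only: mirrors the order of the jumps described above.
   The labels and the 'printed' counter are hypothetical names. */
int for_loop_as_jumps(void)
{
    int var_4;          /* the loop counter, [ebp+var_4] */
    int printed = 0;    /* hypothetical: how many times the body ran */

    var_4 = 0;          /* mov [ebp+var_4], 0 */
    goto check;         /* jump straight to the comparison */

increment:
    var_4 = var_4 + 1;  /* mov eax, [ebp+var_4] / add eax, 1 / mov [ebp+var_4], eax */

check:
    if (var_4 >= 10)    /* compare var_4 with 0Ah, leave once it reaches 10 */
        goto done;

    printf("The for loop\n");   /* push offset Format / call ds:printf */
    printed++;
    goto increment;     /* jump back to the increment step */

done:
    return printed;
}
```

Calling for_loop_as_jumps() prints the line ten times and returns 10. Notice the increment sits above the comparison in the code but is skipped on the first pass, which is the shape to look for when spotting a for loop in a disassembly.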

Wednesday, December 28, 2011

Binary-auditing.com tutorials, part 1

Take a look at http://www.binary-auditing.com/ . They've got some kick-ass exercises; they just lack walkthroughs at the moment. We started going through them, documenting what we thought they were trying to teach. We're not experts by any means, so feel free to comment with any corrections. Also, I want to make these tutorials better, so if something doesn't make sense, ask questions.

Their exercises say to start with the contents in the folder "003.02 - hll mapping". This post is about "assessment A01 - easy - Identifying while-do Loops .exe":


We see the program starts by setting up the stack, as previously covered in "binary-auditing-beginners-guide.pdf".

We also see var_4 being declared as a dword, and being the only variable declared.

In this tutorial, we're going to start using the program's address space to identify lines, instead of using line 1, line 2, etc. On the left side in IDA, you see a hexadecimal (base 16) value after ".text:". The .text section is part of a normal PE file. (Here is a good time for you to research the other sections of a standard PE file.) After IDA tells us which section of the PE file we're in, the hexadecimal address is a way of keeping track of where in the program you or the computer is. As covered earlier, the "int __cdecl main..." line is the declaration of the program's main function. To the left, you will see the address where main starts (00401000, which can also be written as 0x401000 or 401000h). The first real instruction occurs here, and that is "push ebp".
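
If the different ways of writing an address are confusing, this small sketch shows that they all come out to the same number. The function name parse_hex_address is made up for this example; it just leans on the standard strtoul call, which accepts an optional 0x prefix in base 16 and stops reading at the trailing 'h'.

```c
#include <stdlib.h>

/* Sketch only: parse_hex_address is a hypothetical helper.
   strtoul in base 16 skips an optional "0x" prefix and stops at the
   first character that is not a hex digit, such as a trailing 'h'. */
unsigned long parse_hex_address(const char *text)
{
    return strtoul(text, NULL, 16);
}
```

With this, parse_hex_address("00401000"), parse_hex_address("0x401000") and parse_hex_address("401000h") all return the same value, and parse_hex_address("0Ah") returns 10.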

Looking at the instruction at 0x401004 (mov [ebp+var_4], 0), this is where the programmer's own code starts. Here, the programmer moves the value 0 into the memory location occupied by var_4.

eax = 0
ecx = 0

ECX is commonly used as the "counting register". Most often you'll see it used in loops, such as below.

At 0040100B, you see IDA has labeled this location loc_40100B. IDA creates a label like this when it sees that an address is the target of a jmp. mov eax, [ebp+var_4] is the first instruction in the loop, so it shares the label's address. If you recall, var_4 was initialized to 0. Therefore, the next two lines set eax and ecx to 0:

mov eax, [ebp+var_4]
mov ecx, [ebp+var_4]

ecx is then incremented by 1, and that value is stored back into var_4's memory location:

add ecx, 1
mov [ebp+var_4], ecx

At 00401017, we see the comparison being made. A comparison like this is how a loop decides whether to run again or exit.

cmp eax, 0Ah

0A is hex, and the 'h' suffix marks it as hex. 0A is equivalent to 10 in base 10. In the following line, we see that if eax is greater than or equal to 10, we jump out of the loop to the next location (40102C).

At 0040101C, we have 'push offset Format'. Format is a string (an array of chars) located at 004020DC, in the PE file's .rdata section. The address of Format is pushed onto the stack, then we call the function printf. By pushing the address of Format onto the stack before calling printf, we are passing it as an argument to printf. Here printf is called with only one argument, so it only uses the last value pushed onto the stack. (example: http://www.cplusplus.com/reference/clibrary/cstdio/printf/)
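
As a rough C-level picture of that push and call, the sketch below passes printf a pointer to the string rather than the characters themselves. The function name is made up for this example, and the global Format here is only a stand-in for the string IDA shows in .rdata.

```c
#include <stdio.h>

/* Stand-in for the string IDA names Format; a constant like this
   typically ends up in a read-only data section. */
static const char Format[] = "The while loop";

/* Sketch only: call_printf_with_one_argument is a hypothetical name. */
int call_printf_with_one_argument(void)
{
    /* push offset Format: the argument is the address of the string */
    /* call printf: it returns the number of characters written */
    return printf(Format);
}
```

The call writes "The while loop" and returns 14, the length of the string. Only one value was handed over, the address of Format, which matches the single push we see before the call.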

** Exercise ** Find out more about the PE file and its sections. (A good reference: http://msdn.microsoft.com/en-us/magazine/cc301805.aspx)


In pseudocode, this program looks like:

while var_4 < 10 do
  var_4++
  printf("The while loop")
end
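
For comparison, here is a sketch in C that follows the disassembly instruction by instruction instead of the tidy pseudocode. The function name and the printed counter are made up for illustration, and eax and ecx are ordinary ints standing in for the registers.

```c
#include <stdio.h>

/* Sketch only: follows the order of the instructions in the walkthrough.
   The function name and 'printed' are hypothetical. */
int while_loop_as_disassembled(void)
{
    int var_4 = 0;      /* mov [ebp+var_4], 0 */
    int printed = 0;    /* hypothetical: how many times printf ran */
    int eax, ecx;       /* stand-ins for the registers */

    for (;;) {
        eax = var_4;        /* mov eax, [ebp+var_4]  (keeps the old value) */
        ecx = var_4;        /* mov ecx, [ebp+var_4] */
        ecx = ecx + 1;      /* add ecx, 1 */
        var_4 = ecx;        /* mov [ebp+var_4], ecx */

        if (eax >= 10)      /* cmp eax, 0Ah, then jump out to 40102C */
            break;

        printf("The while loop");   /* push offset Format / call printf */
        printed++;
    }
    return printed;
}
```

This returns 10, so the text is printed ten times just like the pseudocode. One small difference worth noticing: because var_4 is incremented before the old value in eax is compared, var_4 ends up at 11 here when the loop exits.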


People who are contributing to these tutorials: @jackcr, @vulp1n3, @vilkas64, @jsherenco, Brad Nottle, @bl4ck_0ut and myself (@jbc22).