Take a look at
http://www.binary-auditing.com/ . They've got some kick-ass exercises; the site just lacks walkthroughs at the moment. We started going through them, documenting what we thought they were trying to teach. We're not experts by any means, so feel free to comment with any corrections. Also, I want to make these tutorials better, so if something doesn't make sense, ask questions.
Their exercises say to start with the contents in the folder "003.02 - hll mapping". This post is about "assessment A01 - easy - Identifying while-do Loops .exe":
We see the program starts by setting up the stack, as previously covered in "binary-auditing-beginners-guide.pdf".
We also see var_4 declared as a dword. It is the only local variable, and IDA names it var_4 because it sits 4 bytes below ebp on the stack.
In this tutorial, we're going to start using the program's addresses to identify lines, instead of saying line 1, line 2, etc. On the left side in IDA, you
see a hexadecimal (base 16) value after ".text:". .text is one of the sections of a normal PE file. (This is a good time for you to research the other sections of a
standard PE file.) After the section name, the hexadecimal address keeps track of where in the program you, or the computer, are. As covered earlier,
the "int __cdecl main..." line is the declaration of the program's main function. To the left, you will see the address where main begins (00401000, which can also
be written as 0x401000 or 401000h). The first real instruction occurs here, and that is "push ebp".
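The three spellings of the address are the same number, just written in different styles. Here is a minimal sketch in C that reads all three. The helper name parse_ida_hex is made up for this example; it is not part of IDA or of any library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: turn "00401000", "0x401000" or "401000h"
   into a number. IDA itself does not provide this function. */
unsigned long parse_ida_hex(const char *text)
{
    char buf[32];
    size_t len = strlen(text);

    /* Keep the sketch simple: reject empty or overly long input. */
    assert(len > 0 && len < sizeof buf);
    memcpy(buf, text, len + 1);

    /* Drop the assembler-style 'h' suffix if there is one. */
    if (buf[len - 1] == 'h' || buf[len - 1] == 'H')
        buf[len - 1] = '\0';

    /* With base 16, strtoul accepts leading zeros and an optional "0x". */
    return strtoul(buf, NULL, 16);
}
```

All three spellings come back as the same value, 4198400 in decimal.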
The instruction at 0x401004 (mov [ebp+var_4], 0) is where the programmer's own code starts. Here, the value 0 is stored in the memory
occupied by var_4.
At 0040100B, you see IDA labels this address loc_40100B. IDA creates a label like this when it sees that a jmp elsewhere in the code targets the address. mov eax, [ebp+var_4]
is the first instruction in the loop, therefore it shares the loc's address. If you recall, var_4 was initialized to 0. Therefore, on the first pass through the loop,
the next two lines set eax and ecx to 0:
mov eax, [ebp+var_4]
mov ecx, [ebp+var_4]
eax = 0
ecx = 0
ECX is commonly known as the "counting register", and you'll often see it used in loops. Here the compiler simply uses it as a scratch register to hold the incremented value.
ecx is then incremented by 1, and the result is written back to the memory space of var_4. Note that eax still holds the old value:
add ecx, 1
mov [ebp+var_4], ecx
At 00401017, we see the comparison being made. Every loop with an exit condition needs one; only an infinite loop can do without it.
cmp eax, 0Ah
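0A is hex, and the 'h' suffix marks the number as hex. 0Ah is equivalent to 10 in base 10. In the following line, we see that if eax is greater than or equal to 10, we
jump to the next location (40102C), which is the exit of the loop. This matches the "var_4 < 10" condition in the pseudocode.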
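The order of operations matters here: eax gets the old value of var_4, var_4 is incremented, and only then is eax compared. Here is a rough sketch in C that mimics the registers with plain variables. The function name count_loop_passes is made up for this example, and it assumes the jump to 40102C is taken when eax is greater than or equal to 0Ah.

```c
#include <assert.h>
#include <stddef.h>

/* Mimics the loop head with C variables named after the registers.
   Returns how many times the loop body would run and stores the final
   value of var_4 in *final_var_4.
   Assumption: the conditional jump leaves the loop when eax >= 0Ah. */
int count_loop_passes(int *final_var_4)
{
    int var_4 = 0;          /* mov [ebp+var_4], 0 */
    int passes = 0;

    assert(final_var_4 != NULL);

    for (;;) {
        int eax = var_4;    /* mov eax, [ebp+var_4] */
        int ecx = var_4;    /* mov ecx, [ebp+var_4] */
        ecx = ecx + 1;      /* add ecx, 1           */
        var_4 = ecx;        /* mov [ebp+var_4], ecx */
        if (eax >= 0x0A)    /* cmp eax, 0Ah, then the jump to 40102C */
            break;
        passes++;           /* stands in for the printf call */
    }

    *final_var_4 = var_4;
    return passes;
}
```

Under that assumption the body runs 10 times, and var_4 ends up at 11, because the increment also happens on the final pass where the comparison fails.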
At 0040101C, we have 'push offset Format'. Format is a string (a char array) located at 004020DC, in the PE file's .rdata section. Its address is pushed onto the stack, then
we call the function printf. By pushing the address of Format onto the stack before calling printf, we are passing it as an argument to printf. printf takes a format string
plus any number of additional arguments; under cdecl, arguments are pushed from right to left, so the format string is always the last thing pushed. Here it is the only
argument, presumably because the string contains no conversion specifiers that would need extra values. (example: http://www.cplusplus.com/reference/clibrary/cstdio/printf/)
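To see the difference between a one-argument call and a call with extra arguments, here is a small sketch in C. The function names are made up for this example, and the text "The while loop" is assumed from the pseudocode in this post rather than read from the binary.

```c
#include <stdio.h>

/* One argument: only the address of the format string is passed,
   which corresponds to the single "push offset Format". */
int print_fixed_text(void)
{
    return printf("The while loop");
}

/* Two arguments: in a 32-bit cdecl build the compiler pushes them
   right to left, so 'pass' is pushed first and the format string last. */
int print_with_counter(int pass)
{
    return printf("pass %d\n", pass);
}
```

printf returns the number of characters it wrote, so both functions hand that count back to the caller. If you compile the second function as 32-bit and open it in IDA, you should see two pushes before the call instead of one.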
** Exercise ** Find out more about the PE file and its sections. (A good reference: http://msdn.microsoft.com/en-us/magazine/cc301805.aspx)
Putting it all together, the loop looks like this in pseudocode:
while var_4 < 10 do
    var_4++
    printf("The while loop")
end
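Here is one way the original C source might have looked. This is a guess worked backwards from the disassembly, not the exercise author's actual code; the variable name i and the function name run_loop are assumptions.

```c
#include <stdio.h>

/* Possible C source for the loop. The post-increment inside the
   condition matches the disassembly: the old value is compared
   against 10 while the variable itself has already been incremented. */
int run_loop(void)
{
    int i = 0;              /* var_4 */

    while (i++ < 10)
        printf("The while loop");

    return i;               /* value of var_4 after the loop ends */
}
```

The pseudocode increments inside the loop body instead. That prints the text the same number of times, but it would leave var_4 at 10, while this version leaves it at 11.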
People who are contributing to these tutorials: @jackcr, @vulp1n3, @vilkas64, @jsherenco, Brad Nottle, @bl4ck_0ut and myself (@jbc22).