If you look back at the code that I used, you will notice that it is very clean and needed minimal interpretation. By interpretation I mean translation between the AT&T and Intel assembler syntaxes. This is the explanation I promised earlier, when I mentioned that things looked odd and that I had flipped the operands. I am by no means an expert at explaining the differences between the two syntaxes, and I have worked with and know several people who could explain this better, but I will try my hand at it.
One of the quickest ways to tell whether code is AT&T or Intel is that in AT&T syntax registers are prefixed with '%' and immediate values with '$'. There are many, many other differences that you need to understand in order to translate between the two, and here are a few.
Source and destination for move and store
at&t = src, dest | intel = dest, src
Designation of code
at&t = .code16 or .code32 | intel = BITS 16 or BITS 32
Comments
at&t = /* */ | intel = ;
Operand size (a suffix on the mnemonic vs. a size keyword)
at&t = pushw | intel = push word
There are many other differences, such as directive syntax, far jumps, etc. Covering all of them is way beyond this blog, and there are extensive documents around, which I have listed in the references below.
The differences above explain why I did the flip and also why I chose that code. If you look back, there is little that you need to flip: in fact only 13 times, which is pretty low.
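To make the flip concrete, here is a minimal sketch that covers only the simplest case: one instruction per line, register or immediate operands, no memory references. The helper name att_flip is made up for this example, and it is not the full script shown later in this post.

```shell
#!/bin/sh
# att_flip: hypothetical helper, for illustration only.
# Step 1 strips the % and $ prefixes.
# Step 2 turns "op src,dest" into "op dest,src".
# Lines without a comma (single operand) pass through step 2 unchanged.
att_flip() {
    sed -e 's/[%$]//g' \
        -e 's/^\([a-z][a-z]*\)[ ]\{1,\}\([^,]*\),\(.*\)$/\1 \3,\2/'
}

printf '%s\n' 'mov %esp,%ecx' 'xor %ebx,%ebx' 'push $0x66' | att_flip
# prints:
#   mov ecx,esp
#   xor ebx,ebx
#   push 0x66
```

Anything with a memory operand, such as 0x8(%ebp), needs a lot more work than this, which is where most of the length of a real conversion script comes from.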
In the previous post I mentioned that one of the reasons for these posts was to show some of the issues that you may come across. Many, many tutorials on assembly mention the use of "nasm". As you can see, you will run into some issues if that is all that you are familiar with. Now let us take some additional steps, working with the same example: http://www.shell-storm.org/shellcode/files/shellcode-515.php . We now know that this is AT&T syntax, but what do we do with it? This is where the GNU Assembler (GAS) steps in. This assembler understands both Intel and AT&T syntax, so let's just assemble the code directly.
First we will remove the "<>" since these are not part of the assembly, and strip out everything else that is not code. Note that the '%' and '$' prefixes stay in place this time, because "as" needs them to read AT&T syntax:
# cat test.asm | cut -d '/' -f3 | sed 's/<//g' | sed 's/>//g' > 515.s
Now that we have the code in pure AT&T syntax, let's assemble it with "as" and get it working:
# as 515.s -o 515.o
# ld -o 515 515.o
# ./515
tcp 0 0 0.0.0.0:64713
OK great, now we know how to assemble AT&T syntax code and get a binary that we can execute. Some people are saying, "but I really, really like Intel syntax, how can I see the Intel syntax?" With a compiled binary this is simple with gdb or objdump.
With gdb, load it up into the debugger:
# gdb -q ./515
Set the flavor that you want displayed:
(gdb) set disassembly-flavor intel
Run it, then interrupt it and disassemble:
(gdb) run
Starting program: /root/blog/gas
^C
Program received signal SIGINT, Interrupt.
0x08048088 in _start ()
(gdb) disas
Dump of assembler code for function _start:
0x08048054 <+0>: push 0x66
0x08048056 <+2>: pop eax
0x08048057 <+3>: xor ebx,ebx
0x08048059 <+5>: push ebx
0x0804805a <+6>: inc ebx
0x0804805b <+7>: push ebx
0x0804805c <+8>: push 0x2
0x0804805e <+10>: mov ecx,esp
.
.
The second method, which is much easier, is the "-M intel" switch of "objdump":
# objdump -d -M intel ./gas
08048054 <_start>:
8048054: 6a 66 push 0x66
8048056: 58 pop eax
8048057: 31 db xor ebx,ebx
8048059: 53 push ebx
804805a: 43 inc ebx
804805b: 53 push ebx
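If you want the raw bytes rather than the disassembly, the middle column of the objdump output can be pulled out with a little awk. This is only a sketch: objdump_to_shellcode is a made-up helper name, and it assumes the usual tab-separated objdump columns (address, hex bytes, instruction). The sample input is copied from the listing above, so nothing needs to be installed to try it.

```shell
#!/bin/sh
# objdump_to_shellcode: hypothetical helper, for illustration only.
# Reads "objdump -d" text on stdin and prints the opcode bytes as one
# \x-escaped string. Assumes tab-separated columns.
objdump_to_shellcode() {
    awk -F'\t' '
        # instruction lines start with an address followed by a colon
        $1 ~ /^ *[0-9a-f]+:$/ && NF >= 2 {
            n = split($2, bytes, " ")
            for (i = 1; i <= n; i++)
                if (bytes[i] ~ /^[0-9a-f][0-9a-f]$/)
                    out = out "\\x" bytes[i]
        }
        END { print out }
    '
}

# Sample lines from the listing above, with the tabs written as \t.
printf ' 8048054:\t6a 66\tpush   0x66\n 8048056:\t58\tpop    eax\n 8048057:\t31 db\txor    ebx,ebx\n' |
    objdump_to_shellcode
# prints: \x6a\x66\x58\x31\xdb
```

Against a real binary the same function would sit at the end of a pipe, for example: objdump -d ./gas | objdump_to_shellcode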
Great, now we have two ways to display the Intel syntax of an AT&T-assembled binary. That brings us to the final piece that I wanted to talk about: code conversion without a compiled binary.
During my interwebz travels I came across an old sed script that does a minor translation from Intel to AT&T. This doesn't help us entirely, but it does bring something that I like, which is visibility. A lot of admins understand sed but may not understand C, Python or Perl, so I decided to do the translation in reverse with the same script.
You can find the original sed script that does the Intel to AT&T translation here: http://www.delorie.com/djgpp/mail-archives/djgpp/1995/06/06/05:48:34 . Below you will find a slimmed-down sed script that does the basics of what the original does, just in reverse. This is not the most optimal method, but I wanted to do a 1-1 comparison just for visibility. You can download it here as well: att2intel.sed
#
# @(#)as386.sed 1.1 - 86/11/17
# ^------This is from original----^
#
# rev: 03022012
#
# A typical way to use this sed script is:
#
# tr "[A-Z]" "[a-z]" <infile | sed -f this-script >outfile
#
##################################################
# spread everything out
##################################################
:cmpress
h
# this removes any comments in the file
s/;.*$//
# this squished some lines down
s/[ \t]*$/\t/
# add leading space on all lines
s/[ \t][ \t]*/\t/g
# this separates punctuation such as , and : from its neighbours with tabs
s/\([][)(,:*/+-]\)/\t\1\t/g
s/[\t]?[\t]/\t0\t/g
s/?/_/g
s/[\t][\t]*/\t/g
:equate
s/^\([^\t]*\)[\t]equ[\t]\(.*\)[\t]$/#define\t:\t\1\t\2\t/
####################################################
# This is where I remove % and $ from the symbols
####################################################
:registr
s/[ \t]%e\([abcd]\)x[ \t]/\te\1x\t/g
s/[ \t]%\([abcd]\)\([hlx]\)[ \t]/\t\1\2\t/g
s/[ \t]%e\([ds]\)i[ \t]/\te\1i\t/g
s/[ \t]%\([ds]\)i[ \t]/\t\1i\t/g
s/[ \t]%e\([bs]\)p[ \t]/\te\1p\t/g
s/[ \t]%\([bs]\)p[ \t]/\t\1p\t/g
s/[ \t]%\([cdefgs]\)s[ \t]/\t\1s\t/g
s/\$//g
s/%//g
s/<//g
s/>//g
####################################################
# word ptrs
####################################################
s/[ \t]\([abcd]\)\([hlx]\)[ \t],[ \t]0x\([^ ]*\)[ \t]([ \t]\([^ ]*\)[ \t])/\tbyte\t[\t\4\t+\t0x\3\t]\t,\t\1\2/g
s/[ \t]e\([abs]\)\([xip]\)[ \t],[ \t]0x\([^ ]*\)[ \t]([ \t]\([^ ]*\)[ \t])/\tdword\t[\t\4\t+\t0x\3\t]\t,\te\1\2/g
s/[ \t]0x\([^ ]*\)[\t]([ \t]e\([abs]\)\([xip]\)[ \t])[ \t]*,[ \t]\([^ ]*\)/\t\4,\t[\te\2\3\t+\t0x\1\t]/g
/-\t0x/ s/[\t]e\([abs]\)\([xip]\)[\t],[\t]-[\t]0x\([^ ]*\)[ \t]([ \t]\([^ ]*\)[ \t])/\tdword\t[\t\4\t-\t0x\3\t]\t,\te\1\2/g
####################################################
# collapsing
####################################################
s/[\t]\([)(,*/+-]\)/\1/g
s/\([)(,*/+-]\)[ \t]*/\1/g
s/\[[\t]/\[/g
s/[\t]\]/\]/g
s/[\t]:/:/g
####################################################
# This normalizes everything
####################################################
:normliz
/:\t/ !s/^\([^\t;#][^\t]*\)/\t\1/
/:\t/ !s/^[\t]\([^\t+,-]*\)\([+-]\)/\t\1\t\2/g
/:\t/ !s/^[\t]\([^\t+,]*\)+%/\t\1\t+/g
/:\t/ s/^\([^\t]*\)[\t]\([^\t+,-]*\)\([+-]\)/\1 \2\t\3/g
/:\t/ s/^\([^\t]*\)[\t]\([^\t+,]*\)+%/\1\t\2\t+/g
s/+%\([^\t,]*\)/(\1)/g
s/[\t]+/\t/g
s/\([:,]\)+/\1/g
#####################################################
# changed below, commented out what I don't know
#####################################################
:operand
/[.;#^+]/ !s/^ \([^\t]*\)[\t]\([^,]*\),\([^\t]*\)/\t\1\t\3,\2/
/[.;#]/ !s/^\([^\t][^\t]*\)[\t]\([^\t]*\)[\t]\([^,]*\),\([^\t]*\)/\1\t\2\t\4,\3/
######################################################
# I changed this to remove the size suffixes (l, b, w)
#####################################################
:opcode
s/movl/mov/g
s/movb/mov/g
s/leal/lea/g
s/xorl/xor/g
s/popl/pop/g
s/pushl/push/g
s/pushw/push/g
s/incl/inc/g
s/xchgl/xchg/g
s/addl/add/g
s/cmpb/cmp/g
######################################################
# this shoves everything back
######################################################
:comment
s/[\t]*$//
# this shoves the whole thing back to the left
x
# this removed everything and left ;
/;/ !s/^.*$//
# this added another ;
s/\([\t]*\);/;\1;/
# this removed the ;
s/^[^;]*;//
# everything is back
x
# this appends the saved comment on a new line
G
# then puts it back together by removing the newline
s/\n//
# replaces the ; at the end with a /
s/;/\//
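The opcode stage of the script lists one substitution per mnemonic. As a side sketch, and not what the script above does, the same idea can be written as a single expression that only touches the mnemonic at the start of the line. It assumes a sed that supports -E (GNU and BSD sed both do); strip_suffix is a made-up name and the mnemonic list simply mirrors the one in the script.

```shell
#!/bin/sh
# strip_suffix: hypothetical helper, for illustration only.
# Drops a trailing l, b or w from a known mnemonic at the start of a line
# and leaves everything else on the line alone.
strip_suffix() {
    sed -E 's/^([[:space:]]*)(mov|lea|xor|pop|push|inc|xchg|add|cmp)[lbw]([[:space:]]|$)/\1\2\3/'
}

printf '%s\n' 'movl %esp,%ecx' 'pushw $0x632d' 'push %edx' | strip_suffix
# prints:
#   mov %esp,%ecx
#   push $0x632d
#   push %edx
```

Because the match is anchored to the mnemonic, a label or symbol that happens to contain something like "movl" is not rewritten.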
Now that we can see that it works, let's give it a shot on another piece of code. I will keep it simple, since I am sure the script is missing a lot, so we will use the code at http://www.shell-storm.org/shellcode/files/shellcode-632.php , a ping shellcode. We will run it through the sed script, compile it and see what happens, but first we need to get the file into something workable. Like in the first post, we need to strip the other junk out of the code. First let's copy just the assembly portion to a file named 632.asm, then we will strip it down to what we need.
# cat 632.asm | cut -d '/' -f3 > 632a.asm
This will leave us with a file named 632a.asm that will look like this:
push $0xb
pop %eax
cltd
push %edx
push $0x20207473
push $0x6f686c61
push $0x636f6c20
push $0x676e6970
mov %esp,%esi
push %edx
pushw $0x632d
mov %esp,%ecx
push %edx
push $0x68732f2f
push $0x6e69622f
mov %esp,%ebx
push %edx
push %esi
push %ecx
push %ebx
mov %esp,%ecx
int $0x80
Now we have it in a working format that our script will be able to work with.
# tr "[A-Z]" "[a-z]" < 632a.asm | sed -f att2intel.sed > 632i.asm
# nasm -f elf 632i.asm
# ld -o 632i 632i.o
# ./632i
PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.195 ms
64 bytes from localhost (127.0.0.1): icmp_seq=2 ttl=64 time=0.034 ms
We have Intel code execution from converted AT&T syntax. I realize there are several limitations to the sed script, but I just wanted to do a little 1-1 comparison, and I will update the above script as I make changes. For example, you will see that I used clear tabs in some places and [ \t] in others; I will convert these to \t so that the script survives copy and paste.
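As an aside on the 632a.asm listing: the large push immediates are just ASCII text, stored little-endian and pushed back to front. The sketch below decodes them so you can see what the shellcode is going to run. decode_pushes is a made-up helper name, and it only handles full 32-bit immediates.

```shell
#!/bin/sh
# decode_pushes: hypothetical helper, for illustration only.
# Takes 32-bit immediates in the order they are pushed and prints the
# string they leave on the stack.
decode_pushes() {
    result=""
    for imm in "$@"; do
        v=$(($imm))
        chunk=""
        # x86 is little-endian: the low byte sits at the lowest address
        for s in 0 8 16 24; do
            byte=$(( (v >> s) & 0xff ))
            chunk="$chunk$(printf "\\$(printf '%03o' "$byte")")"
        done
        # each later push lands at a lower address, so it goes in front
        result="$chunk$result"
    done
    printf '%s\n' "$result"
}

decode_pushes 0x20207473 0x6f686c61 0x636f6c20 0x676e6970
# prints: ping localhost  (followed by two padding spaces)
decode_pushes 0x68732f2f 0x6e69622f
# prints: /bin//sh
```

The doubled slash in /bin//sh and the trailing spaces after localhost are padding, so that each string fills a whole number of 4-byte pushes.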
Now that we have the scripts to convert back and forth, we can really start to have fun. In our arsenal we now have the ability to:
-Convert pure shellcode to a local binary for testing
-Retrieve the shellcode from a binary
-Convert AT&T code to Intel for nasm to get the shellcode
-Compile AT&T syntax code
I believe this gives a great start to testing and using shellcode or assembly that you find on the interwebz when you are in a pinch, or when you find some really cool code that you might want to use but that automated tools cannot produce shellcode for.
References and further reading:
http://wiki.osdev.org/Opcode_syntax
http://asm.sourceforge.net/articles/linasm.html (from Tyler's Comment)
http://www.ibm.com/developerworks/linux/library/l-gas-nasm/index.html
http://banisterfiend.wordpress.com/2008/08/17/att-vs-intel-assembly-syntax/
http://www.delorie.com/djgpp/mail-archives/djgpp/1995/06/06/05:48:34 (original sed)
http://asm.sourceforge.net/intro/hello.html
Again a very good post! Thank you very much...
Some questions...
Why does the ping shellcode give a segmentation fault when I compile it in Linux? I also compiled it in Ubuntu, on which the author of the shellcode tested it. Shouldn't it at least work in Ubuntu?
Are the opcodes the same between AT&T and Intel syntax?
If we have only a hex shellcode, how can we deal with it?
Thanks and keep up the good work!
Thanks Chris for the comment,
I changed a few things in the post to help make sure you are on the right track with getting this working, in particular what the file should look like before you compile.
Also, the exact system that I am using is the 32-bit version of backtrack5r1, if you want to replicate the same environment.
As a final note, there is a difference between nasm versions that I should have explained in the previous post, regarding "push immediate word/byte": a push 0x66 assembles to opcode 68 in version 2.07, while in version 2.09.04 it assembles to 6a. So after you have compiled the program, make sure you check your work with objdump or a hex editor to see if the shellcode matches up. Good luck
I also removed all the clear tabs in the sed script, just in case copy and paste does not translate them, and added a revision so you can see when I updated the script. I will make this available as a download also.