Basics of Assembly – Part 1

Indeed: the basics!! Before I start out with something really technical, I thought I would clear up all the basics which are needed for anyone who is new to reversing. What I am going to teach here is far from complete, but it will cover almost everything you will need later on. To begin with, assembly is the start and the end of all programming languages & that’s not an exaggeration. To my knowledge, every program ends up as binary machine code sooner or later & that machine code can be disassembled into assembly. You might have some programming experience in high-level languages like C/C++, Java or .NET, which have relatively clear syntax, but when it comes to assembly (& LISP.. I will come to it some time later) it’s a different ball game altogether. Assembly is the world of mnemonics, numbers & abbreviations, and that’s where it all turns sour for many of us... But trust me, this is a basic & simple guide to assembly & you will be able to grasp the basics quickly.

PS: all the values which we will be talking about from now on will be in hexadecimal... unless specified :P I will be covering bits, bytes & registers this time.

I. Starting with Bits and bytes:

BIT - The smallest possible piece of data in computing. It can be either a 0 or a 1. Put a bunch of bits together & tada.. you have a number in the 'binary number system'.

For example:
00000001 = 1
00000010 = 2
00000011 = 3
etc.



BYTE – A byte has 8 bits & can have a maximum value of 255 (range 0-255, or 00h-0FFh). To make binary numbers easier to read we use the 'hexadecimal number system', which is a 'base-16 system', while binary is a 'base-2 system'. One hexadecimal digit stands for exactly 4 bits, so a byte is always two hexadecimal digits.
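The relationship between binary, decimal and hexadecimal is easy to check for yourself. The snippet below is only a sketch that uses Python as a calculator (Python is my choice here, it has nothing to do with assembly itself), and the variable names are made up for illustration:

```python
# The binary examples from above, written as Python binary literals.
examples = [0b00000001, 0b00000010, 0b00000011]
decimal_values = [int(v) for v in examples]    # their decimal values

# The largest value a byte (8 bits) can hold.
BYTE_MAX = 2**8 - 1

# The same value written in base 2 and in base 16.
byte_max_binary = format(BYTE_MAX, "08b")      # 8 binary digits
byte_max_hex = format(BYTE_MAX, "02X")         # 2 hexadecimal digits

print(decimal_values, BYTE_MAX, byte_max_binary, byte_max_hex)
```

Eight binary digits collapse into two hexadecimal digits, which is why hex is so much easier on the eyes.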




  • WORD – A word = 2 bytes put together, or 16 bits, & can have a maximum value of 0FFFFh (or 65535d).

  • DOUBLE WORD – A double word = 2 words together, or 32 bits, & can have a maximum value of 0FFFFFFFFh (or 4294967295d).

  • KILOBYTE – 1000 bytes?! Nah, it’s actually 1024 bytes.

  • MEGABYTE – Again, not just 1 million bytes, but 1024*1024 or 1,048,576 bytes or 1024 KB.
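The numbers in this list all follow from powers of two. A quick sketch in Python (again just used as a calculator; the constant names are my own) reproduces them:

```python
# Maximum unsigned values: all bits set to 1.
WORD_MAX = 2**16 - 1     # 16 bits
DWORD_MAX = 2**32 - 1    # 32 bits

# Sizes as used in this article (powers of two, not powers of ten).
KILOBYTE = 1024
MEGABYTE = 1024 * KILOBYTE

print(f"WORD max  = {WORD_MAX:X}h = {WORD_MAX}d")
print(f"DWORD max = {DWORD_MAX:X}h = {DWORD_MAX}d")
print(f"1 KB = {KILOBYTE} bytes, 1 MB = {MEGABYTE} bytes")
```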



II. A case of Registers:



Registers can be viewed as placeholders where we can put something; in simpler terms these are “special places” inside the CPU itself (not in your computer's main memory) where we can store data. View a register as a little box where we can put something small: a number, an address, a few characters. Fact: today’s WinTel (Windows + Intel) 32-bit CPUs have 9 basic registers of 32 bits each.



Their names are:




EAX: Extended Accumulator Register    EBX: Extended Base Register
ECX: Extended Counter Register        EDX: Extended Data Register
EDI: Extended Destination Index       ESI: Extended Source Index
EBP: Extended Base Pointer            ESP: Extended Stack Pointer
EIP: Extended Instruction Pointer



Generally the size of the registers is 32 bits (= 4 bytes) & they can hold data from 0 to FFFFFFFF (unsigned). In earlier days registers did what their names meant... like ECX = counter, but nowadays you can use almost any register you like as a counter or whatever (except with the counter instructions, which I will be covering as we progress). There's one more thing you have to know about registers: although they are all 32 bits large, only some parts of them (16-bit or even 8-bit) have a name of their own & can be addressed directly.


The possibilities are:




32bit Register    16bit    8bit
EAX               AX       AH/AL
EBX               BX       BH/BL
ECX               CX       CH/CL
EDX               DX       DH/DL
ESI               SI       -----
EDI               DI       -----
EBP               BP       -----
ESP               SP       -----
EIP               IP       -----



To understand the above table let's take a made-up value (my birthdate) as an example & store it in a register:

30 Oct 1989

This can be written as: 30101989

Reading those digits as a hexadecimal number (nothing is converted, we simply treat them as hex digits): 0x30101989

EAX = 30101989




Now EAX (Extended AX) is 32 bits wide, so it can hold 30101989. You can look at it as two 16-bit halves, but only the low half has a name, AX. The high half (3010 here) cannot be addressed on its own:

High word    AX
3010         1989

AX in turn is made of AH (Accumulator High) & AL (Accumulator Low), which are 8 bits each. With AX = 1989:

AH     AL
19     89



So we can say EAX is the name of the 32-bit register, AX is the name of the "Low Word" (16-bit) of EAX and AL/AH (8-bit) are the “names” of the "Low Part" and “High Part” of AX. By the way, in case you have forgotten: 4 bytes is 1 DWORD, 2 bytes is 1 WORD.
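If you want to see how AX, AH and AL are carved out of EAX, you can model the register as a plain 32-bit number and use bit masks and shifts. This is only a sketch in Python; the function name split_register is made up for illustration and is not a real CPU or library call:

```python
def split_register(eax):
    """Return (AX, AH, AL) for a 32-bit value, modelling EAX with bit masks."""
    eax &= 0xFFFFFFFF        # keep only 32 bits, like the real register
    ax = eax & 0xFFFF        # AX = low word (low 16 bits) of EAX
    ah = (ax >> 8) & 0xFF    # AH = high byte of AX
    al = ax & 0xFF           # AL = low byte of AX
    return ax, ah, al

ax, ah, al = split_register(0x30101989)
print(f"AX={ax:04X} AH={ah:02X} AL={al:02X}")   # prints AX=1989 AH=19 AL=89
```

Note that the high word of EAX (3010 in the birthdate example) never shows up in the result, because it has no register name of its own.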



Please note: make sure you at least read the following about registers. It’s quite practical to know, although not that important. Also, the coming section may be a bit cryptic, but it will be explained in the follow-up tutorial.

Since you are clear with the above, I guess we can make a distinction regarding size:



Byte-size registers: As the name says, these registers are all exactly 1 byte (8 bits) in size. Obviously that doesn’t mean that the whole (32-bit) register is fully loaded with data! Empty spaces in a register are just filled with zeroes.




AL and AH    BL and BH
CL and CH    DL and DH
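To picture what "filled with zeroes" means, here is a small sketch in Python that models EAX as a 32-bit number and writes a byte into AL. The helper name set_al is hypothetical; it only mimics the fact that writing AL touches the low byte of EAX and nothing else:

```python
def set_al(eax, value):
    """Model writing an 8-bit value into AL: only the low byte of EAX changes."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)

# Start with an empty register and put one byte in AL.
eax = set_al(0x00000000, 0x89)
print(format(eax, "08X"))     # prints 00000089, the unused part is zeroes

# Writing AL in a register that already holds data keeps the other 24 bits.
eax2 = set_al(0x30101989, 0xFF)
print(format(eax2, "08X"))    # prints 301019FF
```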



Word-size registers: These are 1 word (= 2 bytes = 16 bits) in size. A word-sized register is constructed of 2 byte-sized registers. Again, we can group them by their purpose:




  • General purpose registers:




AX - 'accumulator': used for mathematical operations & by string instructions
BX - 'base': used as a base address when pointing at data in memory
CX - 'counter': used to count, e.g. how many times a loop or string operation repeats
DX - 'data': used together with AX in multiplication & division; e.g. the remainder of a division is stored here
DI - 'destination index': i.e. a string will be copied to the address in DI
SI - 'source index': i.e. a string will be copied from the address in SI
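To see how the counter and the two index registers work together, here is a rough Python model of a string copy. It imitates what the x86 instruction `rep movsb` does with the direction flag clear; picking that instruction is my own choice for illustration, and the function and variable names are made up:

```python
def rep_movsb(memory, si, di, cx):
    """Copy CX bytes from memory[SI] to memory[DI], one byte per step."""
    while cx != 0:
        memory[di] = memory[si]   # copy one byte from source to destination
        si += 1                   # source index moves forward
        di += 1                   # destination index moves forward
        cx -= 1                   # counter runs down to zero
    return si, di, cx

# 10 bytes of "memory": a 5-byte string followed by 5 empty bytes.
memory = bytearray(b"HELLO" + bytes(5))
si, di, cx = rep_movsb(memory, si=0, di=5, cx=5)
print(memory)                     # prints bytearray(b'HELLOHELLO')
```

On a real CPU the segment registers are involved as well, which this sketch leaves out on purpose.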




  • Pointer registers:




BP - 'base pointer': points to a fixed position on the stack, usually the base of the current function's stack frame
SP - 'stack pointer': points to the current top of the stack




  • Segment registers:




CS - 'code segment': the instructions an application has to execute
DS - 'data segment': the data your application needs
ES - 'extra segment': points to the active extra segment
SS - 'stack segment': here we'll find the stack




  • Special: IP - 'instruction pointer': points to the next instruction to be executed. Just leave it alone.



Double-word size registers: If you find an 'E' in front of a 16-bit register, it means that you are dealing with a 32-bit register. So, AX = 16 bits; EAX = the 32-bit version of AX.



I believe I have covered registers this time; I will be covering the stack & instructions in my next article. Stay tuned & keep reversing.






Posted by XERO. ALL RIGHTS RESERVED.