Reverse Engineering Malware, Part 2: Assembler Language Basics
Most of the work we will be doing in reverse engineering will be with assembler language. This simple and sometimes tedious language can reveal a plethora of information about the source code. When we can't see or recover the source code of the malware or other software, we can use tools such as disassemblers and debuggers to recover the underlying assembler of the software. From there, we can decipher what the software was attempting to do.
In this tutorial, I will simply list the most basic and fundamental assembler instructions. I suspect most of you will use it as a reference as we progress through this study, so make certain to bookmark this page so that you can easily come back to it.
Pieces
Let's begin with some very basic concepts. Hopefully, this is all review for you, but if not, you need to understand these basic concepts before proceeding in this course of study.
Bit- a bit is the smallest piece of data. It can be 0 or 1 (OFF or ON).
Byte- a byte is 8 bits. It has a range of equivalent decimal values of 0 to 255.
Word- a word is two bytes together, or 16 bits.
Double Word- a double word is two words, or 32 bits.
Kilobyte- a kilobyte is 1024 (32 x 32) bytes.
Megabyte- a megabyte is 1,048,576 bytes (1024 x 1024).
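To make these sizes concrete, here is a minimal Python sketch that derives them from one another. The constant and function names are my own, chosen only for illustration.

```python
# Sizes of the basic units described above.
BITS_PER_BYTE = 8
WORD_BITS = 2 * BITS_PER_BYTE      # a word is two bytes
DWORD_BITS = 2 * WORD_BITS         # a double word is two words
KILOBYTE = 1024                    # bytes
MEGABYTE = 1024 * KILOBYTE         # bytes


def max_unsigned(bits):
    """Largest unsigned value that fits in the given number of bits."""
    return (1 << bits) - 1


print(max_unsigned(BITS_PER_BYTE))    # 255
print(hex(max_unsigned(WORD_BITS)))   # 0xffff
print(hex(max_unsigned(DWORD_BITS)))  # 0xffffffff
print(MEGABYTE)                       # 1048576
```

The value 0xffffffff is worth remembering, since it is the largest number a 32-bit register can hold.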
Registers
Registers are small, fast storage locations inside the CPU itself (not in main memory) where data is held while it is being worked on. When working in assembler, we are usually using these registers to move and manipulate information, so you should be familiar with them.
These registers are:
EAX- Extended Accumulator Register
EBX- Extended Base Register
ECX- Extended Counter Register
EDX- Extended Data Register
ESI- Extended Source Index
EDI- Extended Destination Index
EBP- Extended Base Pointer
ESP- Extended Stack Pointer
EIP- Extended Instruction Pointer
Flags
Flags are single bits that indicate the status of the CPU after an operation. The flag register (EFLAGS) on 32-bit CPUs is 32 bits long, although not every bit is a defined flag. In our studies here, we will only need three of them: the Z flag, the O flag and the C flag.
A flag can only be SET or NOT SET.
Z-Flag
The Z-flag (zero flag) is the most useful flag for cracking. It is used in about 90% of all cases. Several opcodes set it when the result of the last instruction was 0 and clear it otherwise.
O-Flag
The O-flag (overflow flag) is used in about 4% of all cracking attempts. It is set when a signed operation produces a result that does not fit in the destination, so that the highest (sign) bit of the result comes out wrong.
C-Flag
The C-flag (carry flag) is used in about 1% of all cracking attempts. It is set if you add a value to a register so that the result gets bigger than FFFFFFFF, or if you subtract a value so that the result would be less than zero.
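The three flags are easier to follow with numbers. The following is a rough Python model of a 32-bit ADD and SUB that reports the Z, O and C flags. The function names add32 and sub32 are hypothetical, and a real CPU updates more flags than these three.

```python
MASK32 = 0xFFFFFFFF   # largest 32-bit value
SIGN32 = 0x80000000   # the highest (sign) bit


def add32(a, b):
    """Model a 32-bit ADD and return (result, flags)."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    flags = {
        "Z": result == 0,        # result is zero
        "C": full > MASK32,      # unsigned result went past FFFFFFFF
        # signed overflow: both inputs share a sign the result lacks
        "O": ((a ^ result) & (b ^ result) & SIGN32) != 0,
    }
    return result, flags


def sub32(a, b):
    """Model a 32-bit SUB (a - b) and return (result, flags)."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    flags = {
        "Z": result == 0,
        "C": a < b,              # a borrow was needed
        # signed overflow: inputs differ in sign and the result's sign
        # differs from the first operand's
        "O": ((a ^ b) & (a ^ result) & SIGN32) != 0,
    }
    return result, flags


print(add32(0xFFFFFFFF, 1))   # wraps to 0: Z and C set, O clear
print(add32(0x7FFFFFFF, 1))   # largest positive + 1: only O set
print(sub32(5, 5))            # equal values: only Z set
print(sub32(0, 1))            # goes below zero: only C set
```

Note how FFFFFFFF + 1 sets the carry flag but not the overflow flag: as a signed number FFFFFFFF is -1, and -1 + 1 = 0 is a perfectly valid signed result.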
Stack
The stack is a part of memory where you can store different things for later use. Think of a stack of books on a desk, where the last book placed on top (last in) is the first to leave (first out), or LIFO.
The command PUSH saves a value, such as the contents of a register, on the stack. The command POP takes the most recently saved value off the stack and places it into a specific register.
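As a sketch of the LIFO behaviour, the small Python class below models a stack of 32-bit values together with the ESP register. The class name Stack32 and the starting address 0x1000 are assumptions made for this example only.

```python
class Stack32:
    """Toy model of an x86 stack holding 32-bit (4-byte) values."""

    def __init__(self, top=0x1000):
        self.esp = top      # stack pointer; the stack grows downward
        self.memory = {}    # address -> stored 32-bit value

    def push(self, value):
        self.esp -= 4                            # make room first
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory.pop(self.esp)        # read the top of the stack
        self.esp += 4                            # then release the slot
        return value


stack = Stack32()
stack.push(0x11111111)      # e.g. PUSH EAX
stack.push(0x22222222)      # e.g. PUSH EBX
print(hex(stack.esp))       # 0xff8 (two dwords below 0x1000)
print(hex(stack.pop()))     # 0x22222222 (last in, first out)
print(hex(stack.pop()))     # 0x11111111
print(hex(stack.esp))       # 0x1000
```

Because the stack grows toward lower addresses, ESP gets smaller on every PUSH and larger on every POP.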
Instructions
Assembler language has a small number of fundamental commands. These include:
ADD- The ADD instruction adds a value to a register or memory address.
Syntax:
ADD destination, source
AND- the AND instruction performs a logical AND on two values and stores the result in the destination.
Syntax:
AND destination, source
CALL- the CALL instruction pushes the address of the instruction that follows it (the return address) onto the stack and jumps to a subprogram or sub-procedure.
Syntax:
CALL something
CDQ- Convert DWORD to QWORD (ConvertDtoQ). It sign-extends EAX into the register pair EDX:EAX, usually just before an IDIV.
Syntax:
CDQ
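CDQ takes no operands because it always works on EAX and EDX: it copies the sign bit of EAX into every bit of EDX. A minimal Python sketch of that behaviour follows, with cdq as a hypothetical helper name.

```python
def cdq(eax):
    """Sign-extend EAX into EDX:EAX and return (edx, eax)."""
    eax &= 0xFFFFFFFF
    # EDX becomes all ones if EAX is negative (sign bit set), else zero
    edx = 0xFFFFFFFF if eax & 0x80000000 else 0
    return edx, eax


print([hex(v) for v in cdq(5)])            # ['0x0', '0x5']
print([hex(v) for v in cdq(0xFFFFFFFB)])   # ['0xffffffff', '0xfffffffb']
```

In the second call EAX holds -5, so EDX is filled with ones and EDX:EAX still reads as -5 when treated as a 64-bit signed number.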
CMP- compare
the CMP instruction compares two values by subtracting the source from the destination without storing the result. It sets the C/O/Z flags according to the outcome.
Syntax:
CMP destination, source
DEC- decrement
the decrement command decreases a value by one (value = value - 1)
Syntax:
DEC something
DIV- division
the DIV command performs unsigned division. For a 32-bit divisor, the dividend is the 64-bit value in EDX:EAX; the quotient is stored in EAX and the remainder (modulus) is stored in EDX.
Syntax:
DIV divisor
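Since DIV names only the divisor, it helps to see where the other values live. Here is a rough Python model of the 32-bit form, assuming the 64-bit dividend sits in EDX:EAX. The name div32 is hypothetical, and the Python exceptions stand in for the divide error the CPU would raise.

```python
MASK32 = 0xFFFFFFFF


def div32(edx, eax, divisor):
    """Model unsigned DIV with a 32-bit divisor.

    Returns (quotient, remainder), i.e. the new (EAX, EDX).
    """
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    divisor &= MASK32
    if divisor == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK32:
        raise OverflowError("divide error: quotient does not fit in EAX")
    return quotient, remainder


print(div32(0, 17, 5))                        # (3, 2)
print([hex(v) for v in div32(1, 0, 2)])       # ['0x80000000', '0x0']
```

This is why compiled code usually clears EDX (or runs CDQ before a signed IDIV) ahead of a division: whatever is left in EDX becomes the upper half of the dividend.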
IDIV- integer division. Signed division, using the same registers as DIV. The C/O/Z flags are left undefined afterwards, so do not rely on them.
Syntax:
IDIV divisor
IMUL- integer multiplication
Syntax:
IMUL value
IMUL dest, value, value
IMUL dest, value
INC- increment, opposite of DEC instruction (value = value +1)
Syntax:
INC register
INT- the INT command generates a call to an interrupt handler
JUMPS- there are a variety of jumps, but the most common and important jumps are:
JE - jump if equal
JG - jump if greater
JGE - jump if greater or equal
JL - jump if less
JLE - jump if less or equal
JMP - jump always
JNE - jump if not equal
JNZ - jump if not zero
JZ - jump if zero
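Conditional jumps do not compare anything themselves; they only look at the flags left behind by an earlier instruction, typically CMP. The Python sketch below models that pairing. The names cmp32 and jump_taken are hypothetical, and the model also tracks the S (sign) flag, which is not covered above but is what the signed jumps (JG, JGE, JL, JLE) consult together with the O flag.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def cmp32(a, b):
    """Model CMP a, b: compute a - b, keep only the flags."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "Z": result == 0,
        "C": a < b,
        "S": (result & SIGN32) != 0,   # sign bit of the result
        "O": ((a ^ b) & (a ^ result) & SIGN32) != 0,
    }


def jump_taken(mnemonic, flags):
    """Return True if the given jump would be taken for these flags."""
    z, s, o = flags["Z"], flags["S"], flags["O"]
    conditions = {
        "JE": z, "JZ": z,              # two names, same test
        "JNE": not z, "JNZ": not z,
        "JG": (not z) and s == o,
        "JGE": s == o,
        "JL": s != o,
        "JLE": z or s != o,
        "JMP": True,                   # unconditional
    }
    return conditions[mnemonic]


flags = cmp32(3, 5)
print(jump_taken("JL", flags))    # True
print(jump_taken("JE", flags))    # False

flags = cmp32(7, 7)
print(jump_taken("JZ", flags))    # True
print(jump_taken("JG", flags))    # False
```

JE and JZ (and likewise JNE and JNZ) test exactly the same condition, which is why a disassembler may show either name for the same opcode.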
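LEA- load effective address. It computes the address of the source operand and stores that address (not the data found there) in the destination.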
Syntax:
LEA destination, source
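An x86 memory operand has the general form base + index * scale + displacement, and LEA simply evaluates that expression without touching memory. Here is a minimal Python sketch of the calculation; the function name lea and its keyword arguments are my own labels for the parts of the operand.

```python
def lea(base=0, index=0, scale=1, disp=0):
    """Compute base + index*scale + disp as a 32-bit address."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & 0xFFFFFFFF


# Roughly LEA EAX, [EBX + ECX*4 + 8] with EBX = 0x1000 and ECX = 3
print(hex(lea(base=0x1000, index=3, scale=4, disp=8)))   # 0x1014

# Roughly LEA EAX, [EAX + EAX*4] with EAX = 5: a quick multiply by 5
print(lea(base=5, index=5, scale=4))                     # 25
```

The second call shows why LEA turns up so often in compiled code that has nothing to do with addresses: it is a cheap way to do small multiplications and additions in one instruction.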
MOV- move copies the value from the source to the destination
Syntax:
MOV destination, source
MUL- multiply. It works like the one-operand form of IMUL, but treats the values as unsigned.
Syntax:
MUL value
NOP- no operation. It does nothing.
Syntax:
NOP
OR- logical inclusive OR
Syntax:
OR destination, source
POP- the POP instruction loads the word/dword that ESP points to (the top of the stack) into the destination, then increases ESP by the size of that value.
Syntax:
POP destination
PUSH- the PUSH instruction decreases ESP by the size of the operand and stores the value at that new address, so that ESP points to the value that was PUSHed.
Syntax:
PUSH operand
REP- repeats the following string instruction, using ECX as the counter. Common variants are REPE (repeat while equal), REPZ (repeat while zero), REPNE (repeat while not equal) and REPNZ (repeat while not zero).
Syntax:
REP ins
Where ins is a string operation
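As an illustration, the Python sketch below models one common pairing, REP MOVSB, which copies ECX bytes from the address in ESI to the address in EDI. The function name rep_movsb is hypothetical, memory is modelled as a bytearray, and the direction flag is assumed to be clear so that both pointers move upward.

```python
def rep_movsb(memory, esi, edi, ecx):
    """Model REP MOVSB: copy ECX bytes from [ESI] to [EDI].

    Returns the final (esi, edi, ecx) register values.
    """
    while ecx != 0:
        memory[edi] = memory[esi]   # MOVSB: copy a single byte
        esi += 1                    # advance the source pointer
        edi += 1                    # advance the destination pointer
        ecx -= 1                    # REP: count down, stop at zero
    return esi, edi, ecx


memory = bytearray(b"ABCD....")
print(rep_movsb(memory, esi=0, edi=4, ecx=4))   # (4, 8, 0)
print(memory)                                   # bytearray(b'ABCDABCD')
```

When you see REP in a disassembly, look just above it for the instructions that load ECX, ESI and EDI; they tell you how much is being copied and between which buffers.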
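RET- return. It pops the return address off the stack and jumps to it. The optional digit is the number of extra bytes to remove from the stack afterwards.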
Syntax:
RET digit
SUB- subtraction. It is the opposite of the ADD command. It subtracts the value of the source from the value of the destination and stores the result in the destination.
Syntax:
SUB destination, source
TEST- performs a logical AND but does not store the result; only the flags are updated.
Syntax:
TEST operand1 , operand2
XOR- the XOR instruction combines two values using a logical exclusive OR and stores the result in the destination.
Syntax:
XOR destination, source
Logical Operations
From hackers-arise.