Anti-Disassembly

Look at the disassembled instructions and start to understand the program
But malware authors use the same tools as malware analysts
They use anti-disassembly techniques to stop us analysing the code

Anti-Disassembly

Aim is to slow down the analyst
Make the job of disassembly as hard as possible
Reality is that any code that can be executed can be disassembled
Malware authors' aim is to push the skill required as high as possible
Same techniques also help prevent anti-virus heuristics from analysing the code
Not talking about trying to camouflage code or data
Will see ways to do that
Rather looking at how code can be written
- So the disassembler misinterprets it
- But the CPU can still execute it without any issue

Assembly

Assembly is relatively straightforward
Take an instruction, convert it into the relevant machine code opcodes
Move onto the next instruction
Result is a stream of bytes for the CPU to execute
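
The steps above can be sketched in a few lines of Python. This is a toy assembler under stated assumptions: it knows only five 32-bit x86 instructions, and the `assemble` function and `FIXED` table are hypothetical names invented for this illustration. The opcode bytes themselves are the real encodings.

```python
# Fixed single-byte encodings (real 32-bit x86 opcodes, tiny subset only).
FIXED = {"pop eax": b"\x58", "inc eax": b"\x40", "nop": b"\x90", "ret": b"\xc3"}

def assemble(program):
    """Convert each instruction in turn and append its bytes to the stream."""
    stream = bytearray()
    for line in program:
        if line in FIXED:
            stream += FIXED[line]
        elif line.startswith("push "):
            # push imm8 is encoded as 6A followed by the immediate byte
            stream += bytes([0x6A, int(line.split()[1], 0) & 0xFF])
        else:
            raise ValueError(f"unsupported instruction: {line}")
    return bytes(stream)

machine_code = assemble(["push 1", "pop eax", "inc eax", "ret"])
print(machine_code.hex(" "))
```

Note that the assembler never has to guess anything: the source text already says where every instruction begins.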

Disassembly

Harder problem
Easy to convert a single instruction, once we know where it starts
Program is a stream of bytes, one after the other
x86 instructions can vary in length from 1 to 15 bytes
Cannot decode an instruction, until we know where it starts
Cannot know where an instruction starts, until we decode the previous instruction
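
To make the start-address problem concrete, here is a small Python sketch. It assumes a toy length table covering five real 32-bit x86 opcodes; the byte sequence and the `decode_from` helper are invented for illustration. The same bytes decode into two different, equally valid instruction streams depending on where decoding begins.

```python
# opcode -> (mnemonic, total instruction length in bytes); toy subset only
LENGTHS = {
    0xB8: ("mov eax, imm32", 5),
    0x6A: ("push imm8", 2),
    0x58: ("pop eax", 1),
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
}

def decode_from(code, start):
    """Decode sequentially from `start`, returning (offset, mnemonic) pairs."""
    decoded, offset = [], start
    while offset < len(code):
        mnemonic, length = LENGTHS.get(code[offset], ("db", 1))
        decoded.append((offset, mnemonic))
        offset += length
    return decoded

CODE = bytes([0xB8, 0x6A, 0x01, 0x58, 0xC3, 0x90, 0xC3])
from_0 = decode_from(CODE, 0)  # mov swallows the next four bytes as its operand
from_1 = decode_from(CODE, 1)  # push, pop, ret hidden inside that operand
print(from_0)
print(from_1)
```

Starting one byte later turns the operand of the `mov` into a `push`/`pop`/`ret` sequence, which is exactly the ambiguity a disassembler has to resolve.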

Anti-Disassembly

Key to understanding the techniques
Understand they are trying to force the disassembler to make the wrong decisions about where instructions start
So it disassembles phoney instructions made up of bytes from the middle of real instructions
Need to understand how a disassembler is implemented
CPU dependent

Disassembly Techniques

Take in one or more bytes (depending on the length)
Convert that to the assembly code that would generate that instruction
Move onto the next instruction
Repeat
But the next instruction might not be correct

Two approaches are generally used to determine the next instruction: linear disassembly or flow disassembly
Flow disassembly generally gives better results than linear

Linear Disassembly

Method

Decode first instruction at start address
Then know the length (in bytes) of that instruction
Add length to address of that instruction to give address of next instruction
Repeat
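
The method above can be sketched as a short loop. This is a minimal illustration, not a real disassembler: it assumes a toy table of four real 32-bit x86 opcodes, and `linear_sweep` is a hypothetical name chosen for this example.

```python
# opcode -> (mnemonic, total instruction length in bytes); toy subset only
LENGTHS = {
    0x6A: ("push imm8", 2),
    0x58: ("pop eax", 1),
    0x40: ("inc eax", 1),
    0xC3: ("ret", 1),
}

def linear_sweep(code, start=0):
    """Decode, add the length to the address, repeat until the bytes run out."""
    listing, address = [], start
    while address < len(code):
        mnemonic, length = LENGTHS.get(code[address], ("db", 1))
        raw = code[address:address + length].hex()
        listing.append((address, raw, mnemonic))
        address += length  # address of the next instruction
    return listing

CODE = bytes([0x6A, 0x01, 0x58, 0x40, 0xC3])
listing = linear_sweep(CODE)
for address, raw, mnemonic in listing:
    print(f"{address:04x}  {raw:<6} {mnemonic}")
```

The loop never looks at what an instruction does, only at how long it is, which is why everything after one wrong length is decoded wrongly too.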

Thoughts

Generally works
Can go wrong even with non-malicious binaries
Tempting to think that the .text section will only contain code
Not necessarily the case, sometimes data is intermingled
Linear disassembler would treat....

Breaking

Very easy to break a linear disassembler
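
One common way to do this is to jump over a single junk byte that looks like the start of a long instruction. The Python sketch below illustrates the idea under stated assumptions: the opcode table is a toy subset of real 32-bit x86 encodings, the byte sequence is invented for this example, and `executed_path` models only the short `jmp` (not `call`).

```python
# opcode -> (mnemonic, total instruction length in bytes); toy subset only
LENGTHS = {
    0xEB: ("jmp short", 2),
    0xE8: ("call rel32", 5),
    0x58: ("pop eax", 1),
    0x40: ("inc eax", 1),
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
}

def linear_sweep(code):
    """What a linear disassembler reports."""
    listing, address = [], 0
    while address < len(code):
        mnemonic, length = LENGTHS.get(code[address], ("db", 1))
        listing.append((address, mnemonic))
        address += length
    return listing

def executed_path(code):
    """What the CPU would run, for this toy subset only."""
    path, address = [], 0
    while address < len(code):
        mnemonic, length = LENGTHS[code[address]]
        path.append((address, mnemonic))
        if mnemonic == "ret":
            break
        address += length
        if mnemonic == "jmp short":
            # rel8 is relative to the end of the jmp instruction
            address += int.from_bytes(code[address - 1:address], "little", signed=True)
    return path

# jmp +1 ; junk byte E8 ; pop eax ; inc eax ; ret ; padding
CODE = bytes([0xEB, 0x01, 0xE8, 0x58, 0x40, 0xC3, 0x90])
seen = linear_sweep(CODE)
run = executed_path(CODE)
print(seen)
print(run)
```

The linear sweep decodes the junk `E8` as a five-byte `call` and swallows the real `pop`, `inc` and `ret` as its operand, while the CPU skips the junk byte and executes them normally.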

Flow Disassembly

Aims to improve the detection of instructions (and rejection of data)
Follows the flow of the program rather than linearly processing instructions

Instruction Flow

Break instructions down into groups based on what instruction will execute next

Always execute the next instruction
Can execute either the next instruction or a specified one (call, jz, jnz)
Always execute instructions at a specified address e.g. jmp
Next instruction unknown, e.g. ret
For simple instructions works just like linear disassembly
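
The four groups above map directly onto a worklist algorithm. The Python sketch below is a minimal illustration, assuming a toy table of real 32-bit x86 opcodes; the `flow_disassemble` function, the group labels and the byte sequence are hypothetical and chosen only for this example.

```python
# opcode -> (mnemonic, length, group); toy subset of real x86 encodings
OPCODES = {
    0x40: ("inc eax", 1, "next"),    # always the next instruction
    0x58: ("pop eax", 1, "next"),
    0x90: ("nop", 1, "next"),
    0x74: ("jz", 2, "branch"),       # next instruction or a target
    0x75: ("jnz", 2, "branch"),
    0xE8: ("call", 5, "branch"),
    0xEB: ("jmp", 2, "jump"),        # always the target
    0xC3: ("ret", 1, "stop"),        # next instruction unknown
}

def flow_disassemble(code, entry=0):
    """Decode only the addresses that control flow can reach from `entry`."""
    found, worklist = {}, [entry]
    while worklist:
        pc = worklist.pop()
        if pc in found or not 0 <= pc < len(code) or code[pc] not in OPCODES:
            continue
        mnemonic, length, group = OPCODES[code[pc]]
        fall_through = pc + length
        if fall_through > len(code):
            continue  # truncated instruction, ignore it
        found[pc] = mnemonic
        if group in ("branch", "jump"):
            # relative operand is measured from the end of the instruction
            rel = int.from_bytes(code[pc + 1:fall_through], "little", signed=True)
            worklist.append(fall_through + rel)
        if group in ("next", "branch"):
            worklist.append(fall_through)
    return dict(sorted(found.items()))

# jz +3 ; inc eax ; jmp +3 ; pop eax ; ret ; data byte E8 ; nop ; ret
CODE = bytes([0x74, 0x03, 0x40, 0xEB, 0x03, 0x58, 0xC3, 0xE8, 0x90, 0xC3])
flow_listing = flow_disassemble(CODE)
print(flow_listing)
```

The byte at offset 7 is never reached by any path, so it is never decoded and can be treated as data, whereas a linear pass would try to decode it as the start of a `call`.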
