Anti-Disassembly

  • Look at the disassembled instructions and start to understand the program
  • But malware authors use the same tools as malware analysts
  • They use anti-disassembly techniques to stop us analysing the code

Anti-Disassembly

  • Aim is to slow down the analyst

  • Make the job of disassembly as hard as possible

  • Reality is that any code that can be executed can be disassembled

  • Malware author's aim is to push the required skill level up as much as possible

  • Same techniques also help prevent anti-virus heuristics from analysing the code

  • Not talking about trying to camouflage code or data

  • Will see ways to do that

  • Rather looking at how code can be written

    • So the disassembler misinterprets it
    • But the CPU can still execute it without any issue

Assembly

  • Assembly is relatively straightforward
  • Take an instruction, convert it into the relevant machine code opcodes
  • Move on to the next instruction
  • Result is a stream of bytes for the CPU to execute
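
The assembly steps above can be sketched as a toy assembler. This is only an illustration: the lookup table and the function names `assemble_one` and `assemble` are hypothetical, and it covers just a handful of 32-bit x86 encodings (a real assembler handles many more).

```python
# Toy assembler: a hypothetical lookup table covering a handful of
# 32-bit x86 instructions. Real assemblers handle far more encodings.
FIXED = {
    "nop": b"\x90",
    "ret": b"\xc3",
    "push eax": b"\x50",
    "xor eax, eax": b"\x31\xc0",
}

def assemble_one(line):
    """Convert one instruction into its machine code bytes."""
    line = line.strip().lower()
    if line in FIXED:
        return FIXED[line]
    if line.startswith("mov eax, "):
        imm = int(line[len("mov eax, "):], 0)
        # B8 is "mov eax, imm32"; the immediate is stored little-endian.
        return b"\xb8" + imm.to_bytes(4, "little")
    raise ValueError(f"unsupported instruction: {line}")

def assemble(source):
    """Assemble each line in turn and concatenate the results."""
    return b"".join(assemble_one(l) for l in source.splitlines() if l.strip())

program = """
xor eax, eax
mov eax, 0x2a
ret
"""
code = assemble(program)
print(code.hex(" "))  # 31 c0 b8 2a 00 00 00 c3
```

Note that nothing in the output stream marks where one instruction ends and the next begins; that information is lost once the bytes are emitted.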

Disassembly

  • Harder problem
  • Easy to convert a single instruction, once we know where it starts
  • Program is a stream of bytes, one after the other
  • x86 instructions can vary in length from 1 to 15 bytes
  • Cannot decode an instruction, until we know where it starts
  • Cannot know where an instruction starts, until we decode the previous instruction
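
To see why the starting point matters, here is a minimal sketch that decodes the same bytes from two different offsets. The `decode` function is a hypothetical toy covering only three real x86 opcodes (`90` nop, `C3` ret, `B8` mov eax, imm32); it is not a full decoder.

```python
def decode(code, off):
    """Decode one instruction at `off`; returns (text, length in bytes)."""
    op = code[off]
    if op == 0x90:
        return "nop", 1
    if op == 0xC3:
        return "ret", 1
    if op == 0xB8:  # mov eax, imm32: opcode plus 4-byte immediate
        imm = int.from_bytes(code[off + 1:off + 5], "little")
        return f"mov eax, {imm:#x}", 5
    return f"db {op:#04x}", 1  # unknown byte: emit it as data

code = bytes.fromhex("b8 90 90 90 90 c3")
print(decode(code, 0))  # ('mov eax, 0x90909090', 5)
print(decode(code, 1))  # ('nop', 1)
```

Both results are valid decodings of the same bytes. Starting at offset 0 the four `90` bytes are an operand; starting at offset 1 they are instructions.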

Anti-Disassembly

  • Key to understanding the techniques
  • They try to force the disassembler to make the wrong decisions about where instructions start
  • So it disassembles phoney instructions made up of bytes from the middle of real instructions
  • Need to understand how a disassembler is implemented
  • CPU dependent

Disassembly Techniques

  • Take in one or more bytes (depending on the length)
  • Convert that to the assembly code that would generate that instruction
  • Move on to the next instruction
  • Repeat
  • But the next instruction might not be correct

Two approaches are generally used to determine the next instruction: linear disassembly or flow disassembly. Flow disassembly generally gives better results than linear.

Linear disassembly

Method

  • Decode first instruction at start address
  • Then know the length (in bytes) of that instruction
  • Add length to address of that instruction to give address of next instruction
  • Repeat
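
The method above can be sketched as a short loop. This is a minimal illustration, assuming a hypothetical toy `decode` that knows only a few real x86 opcodes; `linear_sweep` is likewise an illustrative name, not taken from any particular tool.

```python
def decode(code, off):
    """Toy decoder for a few x86 opcodes; returns (text, length)."""
    op = code[off]
    if op == 0x90:
        return "nop", 1
    if op == 0xC3:
        return "ret", 1
    if op == 0x50:
        return "push eax", 1
    if op == 0x31 and code[off + 1:off + 2] == b"\xc0":
        return "xor eax, eax", 2
    if op == 0xB8:  # mov eax, imm32
        imm = int.from_bytes(code[off + 1:off + 5], "little")
        return f"mov eax, {imm:#x}", 5
    return f"db {op:#04x}", 1  # unknown byte: emit it as data

def linear_sweep(code, start=0):
    """Decode at `start`, add the length, repeat until the bytes run out."""
    listing = []
    off = start
    while off < len(code):
        text, length = decode(code, off)
        listing.append((off, text))
        off += length  # address of next = address of this + its length
    return listing

code = bytes.fromhex("31 c0 50 b8 2a 00 00 00 c3")
for addr, text in linear_sweep(code):
    print(f"{addr:04x}  {text}")
# 0000  xor eax, eax
# 0002  push eax
# 0003  mov eax, 0x2a
# 0008  ret
```

The loop never looks at what an instruction does, only at how long it is.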

Thoughts

  • Generally works
  • Can go wrong even with non-malicious binaries
  • Tempting to think that the .text section will only contain code
  • Not necessarily the case, sometimes data is intermingled
  • Linear disassembler would treat....

Breaking

  • Very easy to break a linear disassembler
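
One commonly described way to do this (chosen here as an illustration, not taken from these slides) is to jump over a single junk byte that looks like the start of a long instruction. In the sketch below the junk byte is `E8`, the opcode for a 5-byte `call`; the toy `decode` and `linear_sweep` are hypothetical helpers covering only a few real x86 opcodes.

```python
def decode(code, off):
    """Toy decoder for a few x86 opcodes; returns (text, length)."""
    op = code[off]
    if op == 0x90:
        return "nop", 1
    if op == 0xC3:
        return "ret", 1
    if op == 0x31 and code[off + 1:off + 2] == b"\xc0":
        return "xor eax, eax", 2
    if op == 0xEB:  # jmp short: target is relative to the next instruction
        rel = int.from_bytes(code[off + 1:off + 2], "little", signed=True)
        return f"jmp {off + 2 + rel:#x}", 2
    if op == 0xE8:  # call rel32: opcode plus 4-byte displacement
        rel = int.from_bytes(code[off + 1:off + 5], "little", signed=True)
        return f"call {(off + 5 + rel) & 0xFFFFFFFF:#x}", 5
    return f"db {op:#04x}", 1

def linear_sweep(code, start=0):
    listing, off = [], start
    while off < len(code):
        text, length = decode(code, off)
        listing.append((off, text))
        off += length
    return listing

# What the CPU executes:  jmp 0x3 ; (skips E8) ; xor eax, eax ; ret
code = bytes.fromhex("eb 01 e8 31 c0 c3 90")
for addr, text in linear_sweep(code):
    print(f"{addr:04x}  {text}")
# 0000  jmp 0x3
# 0002  call 0x90c3c038
```

The sweep decodes the junk byte as a `call`, which swallows the real `xor eax, eax` and `ret` as its operand, so they never appear in the listing. The CPU never executes the byte at offset 2, so the program runs normally.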

Flow Disassembly

  • Aims to improve the detection of instructions (and rejection of data)
  • Follows the flow of the program rather than linearly processing instructions

Instruction Flow

Break instructions down into groups based on what instruction will execute next

  • Always execute the next instruction

  • Can execute either the next instruction or a specified one (call, jz, jnz)

  • Always execute instructions at a specified address e.g. jmp

  • Next instruction unknown, e.g. ret

  • For simple instructions works just like linear disassembly
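
The four groups above can drive a simple worklist-based disassembler. The following is a minimal sketch under stated assumptions: the toy `decode`, the group names (`"next"`, `"branch"`, `"jump"`, `"stop"`) and `flow_disassemble` are all hypothetical, and only a few real x86 opcodes are handled.

```python
def rel(code, start, size):
    """Read a signed little-endian displacement of `size` bytes."""
    return int.from_bytes(code[start:start + size], "little", signed=True)

def decode(code, off):
    """Toy decoder; returns (text, length, group, target or None)."""
    op = code[off]
    if op == 0x90:
        return "nop", 1, "next", None
    if op == 0x31 and code[off + 1:off + 2] == b"\xc0":
        return "xor eax, eax", 2, "next", None
    if op == 0xC3:
        return "ret", 1, "stop", None            # next instruction unknown
    if op == 0xEB:                                # jmp short rel8
        t = off + 2 + rel(code, off + 1, 1)
        return f"jmp {t:#x}", 2, "jump", t
    if op in (0x74, 0x75):                        # jz / jnz rel8
        t = off + 2 + rel(code, off + 1, 1)
        name = "jz" if op == 0x74 else "jnz"
        return f"{name} {t:#x}", 2, "branch", t
    if op == 0xE8:                                # call rel32
        t = (off + 5 + rel(code, off + 1, 4)) & 0xFFFFFFFF
        return f"call {t:#x}", 5, "branch", t
    return f"db {op:#04x}", 1, "stop", None       # unknown: stop this path

def flow_disassemble(code, entry=0):
    """Follow control flow from `entry`, decoding only reachable bytes."""
    seen = {}
    work = [entry]
    while work:
        off = work.pop()
        while 0 <= off < len(code) and off not in seen:
            text, length, group, target = decode(code, off)
            seen[off] = text
            if target is not None:
                work.append(target)   # specified address: decode it later
            if group in ("jump", "stop"):
                break                 # do not fall through to the next byte
            off += length             # "next" and "branch" fall through
    return sorted(seen.items())

code = bytes.fromhex("eb 01 e8 31 c0 c3 90")
for addr, text in flow_disassemble(code):
    print(f"{addr:04x}  {text}")
# 0000  jmp 0x3
# 0003  xor eax, eax
# 0005  ret
```

Because the `jmp` always transfers control, the sketch never decodes the `E8` byte at offset 2, which a linear sweep would misread as the start of a 5-byte `call`. Bytes that are never reached are simply left undecoded.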