Processes 1
Data Encoding
- Malware needs to hide its intent
- Applies to both its operation (i.e. its code) but also the data it uses
- Data encoding refers to any form of content modification used for the purpose of hiding intent
- Need to understand the encoding techniques used to understand what the malware does
Base 64 Encoding
Aims to translate binary data into ASCII characters that can be transmitted via email
Less usable ASCII characters than possible values in a byte so encoded form larger than original
Uses 64 characters
Padded with
=
character if not a multiple of three bytesConverts three bytes of data into four Base64 characters
Take the bytes to encode in groups of threes and pack as a 24-bit number
First byte becomes the most significant eight bits, second byte goes in the middle eight bits, and third byte becomes LSB.
Identifying Base64 Encoding
- Appears as a random selection of characters
- Also relatively straight-forward to identify the encoding/decoding routines in malware
- Another tell-tale sign is that the encode/decode functions will often contain a lone-padding character (
=
)
Decoding Modified Base64
- Still relatively straight-forward to decode by using simple character substitution
- Find the look-up table used by the malware
- Take the encoded data, and convert each character from the malwares lookup table to the normal one
- Then feed into the standard decode routine
Standard Cryptography
Custom Encoding
- Malware often uses homegrown encoding schemes
- One mechanism used to develop them in to layer simple encoding techniques
- Would be to develop a custom algorithm
- Custom encoding techniques retain the characteristics of simple encoding techniques
- Make the job of reverse-engineering much harder
- Need to identify the encoding process and develop a decoder
- Normal cryptographic routines require a key to work
- But with a custom routine the key is embedded in the code
Cracking Custom Encoding
- Need to trace the programs execution to find the routines
- Think about how the malware will use the custom encoding
- If encoding - routines will be called before the data is output
- If decoding - routines will be called after the data is input
- Once code is found, need to develop a decoder
- Could implement...
Processes
- By exploiting other legitimate process
- Not easily visible from outside
- Also gains privileges of the process it is injected into
DLL
- Dynamic Link Library used to contain malware code
- Designed to be linked into other programs
- Gains the privileges of the program
- DLL search hijacking - exploit the way Windows searches for a DLL
- Requires the malware to have same name as a legitimate DLL