  • Language
    Python
  • License
    Apache License 2.0
  • Created over 4 years ago
  • Updated about 2 years ago

Repository Details

Extract and Deobfuscate XLM macros (a.k.a Excel 4.0 Macros)

XLMMacroDeobfuscator

XLMMacroDeobfuscator can be used to decode obfuscated XLM macros (also known as Excel 4.0 macros). It uses an internal XLM emulator to interpret the macros without fully executing the code.

It supports the xls, xlsm, and xlsb formats.

It uses xlrd2, pyxlsb2 and its own parser to extract cells and other information from xls, xlsb and xlsm files, respectively.

You can also find the XLM grammar in xlm-macro-lark.template.
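The format-to-parser pairing described above can be pictured as a simple lookup. The sketch below is illustrative only: `BACKENDS` and `backend_for` are hypothetical names, and it assumes dispatch by file extension, which may not be how the tool actually detects the format.

```python
from pathlib import Path

# Pairing taken from the description above; the table itself is an assumption
# made for illustration, not code from XLMMacroDeobfuscator.
BACKENDS = {
    ".xls": "xlrd2",
    ".xlsb": "pyxlsb2",
    ".xlsm": "built-in parser",
}


def backend_for(path):
    """Return the name of the parser that would handle `path` (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    try:
        return BACKENDS[suffix]
    except KeyError:
        raise ValueError(f"unsupported Excel format: {suffix or '(none)'}") from None


print(backend_for("sample.xlsb"))
```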

Installing the emulator

  1. Install using pip
pip install XLMMacroDeobfuscator --force

or

pip install xlmmacrodeobfuscator[defusedxml] --force
  2. Install the latest development version
pip install -U https://github.com/DissectMalware/XLMMacroDeobfuscator/archive/master.zip --force
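After installing, one way to confirm which version pip actually put in place is to query the package metadata from Python. This is a minimal sketch using the standard library; `installed_version` is a hypothetical helper name, and it assumes the distribution is registered under the name used in the pip command above.

```python
from importlib import metadata


def installed_version(dist_name="XLMMacroDeobfuscator"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(version if version is not None else "XLMMacroDeobfuscator is not installed")
```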

Running the emulator

To deobfuscate macros in Excel documents:

xlmdeobfuscator --file document.xlsm

To only extract macros in Excel documents (without any deobfuscation):

xlmdeobfuscator --file document.xlsm -x

To get only the deobfuscated macros, without any indentation:

xlmdeobfuscator --file document.xlsm --no-indent --output-formula-format "[[INT-FORMULA]]"
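The value passed to --output-formula-format is a template whose [[...]] placeholders are replaced for every output line. The sketch below shows that idea in isolation. It is not the tool's implementation: `render` is a hypothetical helper, and the cell address and formula are made-up sample values.

```python
import re

# Matches placeholders such as [[CELL-ADDR]] or [[INT-FORMULA]].
_PLACEHOLDER = re.compile(r"\[\[([A-Z-]+)\]\]")


def render(template, fields):
    """Fill each [[NAME]] placeholder from `fields`; unknown names are left as-is."""
    return _PLACEHOLDER.sub(
        lambda match: str(fields.get(match.group(1), match.group(0))),
        template,
    )


# Sample values, invented for this illustration.
fields = {"CELL-ADDR": "A1", "INT-FORMULA": "=HALT()"}
print(render("[[CELL-ADDR]] [[INT-FORMULA]]", fields))
print(render("[[INT-FORMULA]]", fields))
```

With "[[INT-FORMULA]]" alone, as in the command above, each line carries only the interpreted formula.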

To export the output in JSON format:

xlmdeobfuscator --file document.xlsm --export-json result.json

To see a sample JSON output, please check this link out.
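Once result.json has been written, it can be loaded with the standard json module. The layout of the export is not described here, so the sketch below only reports the top-level shape instead of assuming any field names; `load_export` and `summarize` are hypothetical helpers.

```python
import json
from pathlib import Path


def load_export(path):
    """Parse a file produced by --export-json and return the decoded data."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def summarize(data):
    """Describe the top-level shape of the export without assuming its schema."""
    if isinstance(data, dict):
        return "object with keys: " + ", ".join(sorted(data))
    if isinstance(data, list):
        return f"array with {len(data)} items"
    return f"scalar of type {type(data).__name__}"


if Path("result.json").exists():
    print(summarize(load_export("result.json")))
```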

To use a config file:

xlmdeobfuscator --file document.xlsm -c default.config

The default.config file must be a valid JSON file, such as:

{
	"no-indent": true,
	"output-formula-format": "[[CELL-ADDR]] [[INT-FORMULA]]",
	"non-interactive": true,
	"output-level": 1
}
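Because the config file must be valid JSON, generating it from Python avoids hand-editing mistakes such as trailing commas or Python-style True. The sketch below serializes the same options shown above; `render_config` is a hypothetical helper, and writing to default.config is left as a separate step.

```python
import json

# The same options as in the example config above.
OPTIONS = {
    "no-indent": True,
    "output-formula-format": "[[CELL-ADDR]] [[INT-FORMULA]]",
    "non-interactive": True,
    "output-level": 1,
}


def render_config(options):
    """Serialize `options` to JSON text and check that it parses back unchanged."""
    text = json.dumps(options, indent=4)
    if json.loads(text) != options:
        raise ValueError("options do not survive a JSON round trip")
    return text


config_text = render_config(OPTIONS)
print(config_text)

# To save it for use with -c:
# with open("default.config", "w", encoding="utf-8") as handle:
#     handle.write(config_text + "\n")
```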

Command Line


          _        _______
|\     /|( \      (       )
( \   / )| (      | () () |
 \ (_) / | |      | || || |
  ) _ (  | |      | |(_)| |
 / ( ) \ | |      | |   | |
( /   \ )| (____/\| )   ( |
|/     \|(_______/|/     \|
   ______   _______  _______  ______   _______           _______  _______  _______ _________ _______  _______
  (  __  \ (  ____ \(  ___  )(  ___ \ (  ____ \|\     /|(  ____ \(  ____ \(  ___  )\__   __/(  ___  )(  ____ )
  | (  \  )| (    \/| (   ) || (   ) )| (    \/| )   ( || (    \/| (    \/| (   ) |   ) (   | (   ) || (    )|
  | |   ) || (__    | |   | || (__/ / | (__    | |   | || (_____ | |      | (___) |   | |   | |   | || (____)|
  | |   | ||  __)   | |   | ||  __ (  |  __)   | |   | |(_____  )| |      |  ___  |   | |   | |   | ||     __)
  | |   ) || (      | |   | || (  \ \ | (      | |   | |      ) || |      | (   ) |   | |   | |   | || (\ (
  | (__/  )| (____/\| (___) || )___) )| )      | (___) |/\____) || (____/\| )   ( |   | |   | (___) || ) \ \__
  (______/ (_______/(_______)|/ \___/ |/       (_______)\_______)(_______/|/     \|   )_(   (_______)|/   \__/

    
XLMMacroDeobfuscator(v0.2.0) - https://github.com/DissectMalware/XLMMacroDeobfuscator

Error: --file is missing

usage: deobfuscator.py [-h] [-c FILE_PATH] [-f FILE_PATH] [-n] [-x]
                       [--sort-formulas] [--defined-names] [-2]
                       [--with-ms-excel] [-s] [-d DAY]
                       [--output-formula-format OUTPUT_FORMULA_FORMAT]
                       [--extract-formula-format EXTRACT_FORMULA_FORMAT]
                       [--no-indent] [--silent] [--export-json FILE_PATH]
                       [--start-point CELL_ADDR] [-p PASSWORD]
                       [-o OUTPUT_LEVEL] [--timeout N]

optional arguments:
  -h, --help            show this help message and exit
  -c FILE_PATH, --config-file FILE_PATH
                        Specify a config file (must be a valid JSON file)
  -f FILE_PATH, --file FILE_PATH
                        The path of a XLSM file
  -n, --noninteractive  Disable interactive shell
  -x, --extract-only    Only extract cells without any emulation
  --sort-formulas       Sort extracted formulas based on their cell address
                        (requires -x)
  --defined-names       Extract all defined names
  -2, --no-ms-excel     [Deprecated] Do not use MS Excel to process XLS files
  --with-ms-excel       Use MS Excel to process XLS files
  -s, --start-with-shell
                        Open an XLM shell before interpreting the macros in
                        the input
  -d DAY, --day DAY     Specify the day of month
  --output-formula-format OUTPUT_FORMULA_FORMAT
                        Specify the format for output formulas ([[CELL-ADDR]],
                        [[INT-FORMULA]], and [[STATUS]]
  --extract-formula-format EXTRACT_FORMULA_FORMAT
                        Specify the format for extracted formulas ([[CELL-
                        ADDR]], [[CELL-FORMULA]], and [[CELL-VALUE]]
  --no-indent           Do not show indent before formulas
  --silent              Do not print output
  --export-json FILE_PATH
                        Export the output to JSON
  --start-point CELL_ADDR
                        Start interpretation from a specific cell address
  -p PASSWORD, --password PASSWORD
                        Password to decrypt the protected document
  -o OUTPUT_LEVEL, --output-level OUTPUT_LEVEL
                        Set the level of details to be shown (0:all commands,
                        1: commands no jump 2:important commands 3:strings in
                        important commands).
  --timeout N           stop emulation after N seconds (0: not interruption
                        N>0: stop emulation after N seconds)
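When the command line is driven from a script, it can help to assemble the argument list in one place. The sketch below builds such a list using only switches that appear in the help text above; `build_command` is a hypothetical helper, and it assumes the xlmdeobfuscator entry point is on the PATH.

```python
import shlex


def build_command(file, *, extract_only=False, no_indent=False,
                  noninteractive=False, output_level=None, timeout=None,
                  export_json=None, password=None):
    """Return the xlmdeobfuscator argument list for the given options."""
    cmd = ["xlmdeobfuscator", "--file", str(file)]
    if extract_only:
        cmd.append("--extract-only")
    if no_indent:
        cmd.append("--no-indent")
    if noninteractive:
        cmd.append("--noninteractive")
    if output_level is not None:
        # The help text documents levels 0 through 3.
        if output_level not in (0, 1, 2, 3):
            raise ValueError("output_level must be 0, 1, 2, or 3")
        cmd += ["--output-level", str(output_level)]
    if timeout is not None:
        cmd += ["--timeout", str(timeout)]
    if export_json is not None:
        cmd += ["--export-json", str(export_json)]
    if password is not None:
        cmd += ["--password", password]
    return cmd


cmd = build_command("document.xlsm", noninteractive=True, output_level=1, timeout=30)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True); passing a list rather than a single string avoids shell quoting problems with unusual file names.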

Library

The following example shows how XLMMacroDeobfuscator can be used in a Python project to deobfuscate XLM macros:

from XLMMacroDeobfuscator.deobfuscator import process_file

# return_deobfuscated=True asks process_file to hand back the deobfuscated formulas.
result = process_file(file='path/to/an/excel/file',
                      noninteractive=True,
                      noindent=True,
                      output_formula_format='[[CELL-ADDR]], [[INT-FORMULA]]',
                      return_deobfuscated=True,
                      timeout=30)

# Print each returned record on its own line.
for record in result:
    print(record)

  • Note: the xlmdeobfuscator logo is not shown when you use it as a library.
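The library call above extends naturally to a batch of files. The following sketch wraps it so that one failing document does not stop the rest. `deobfuscate_many` is a hypothetical helper, the processing function is passed in as a parameter so the wrapper can be exercised without the package installed, and the broad `except Exception` is a deliberate assumption because the exceptions raised by process_file are not documented here.

```python
def deobfuscate_many(paths, process, timeout=30):
    """Run `process` on each path; return (results, errors) keyed by path."""
    results = {}
    errors = {}
    for path in paths:
        try:
            records = process(file=path,
                              noninteractive=True,
                              noindent=True,
                              output_formula_format='[[CELL-ADDR]], [[INT-FORMULA]]',
                              return_deobfuscated=True,
                              timeout=timeout)
            results[path] = list(records)
        except Exception as exc:  # assumption: failure modes are not documented
            errors[path] = str(exc)
    return results, errors


# Intended use with the real library:
# from XLMMacroDeobfuscator.deobfuscator import process_file
# results, errors = deobfuscate_many(["a.xlsm", "b.xls"], process_file)
```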

Requirements

Please read requirements.txt for the list of Python libraries that XLMMacroDeobfuscator depends on.

xlmdeobfuscator can be executed on any OS to extract and deobfuscate macros in xls, xlsm, and xlsb files. You do not need to install MS Excel.

Note: to use MS Excel (on Windows), install the pywin32 library and pass the --with-ms-excel switch. With --with-ms-excel, xlmdeobfuscator first attempts to load xls files with MS Excel and falls back to the xlrd2 library if that fails.
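A script that runs on several platforms can decide at run time whether to add --with-ms-excel. The sketch below checks for Windows and for the win32com module that pywin32 provides. `ms_excel_flags` is a hypothetical helper, and it does not verify that MS Excel itself is installed.

```python
import importlib.util
import platform


def ms_excel_flags(system=None, has_pywin32=None):
    """Return ['--with-ms-excel'] only on Windows with pywin32 available."""
    if system is None:
        system = platform.system()
    if has_pywin32 is None:
        # pywin32 ships the win32com package; find_spec returns None if it is absent.
        has_pywin32 = importlib.util.find_spec("win32com") is not None
    if system == "Windows" and has_pywin32:
        return ["--with-ms-excel"]
    return []


print(["xlmdeobfuscator", "--file", "document.xls"] + ms_excel_flags())
```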

Projects Using XLMMacroDeobfuscator

XLMMacroDeobfuscator is adopted in the following projects:

Please contact me if you have incorporated XLMMacroDeobfuscator in your project.

How to Contribute

If you found a bug or would like to suggest an improvement, please create a new issue on the issues page.

Feel free to contribute by forking the project and submitting a pull request.

You can reach me (@DissectMalware) on Twitter via a direct message.

More Repositories

  1. batch_deobfuscator (Python, 138 stars): Deobfuscate batch scripts obfuscated using string substitution and escape character techniques.
  2. pyOneNote (Python, 110 stars): A python library to parse OneNote (.one) files
  3. MalwareCMDMonitor (Python, 44 stars): Shows command lines used by latest instances analyzed on Hybrid-Analysis
  4. base64_substring (Python, 40 stars): Generate a Yara rule to find base64-encoded files containing a specific keyword
  5. yaradbg-frontend (JavaScript, 36 stars)
  6. ClipboardWatcher (C#, 29 stars): Monitor the textual data pasted into Windows clipboard
  7. OfficeForensicTools (Python, 25 stars): A set of tools for collecting forensic information
  8. PySameSame (HTML, 24 stars): This is a python version of samesame repo to generate homograph strings
  9. xlrd2 (Python, 24 stars): xlrd2 is a variant of xlrd that is actively maintained
  10. yaradbg-backend (Python, 24 stars)
  11. WinNativeIO (C++, 20 stars): Using Undocumented NTDLL Functions to Read/Write/Delete File
  12. pyxlsb2 (Python, 19 stars): an Excel 2007+ Binary Workbook (xlsb) parser for Python
  13. MDIExtractor (Python, 15 stars)
  14. npp-langs-4-sec (15 stars): Notepad++ Syntax Highlighting for Languages Used by Cyber Security Professionals
  15. IoCMiner (Python, 14 stars): A Framework to Automatically Extract Indicators of Compromise (IoCs) from Twitter
  16. PhishCanary (Python, 10 stars): Given a TLD zone file, PhishCanary extracts International Domain Names (IDNs) that are homoglyphs of specified target domain names.
  17. yaradbg-issues (7 stars)
  18. yaradbg-container (Dockerfile, 5 stars): A docker config file to run yaradbg in a container
  19. TLDExtractor (C#, 3 stars): Accurately extract TLD, effective TLD, 2LD, 3LD, ... from a given domain name; by utilizing the Public Suffix List maintained by Mozilla Foundation
  20. document-samples (HTML, 1 star)