  • Stars: 211
  • Rank: 186,867 (Top 4 %)
  • Language: Python
  • License: Other
  • Created over 8 years ago
  • Updated 10 months ago

Repository Details

olefile is a Python package to parse, read and write Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office 97-2003 documents, vbaProject.bin in MS Office 2007+ files, Image Composer and FlashPix files, Outlook messages, StickyNotes, several Microscopy file formats, McAfee antivirus quarantine files, etc.

olefile

Quick links: Home page - Download/Install - Documentation - Report Issues/Suggestions/Questions - Contact the author - Repository - Updates on Twitter

News

Follow all updates and news on Twitter: https://twitter.com/decalage2

  • 2018-09-09 v0.46: OleFileIO can now be used as a context manager (with...as), to close the file automatically (see doc). Improved handling of malformed files, fixed several bugs.
  • 2018-01-24 v0.45: olefile can now overwrite streams of any size, improved handling of malformed files, fixed several bugs, end of support for Python 2.6 and 3.3.
  • 2017-01-06 v0.44: several bugfixes, removed support for Python 2.5 (olefile2), added support for incomplete streams and incorrect directory entries (to read malformed documents), added getclsid, improved documentation with API reference.
  • 2017-01-04: moved the documentation to ReadTheDocs
  • 2016-05-20: moved olefile repository to GitHub
  • 2016-02-02 v0.43: fixed issues #26 and #27, better handling of malformed files, use python logging.
  • see changelog for more detailed information and the latest changes.
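As a rough illustration of the v0.46 entry above, the following sketch opens a file in a with block and returns its stream paths. The helper name list_stream_paths is hypothetical; OleFileIO and listdir are olefile calls. The import sits inside the function only so that the snippet still loads where olefile is not installed, which is an assumption of this sketch rather than a recommendation.

```python
def list_stream_paths(path):
    """Return each stream in the OLE file at `path` as a 'storage/stream' string.

    Hypothetical helper, not part of olefile itself.
    """
    # Imported here (a choice made for this sketch) so the module still loads
    # when olefile is missing; a top-level import works just as well.
    import olefile

    # Needs olefile 0.46 or later: leaving the with block closes the file,
    # even if an exception is raised inside it.
    with olefile.OleFileIO(path) as ole:
        # listdir() returns each entry as a list of path components,
        # for example ['Macros', 'VBA', 'dir'].
        return ["/".join(parts) for parts in ole.listdir()]
```

A call such as list_stream_paths("report.doc") (the file name is a placeholder) returns a plain list of strings, so the OleFileIO object never escapes the with block.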

Download/Install

If you have pip or setuptools installed (pip is included in Python 2.7.9+), you may simply run pip install olefile or easy_install olefile for the first installation.

To update olefile, run pip install -U olefile.

Otherwise, see http://olefile.readthedocs.io/en/latest/Install.html

Features

  • Parse, read and write any OLE file such as Microsoft Office 97-2003 legacy document formats (Word .doc, Excel .xls, PowerPoint .ppt, Visio .vsd, Project .mpp), Image Composer and FlashPix files, Outlook messages, StickyNotes, Zeiss AxioVision ZVI files, Olympus FluoView OIB files, etc
  • List all the streams and storages contained in an OLE file
  • Open streams as files
  • Parse and read property streams, containing metadata of the file
  • Portable, pure Python module, no dependencies
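The listing, stream-reading and metadata features above might be combined as in the sketch below. summarize_ole and summarize_file are hypothetical names; listdir, openstream and get_metadata are olefile methods. Reporting only the author and title fields, and reading each stream fully to measure it, are simplifications assumed here to keep the example short.

```python
def summarize_ole(ole):
    """Collect stream sizes and basic metadata from an open OleFileIO object.

    Hypothetical helper, not part of olefile itself.
    """
    sizes = {}
    # listdir() returns streams only by default, each as a list of components.
    for parts in ole.listdir():
        stream = ole.openstream(parts)  # file-like object supporting read()
        sizes["/".join(parts)] = len(stream.read())

    # get_metadata() parses the standard property streams when present;
    # attributes that are missing from the file stay None.
    meta = ole.get_metadata()
    return {
        "streams": sizes,
        "author": meta.author,  # may be bytes or None
        "title": meta.title,    # may be bytes or None
    }


def summarize_file(path):
    """Open `path` with olefile (0.46+ for the with block) and summarize it."""
    import olefile  # imported lazily so this sketch loads without olefile

    with olefile.OleFileIO(path) as ole:
        return summarize_ole(ole)
```

Keeping summarize_ole separate from the file handling means it can be exercised with any object that offers the same three methods, which is convenient for unit tests that have no sample OLE file at hand.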

olefile can be used as an independent package or with PIL/Pillow.

olefile is mostly meant for developers. If you are looking for tools to analyze OLE files or to extract data (especially for security purposes such as malware analysis and forensics), then please also check my python-oletools, which are built upon olefile and provide a higher-level interface.

Documentation

Please see the online documentation for more information.

Real-life examples

A real-life example: using OleFileIO_PL for malware analysis and forensics.

See also this paper about Python tools for forensics, which features olefile.

License

olefile (formerly OleFileIO_PL) is copyright (c) 2005-2019 Philippe Lagadec (https://www.decalage.info)

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


olefile is based on source code from the OleFileIO module of the Python Imaging Library (PIL) published by Fredrik Lundh under the following license:

The Python Imaging Library (PIL) is

  • Copyright (c) 1997-2009 by Secret Labs AB
  • Copyright (c) 1995-2009 by Fredrik Lundh

By obtaining, using, and/or copying this software and/or its associated documentation, you agree that you have read, understood, and will comply with the following terms and conditions:

Permission to use, copy, modify, and distribute this software and its associated documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appears in all copies, and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of Secret Labs AB or the author not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission.

SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

More Repositories

  • awesome-security-hardening (4,863 stars): A collection of awesome security hardening guides, tools and other resources.
  • oletools (Python, 2,706 stars): Python tools to analyze MS OLE2 files (Structured Storage, Compound File Binary Format) and MS Office documents, for malware analysis, forensics and debugging.
  • ViperMonkey (Python, 1,007 stars): A VBA parser and emulation engine to analyze malicious macros.
  • balbuzard (YARA, 121 stars): Balbuzard is a package of malware analysis tools in Python to extract patterns of interest from suspicious files (IP addresses, domain names, known file headers, interesting strings, etc.). It can also crack malware obfuscation such as XOR, ROL, etc. by bruteforcing and checking for those patterns.
  • exefilter (Python, 63 stars): ExeFilter is an open-source tool and framework to filter file formats in e-mails, web pages or files. It detects many common file formats and can remove active content (scripts, macros, etc.) according to a configurable policy.
  • oledump-contrib (Python, 50 stars): The oledump-contrib repository contains plugins and enhancements for the oledump tool published by Didier Stevens.
  • pyhtgen (HTML, 12 stars): pyhtgen (formerly HTML.py) provides a few classes to easily generate HTML content such as tables and lists.
  • oletools_dll (C, 7 stars): A DLL to run some oletools functions from any language.
  • pywordform (Python, 7 stars): pywordform is a Python module to parse Microsoft Word forms in docx format, extracting all field values with their tags into a dictionary. For more information: http://www.decalage.info/python/pywordform
  • python-crash-course (7 stars): This is a Python course I have written to quickly teach Python to my colleagues and students, made of slides and samples for hands-on exercises. It takes around four to five hours to present all the slides and to run the hands-on exercises. The original course was based on my mini Python tutorial (http://www.decalage.info/python/tutorial).
  • iodeflib (Python, 6 stars): iodeflib is a Python library to create, parse and edit cyber incident reports using the IODEF XML format (RFC 5070).
  • pyxmldsig (Python, 2 stars): pyxmldsig is a Python module to create and verify XML Digital Signatures (XML-DSig). This is a simple interface to the PyXMLSec library, aiming to provide a more pythonic API suitable for Python applications. See http://www.decalage.info/python/pyxmldsig
  • cherryproxy (Python, 2 stars): CherryProxy is a simple HTTP proxy written in Python 2.x, based on the CherryPy WSGI server and httplib, extensible for content analysis and filtering.