PDF Analysis Tool - peepdf



peepdf is a Python tool for exploring PDF files to determine whether they may be harmful.

The aim of this tool is to provide every component a security researcher may need for PDF analysis, without having to combine three or four separate tools. peepdf can show all the objects in a document and highlight the suspicious elements, supports the most commonly used filters and encodings, and can parse different versions of a file, object streams and encrypted files. With Spidermonkey and Libemu installed, it also provides wrappers for Javascript and shellcode analysis. In addition, it can create new PDF files and modify existing ones.


Functionalities:

Analysis:

ºDecodings: hexadecimal, octal, name objects
ºMost used filters
ºReferences in objects and where an object is referenced
ºStrings search (including streams)
ºPhysical structure (offsets)
ºLogical tree structure
ºMetadata
ºModifications between versions (changelog)
ºCompressed objects (object streams)
ºAnalysis and modification of Javascript (Spidermonkey): unescape, replace, join
ºShellcode analysis (Libemu python wrapper, pylibemu)
ºVariables (set command)
ºExtraction of old versions of the document
ºEasy extraction of objects, Javascript code, shellcodes (>, >>, $>, $>>)
ºChecking hashes on VirusTotal
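
To illustrate the "Decodings" item above, here is a minimal sketch of the three encodings it names, written against the PDF syntax rules (#XX escapes in names, \ddd octal escapes in literal strings, <...> hexadecimal strings). It is not peepdf's own code: the function names are hypothetical, and other escapes such as \n or \( are deliberately left out.

```python
import re


def decode_pdf_name(name: str) -> str:
    """Expand #XX hex escapes in a PDF name object, e.g. /J#61vaScript."""
    return re.sub(r"#([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), name)


def decode_octal_escapes(text: str) -> str:
    """Expand \\ddd octal escapes (1 to 3 digits) in a PDF literal string."""
    # High-order overflow is ignored, so the value is masked to one byte.
    return re.sub(r"\\([0-7]{1,3})", lambda m: chr(int(m.group(1), 8) & 0xFF), text)


def decode_hex_string(hex_string: str) -> bytes:
    """Decode a PDF hexadecimal string such as <4A53>; whitespace is ignored."""
    digits = re.sub(r"\s+", "", hex_string.strip("<>"))
    if len(digits) % 2:
        # An odd number of digits is padded with a trailing 0.
        digits += "0"
    return bytes.fromhex(digits)
```

Obfuscated names such as /J#61vaScript are a common way to hide suspicious keys from naive string searches, which is why normalising them is a useful first step before looking for suspicious elements.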

Creation/Modification:

ºBasic PDF creation
ºCreation of PDF with Javascript executed when the document is opened
ºCreation of object streams to compress objects
ºEmbedded PDFs
ºString and name obfuscation
ºMalformed PDF output: without endobj, garbage in the header, bad header…
ºFilters modification
ºObjects modification
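
As an illustration of the name obfuscation item above, the following sketch rewrites a PDF name so that a random share of its characters is replaced by #XX hex escapes. It is an assumption about how such obfuscation can be done, not peepdf's implementation; the function name and the ratio and seed parameters are hypothetical.

```python
import random


def obfuscate_pdf_name(name: str, ratio: float = 0.5, seed=None) -> str:
    """Replace a random share of the bytes of a PDF name with #XX escapes."""
    rng = random.Random(seed)
    body = name[1:] if name.startswith("/") else name
    out = []
    for code in body.encode("utf-8"):
        ch = chr(code)
        # Whitespace, delimiters, '#' and non-printable bytes must always
        # be escaped for the name to stay valid.
        must_escape = code < 0x21 or code > 0x7E or ch in "#()<>[]{}/%"
        if must_escape or rng.random() < ratio:
            out.append(f"#{code:02X}")
        else:
            out.append(ch)
    return "/" + "".join(out)
```

With ratio=1.0 every character is escaped, and with ratio=0.0 only the characters that must be escaped are touched. Any conforming PDF reader should treat the obfuscated and the plain name as the same name.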

Execution modes:

ºSimple command line execution
ºPowerful interactive console (colorized or not)
ºBatch mode