Uncompyle6 - A Cross-Version Python Bytecode Decompiler


A native Python cross-version decompiler and fragment decompiler. The successor to decompyle, uncompyle, and uncompyle2.


Introduction
uncompyle6 translates Python bytecode back into equivalent Python source code. It accepts bytecodes from Python version 1.3 to version 3.8, spanning over 24 years of Python releases. We include Dropbox's Python 2.5 bytecode and some PyPy bytecode.


Why this?
Ok, I'll say it: this software is amazing. It is more than your normal hacky decompiler. Using compiler technology, the program creates a parse tree of the program from the instructions; nodes at the upper levels look a little like what might come from a Python AST. So we can really classify and understand what's going on in sections of Python bytecode.
Building on this, another thing that makes this different from other CPython bytecode decompilers is the ability to deparse just fragments of source code and give source-code information around a given bytecode offset.
I use the tree fragments to deparse fragments of code at run time inside my trepan debuggers. For that, bytecode offsets are recorded and associated with fragments of the source code. This purpose, although compatible with the original intention, is yet a little bit different. See this for more information.
Python fragment deparsing given an instruction offset is useful in showing stack traces and can be incorporated into any program that wants to show a location in more detail than just a line number at runtime. This code can also be used when source-code information does not exist and there is just bytecode. Again, my debuggers make use of this.
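To make the notion of a bytecode offset concrete, here is a minimal sketch that uses only the standard library's dis module, not uncompyle6 itself. It maps each instruction offset in a function to the source line it belongs to, which is the coarse line-level information that fragment deparsing refines. The names sample and line_for_offset are made up for this illustration.

```python
import dis


def sample(a, b):
    total = a + b
    return total * 2


def line_for_offset(code, offset):
    """Return the source line that covers a bytecode offset (hypothetical helper)."""
    line = None
    # findlinestarts yields (offset, line number) pairs in increasing offset order.
    for start, lineno in dis.findlinestarts(code):
        if start > offset:
            break
        if lineno is not None:
            line = lineno
    return line


# Show every instruction with its offset and the line it came from.
for instr in dis.get_instructions(sample):
    print(instr.offset, instr.opname, line_for_offset(sample.__code__, instr.offset))
```

A line number is the most the interpreter records by default; pointing at the exact expression inside that line is what requires deparsing.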
There were (and still are) a number of decompyle, uncompyle, uncompyle2, uncompyle3 forks around. Almost all of them come basically from the same code base, and (almost?) all of them are no longer actively maintained. One was really good at decompiling Python 1.5-2.3 or so, another really good at Python 2.7, but that only. Another handles Python 3.2 only; another patched that and handled only 3.3. You get the idea. This code pulls all of these forks together and moves forward. There is some serious refactoring and cleanup in this code base over those old forks.
This demonstrably does the best in decompiling Python across all Python versions. And even when there is another project that only provides decompilation for a subset of Python versions, we generally do demonstrably better for those as well.
How can we tell? By taking Python bytecode that comes distributed with that version of Python and decompiling it. Among the files that successfully decompile, we can then make sure the resulting programs are syntactically correct by running the Python interpreter for that bytecode version. Finally, in cases where the program has a test for itself, we can run the check on the decompiled code.
We are serious about testing, and use automated processes to find bugs. In the issue trackers for other decompilers, you will find a number of bugs we've found along the way. Very few to none of them are fixed in the other decompilers.


Requirements
The code here can be run on Python versions 2.6 or later, PyPy 3-2.4, or PyPy-5.0.1. Python versions 2.4-2.7 are supported in the python-2.4 branch. The bytecode files it can read have been tested on Python bytecodes from versions 1.4, 2.1-2.7, and 3.0-3.8 and the above-mentioned PyPy versions.


Installation
This uses setup.py, so it follows the standard Python routine:
pip install -e .  # set up to run from source tree
                  # Or if you want to install instead
python setup.py install # may need sudo
A GNU makefile is also provided so make install (possibly as root or sudo) will do the steps above.


Running Tests
make check
A GNU makefile has been added to smooth over running the right command, and to run tests from fastest to slowest.
If you have remake installed, you can see the list of all tasks including tests via remake --tasks


Usage
Run
$ uncompyle6 *compiled-python-file-pyc-or-pyo*
For usage help:
$ uncompyle6 -h


Verification
In older versions of Python it was possible to verify bytecode by decompiling bytecode, and then compiling using the Python interpreter for that bytecode version. Having done this, the bytecode produced could be compared with the original bytecode. However, as Python's code generation got better, this was no longer feasible.
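The round-trip idea can be sketched with the built-in compile() function. This is an illustration only, not the verification code this project used: same_bytecode is a hypothetical helper, the "decompiled" text is a hand-written stand-in for decompiler output, and only the raw co_code bytes are compared (constants and names are ignored).

```python
def same_bytecode(code_a, code_b):
    """Recursively compare the raw bytecode of two code objects.

    Line numbers are not part of co_code, so layout changes do not matter.
    """
    if code_a.co_code != code_b.co_code:
        return False
    # Function and class bodies are nested code objects stored in co_consts.
    nested_a = [c for c in code_a.co_consts if hasattr(c, "co_code")]
    nested_b = [c for c in code_b.co_consts if hasattr(c, "co_code")]
    if len(nested_a) != len(nested_b):
        return False
    return all(same_bytecode(a, b) for a, b in zip(nested_a, nested_b))


original_source = "def double(x):\n    return x * 2\n"
# Stand-in for decompiler output: same program, different layout plus a comment.
decompiled_source = "def double(x):\n\n    # recovered\n    return x * 2\n"
# Same behavior for numbers, but different instructions.
different_source = "def double(x):\n    return x + x\n"

original = compile(original_source, "<original>", "exec")
roundtrip = compile(decompiled_source, "<decompiled>", "exec")
different = compile(different_source, "<different>", "exec")

print(same_bytecode(original, roundtrip))
print(same_bytecode(original, different))
```

The third source shows why the comparison is strict: two programs can behave the same and still compile to different bytecode, so a mismatch does not by itself prove the decompilation is wrong.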
If you want Python syntax verification of the correctness of the decompilation process, add the --syntax-verify option. However, since Python syntax changes between versions, you should use this option only if the bytecode matches the Python interpreter that will be checking the syntax.
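The version caveat is easy to see with a small sketch. It assumes nothing about how --syntax-verify is implemented; is_valid_syntax is a hypothetical helper that asks the running interpreter to compile a piece of source text.

```python
def is_valid_syntax(source, filename="<decompiled>"):
    """Return True if the running interpreter can compile the source text."""
    try:
        compile(source, filename, "exec")
    except SyntaxError:
        return False
    return True


# Valid in both Python 2 and Python 3.
print(is_valid_syntax("x = 1\n"))
# A Python 2 print statement: correct output for 2.x bytecode,
# yet rejected when the checking interpreter is Python 3.
print(is_valid_syntax("print 'hello'\n"))
```

A correct decompilation of Python 2 bytecode can therefore fail a syntax check run under Python 3, which is why the interpreter doing the check should match the bytecode version.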
You can also cross compare the results with another Python decompiler like pycdc. Since they work differently, bugs here often aren't in that, and vice versa.
There is an interesting class of programs that is readily available and gives stronger verification: programs that test themselves when run. Our test suite includes these.
And Python comes with another set of programs like this: its test suite for the standard library. We have some code in test/stdlib to facilitate this kind of checking too.


Known Bugs/Restrictions
The biggest known and possibly fixable (but hard) problem has to do with handling control flow. (Python has probably the most diverse and screwy set of compound statements I've ever seen; there are "else" clauses on loops and try blocks that I suspect many programmers don't know about.)
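For readers who have not met them, here is a short illustration of the "else" clauses mentioned above. The function names are made up for the example.

```python
def find_first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Runs only when the loop finished without hitting "break".
        result = None
    return result


def parse_int(text):
    """Describe whether text parses as an integer."""
    try:
        value = int(text)
    except ValueError:
        outcome = "invalid"
    else:
        # Runs only when the try body raised no exception.
        outcome = "ok: %d" % value
    return outcome


print(find_first_negative([3, -1, 4]))
print(find_first_negative([1, 2]))
print(parse_int("12"))
print(parse_int("twelve"))
```

In bytecode these constructs are just jumps, so a decompiler has to work out from jump targets alone whether a block was an "else" clause or simply code following the loop or try statement.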
All of the Python decompilers that I have looked at have problems decompiling Python's control flow. In some cases we can detect an erroneous decompilation and report that.
Python support is strongest in Python 2 for 2.7 and drops off as you get further away from that. Support is also probably pretty good for Python 2.3-2.4 since a lot of the goodness of the early version of the decompiler from that era has been preserved (and Python compilation in that era was minimal).
There is some work to do on the lower end Python versions, which is more difficult for us to handle since we don't have a Python interpreter for versions 1.6 and 2.0.
In the Python 3 series, Python support is strongest around 3.4 or 3.3 and drops off as you move further away from those versions. Python 3.0 is weird in that it in some ways resembles 2.6 more than it does 3.1 or 2.7. Python 3.6 changes things drastically by using word codes rather than byte codes. As a result, the jump offset field in a jump instruction argument has been reduced. This makes EXTENDED_ARG instructions more prevalent in jump instructions; previously they had been rare. Perhaps to compensate for the additional EXTENDED_ARG instructions, additional jump optimization has been added. So in sum, handling control flow by ad hoc means, as is currently done, is worse.
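The word-code layout can be observed with the standard dis module. The sketch below assumes it is run on Python 3.6 or later, where each instruction is a two-byte unit (one opcode byte and one argument byte), so every instruction offset is even. The function names are made up for the example.

```python
import dis


def sample(x):
    if x > 10:
        return "big"
    return "small"


def instruction_offsets(func):
    """Return the bytecode offset of every instruction in func."""
    return [instr.offset for instr in dis.get_instructions(func)]


offsets = instruction_offsets(sample)
print(offsets)
# On Python 3.6+ this prints True; on older versions instructions
# were one or three bytes long and odd offsets appear.
print(all(offset % 2 == 0 for offset in offsets))
```

Because only one byte is left for the argument, any jump target or operand that does not fit in 8 bits needs one or more EXTENDED_ARG prefixes. Newer interpreters may also leave gaps larger than two between offsets, but the offsets stay even.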
Between Python 3.5, 3.6 and 3.7 there have been major changes to the MAKE_FUNCTION and CALL_FUNCTION instructions.
Currently not all Python magic numbers are supported. Specifically, in some versions of Python, notably Python 3.6, the magic number has changed several times within a version.
We support only released versions, not candidate versions. Note however that the magic of a released version is usually the same as the last candidate version prior to release.
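To see what a magic number is, the following sketch compiles a tiny source file with the standard py_compile module and reads the first four bytes of the resulting .pyc file. It uses only the standard library and assumes Python 3.4 or later for importlib.util.MAGIC_NUMBER; the helper names are made up for the example.

```python
import importlib.util
import os
import py_compile
import tempfile


def read_magic(pyc_path):
    """Return the first four bytes of a compiled file: its magic number."""
    with open(pyc_path, "rb") as f:
        return f.read(4)


def compile_and_read_magic(source_text):
    """Compile source text to a temporary .pyc and return that file's magic."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "example.py")
        pyc = os.path.join(tmp, "example.pyc")
        with open(src, "w") as f:
            f.write(source_text)
        py_compile.compile(src, cfile=pyc, doraise=True)
        return read_magic(pyc)


magic = compile_and_read_magic("x = 1\n")
# The magic written to the file is the running interpreter's own.
print(magic == importlib.util.MAGIC_NUMBER)
# The first two bytes are a little-endian version-specific number.
print(int.from_bytes(magic[:2], "little"))
```

A decompiler has to map this value to a Python version before it can interpret the rest of the file, which is why a magic number missing from its table means the file cannot be handled.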
There are also customized Python interpreters, notably Dropbox's, which use their own magic and encrypt bytecode. With the exception of Dropbox's old Python 2.5 interpreter, this kind of thing is not handled.
We also don't handle PJOrion obfuscated code. For that try: PJOrion Deobfuscator to unscramble the bytecode to get valid bytecode before trying this tool. This program can't decompile Microsoft Windows EXE files created by Py2EXE, although we can probably decompile the code after you extract the bytecode properly. For situations like this, you might want to consider a decompilation service like Crazy Compilers. Handling pathologically long lists of expressions or statements is slow.
There is lots to do, so please dig in and help.


See Also
  • https://github.com/zrax/pycdc : purports to support all versions of Python. It is written in C++ and is most accurate for Python versions around 2.7 and 3.3, when the code was more actively developed. Accuracy for more recent versions of Python 3 and early versions of Python is especially lacking. See its issue tracker for details. Currently lightly maintained.
  • https://code.google.com/archive/p/unpyc3/ : supports Python 3.2 only. The above projects use a different decompiling technique than what is used here. Currently unmaintained.
  • https://github.com/figment/unpyc3/ : fork of above, but supports Python 3.3 only. Includes some fixes like supporting function annotations. Currently unmaintained.
  • https://github.com/wibiti/uncompyle2 : supports Python 2.7 only, but does that fairly well. There are situations where uncompyle6 results are incorrect while uncompyle2 results are not, but more often uncompyle6 is correct when uncompyle2 is not. Because uncompyle6 adheres to accuracy over idiomatic Python, uncompyle2 can produce more natural-looking code when it is correct. Currently uncompyle2 is lightly maintained. See its issue tracker for more details.
  • How to report a bug
  • The HISTORY file.
  • https://github.com/rocky/python-xdis : Cross Python version disassembler
  • https://github.com/rocky/python-xasm : Cross Python version assembler
  • https://github.com/rocky/python-uncompyle6/wiki : Wiki Documents which describe the code and aspects of it in more detail