Applepie - A Hypervisor For Fuzzing Built With WHVP And Bochs
Hello! Welcome to applepie! This is a tool designed for fuzzing, introspection, and finding bugs! This is a hypervisor using the Windows Hypervisor Platform API present in recent versions of Windows (specifically this was developed and tested on Windows 10 17763). Bochs is used for providing deep introspection and device emulation.
The Windows Hypervisor Platform API (WHVP) is an API set for accessing Hyper-V's hypervisor abilities. This API makes it easy for us to implement a virtual machine entirely in user-space without any special drivers or permissions needed.
Recent Feature Demo
Binary Coverage Example
What is this for?
This is a tool designed for fuzzing and introspection during security research. By using a hypervisor, common fuzzing techniques can be applied to any target, kernel or userland. This environment allows fuzzing of whole systems without a need for source of the target. At the hypervisor level code coverage can be gathered, and if needed Bochs emulation can be used to provide arbitrary introspection in an emulation environment. This coverage information can be used to figure out the effectiveness of the fuzz cases. A fuzz case that caused an increase in coverage can be saved as it was an interesting case. This input can be used later, built on by new corruptions.
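The coverage feedback idea can be sketched in a few lines of Python. This is only an illustration under assumptions: `mutate`, `fuzz`, and `toy_target` are hypothetical names, and `toy_target` stands in for a guest run that reports the set of code blocks it reached. It is not applepie's API.

```python
import random


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Corrupt one random byte of the input (hypothetical mutator)."""
    if not data:
        return bytes([rng.randrange(256)])
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(run_case, seeds, iterations, rng):
    """Save every input that reaches coverage not seen before."""
    corpus = list(seeds)
    seen = set()
    # Record what the seed inputs already cover.
    for seed in corpus:
        seen |= run_case(seed)
    for _ in range(iterations):
        # Build on a previously saved input with a new corruption.
        case = mutate(rng.choice(corpus), rng)
        coverage = run_case(case)
        if not coverage <= seen:
            # New coverage: this was an interesting case, keep it.
            seen |= coverage
            corpus.append(case)
    return corpus, seen


def toy_target(data: bytes) -> set:
    """Stand-in for a guest: returns the 'blocks' this input reached."""
    cov = {"entry"}
    if data[:1] == b"A":
        cov.add("saw_A")
        if data[1:2] == b"B":
            cov.add("saw_AB")
    return cov
```

Calling `fuzz(toy_target, [b"xx"], 20000, random.Random(0))` returns the saved corpus and the accumulated coverage set. Because inputs are saved only when they add coverage, the corpus stays small while later mutations start from inputs that already reach deeper code.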
Snapshot fuzzing is the primary use of this tool: you take a snapshot of a system in a certain state and save it off. This snapshot can then be loaded up for fuzzing, where a fuzz case is injected and the VM is resumed. Since the VM can be reset very cheaply, the VM can be reset often. If it takes Word 5 seconds to boot, but you can snapshot it right as it reads your file, you can cut the fuzz case down to only what is relevant to an input. This allows for a very tight loop of fuzzing without needing to have access to source. Since the VMs are entirely separate systems, many can be run in parallel to allow scaling to all cores.
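The snapshot, inject, resume, reset loop described above might look roughly like the following. `ToyVm` and `fuzz_from_snapshot` are hypothetical stand-ins written for this sketch (a 16-byte "guest" whose parser crashes on a `0xFF` byte); none of these names come from applepie.

```python
class ToyVm:
    """Hypothetical stand-in for a guest VM; not the applepie API."""

    def __init__(self):
        self.memory = bytearray(16)
        self.crashed = False

    def snapshot(self):
        # Capture the full state at the point of interest.
        return (bytes(self.memory), self.crashed)

    def restore(self, snap):
        self.memory[:] = snap[0]
        self.crashed = snap[1]

    def inject(self, offset, data):
        if offset < 0 or offset + len(data) > len(self.memory):
            raise ValueError("fuzz case does not fit in guest memory")
        self.memory[offset:offset + len(data)] = data

    def resume(self):
        # Pretend parser: crashes when the first input byte is 0xFF.
        if self.memory[4] == 0xFF:
            self.crashed = True
        # Side effect that the next reset has to undo.
        self.memory[0] += 1


def fuzz_from_snapshot(vm, snap, cases, input_offset=4):
    """Run each case from the same snapshot and collect crashing inputs."""
    crashes = []
    for case in cases:
        vm.restore(snap)                # cheap reset to the saved state
        vm.inject(input_offset, case)   # place the fuzz case in the guest
        vm.resume()                     # run only the relevant code
        if vm.crashed:
            crashes.append(case)
    vm.restore(snap)
    return crashes
```

Every case starts from the identical saved state, so a crash depends only on the injected input and not on what earlier cases did.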
Currently this tool only supports gathering code coverage, dynamic symbol downloading for Windows, and symbol/module parsing for Windows targets. Fuzzing support will be added quite soon.
Development cycle
Given I've written almost all the features here before (coverage, fuzzing, fast resets, etc), I expect this project should pretty quickly become ready for fuzzing, unless I get distracted :D
I'm aiming for end-of-January for coverage (done!), feedback, module listings (done!), process lists, fast resets, and symbol support (done!), which would make it a very capable fuzzer.
OS Support
The main supported target is modern Windows 10. Windows targets support downloading of symbols from the symbol store. This allows for symbolic coverage in Windows targets out of the box. However, the code is written in a way that Linux enlightenment can easily be added.
Without any enlightenment, any OS that boots can still be fuzzed and basic coverage can be gathered.
Before reporting OS support issues please validate that the issue is in the hypervisor/changes to Bochs by trying to boot your target using standard prebuilt Bochs with no hypervisor. Bochs is not commonly used and can often have breaking bugs for even common things like booting Linux, especially with the rapid internal changes to CPUID/MSR usages with Spectre/Meltdown mitigations going into OSes.
Issues
See the issues page on GitHub for a list of issues. I've seeded it with a few already. Some of these need to be addressed quickly before fuzzing development starts.
Building
Build Prereqs
To build this you need a few things:
- Recently updated MSVC compiler (Visual Studio 2017)
- Nightly Rust (https://rustup.rs/ , must be nightly)
- Python (I used 3 but 2 should work too)
- 64-bit Cygwin with autoconf and GNU make packages installed
- Hyper-V installed and a recent build of Windows 10
MSVC
Install Visual Studio 2017 and make sure it's updated. We're using some bleeding edge APIs, headers, and libraries here.
I was using `cl.exe` version `Microsoft (R) C/C++ Optimizing Compiler Version 19.16.27025.1 for x64` and SDK version `10.0.17763.0`.
Nightly Rust
Install Rust via https://rustup.rs/. I used `rustc 1.32.0-nightly (b3af09205 2018-12-04)`.
Make sure you install the `x86_64-pc-windows-msvc` toolchain as only 64-bit is supported for this project.
Make sure `cargo` is in your path. This should be the default.
Python
Go grab Python from https://www.python.org/ and make sure it's in your PATH such that `python` can be invoked.
Cygwin
Install 64-bit Cygwin (https://www.cygwin.com/setup-x86_64.exe) specifically to `C:\cygwin64`. When installing Cygwin make sure you install the `autoconf` and `make` packages.
Hyper-V
Go into "Turn Windows features on or off" and tick the checkbox next to "Hyper-V" and "Windows Hypervisor Platform". This of course requires that your computer supports Hyper-V.
Step-by-step build process
This install process guide was verified on the following:
- Clean install of Windows 10, Build 17763
- rustc 1.33.0-nightly (8e2063d02 2019-01-07)
- Microsoft (R) C/C++ Optimizing Compiler Version 19.16.27025.1 for x64
- Visual Studio Community 2017 version 15.9.4
- applepie commit `f84c084feb487e2e7f31f9052a4ab0addd2c4cf9`
- Python 3.7.2 x64
- git version 2.20.1.windows.1
- Make sure Windows 10 is fully up to date
  - We use some bleeding edge features with WHVP and only the latest Windows 10 is tested
- In "Turn Windows features on or off"
  - Tick "Hyper-V"
  - Tick "Windows Hypervisor Platform"
  - Click OK to install and reboot
- Install VS Community 2017 and update it
  - Desktop development with C++
- Install Rust nightly for x86_64-pc-windows-msvc
- Install Git
  - Configure git to checkout as-is, commit unix-style
    - If git converts on checkout the ./configure script will fail for Bochs due to CRLF line endings
    - This is core.autocrlf=input
  - You can also use checkout as-is, commit as-is
    - This is core.autocrlf=false
- Install Cygwin x64 via setup-x86_64.exe
  - Install to "C:\cygwin64"
  - Install autoconf (`autoconf` package)
  - Install GNU make (`make` package)
- Install Python
  - I installed Python 3 x64 and added it to PATH
  - Python 2 and 32-bit versions should be fine, we just use Python for our build script
- Open a "x64 Native Tools Command Prompt for VS 2017"
- Checkout applepie via `git clone https://github.com/gamozolabs/applepie`
- cd into applepie
- Run `python build.py`
  - This will first check for some basic system requirements
  - It will build the Rust bochservisor DLL
  - It will then configure Bochs via autoconf
  - It will then build Bochs with GNU make from Cygwin
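If the build fails early, it can help to confirm by hand that the tools from the steps above are reachable on PATH. The snippet below is a minimal sketch of such a check, not the contents of `build.py`; the tool list and the `missing_tools` helper are assumptions made for illustration.

```python
import shutil

# Tools named in the steps above; this list is an assumption of the
# sketch and may not match what build.py actually checks.
REQUIRED_TOOLS = ("cl", "cargo", "python", "git")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH")
```

Run it from the same "x64 Native Tools Command Prompt for VS 2017" used for the build, since `cl` is typically only on PATH inside that prompt.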
Actually Building
Just run `python build.py` from the root directory of this project. It should check for sanity of the environment and everything should "just work".
Cleaning
Run `python build.py clean` to clean Bochs and Rust binaries.
Run `python build.py deepclean` to completely remove all Bochs and Rust binaries; it also removes all the configuration for Bochs. Use this if you reconfigure Bochs in some way.
Usage
Read up on Bochs configuration to figure out how to set up your environment. We have a few requirements, like `sync=none`, `ips=1000000`, and currently single processor support only. These are enforced inside of the code itself to make sure you don't shoot yourself in the foot.
Use the included `bochservisor_test\bochsrc.bxrc` and `bochservisor_test_real\bochsrc.bxrc` configurations as examples. `bochservisor_test_real` is likely the most up to date config and the one you should look at as reference.
Coverage
Windows targets have module list enlightenment, which allows us to see the listings for all the modules in the context we are running in. With this we can convert the instruction addresses to module + offset. This module + offset helps keep coverage information between fuzz cases where ASLR state changes. It also allows for the module to be colored in a tool like IDA to visually see what code has been hit.
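As a rough illustration of the address to module + offset conversion, the sketch below resolves an instruction address against a sorted module list. The `ModuleList` class and the base addresses and sizes used with it are made up for this example; they are not taken from the applepie sources.

```python
import bisect


class ModuleList:
    """Resolve addresses to (module, offset) using a sorted module list."""

    def __init__(self, modules):
        # modules: iterable of (name, base, size) tuples
        self._mods = sorted(modules, key=lambda m: m[1])
        self._bases = [m[1] for m in self._mods]

    def resolve(self, addr):
        # Find the last module whose base is <= addr.
        i = bisect.bisect_right(self._bases, addr) - 1
        if i >= 0:
            name, base, size = self._mods[i]
            if addr < base + size:
                return (name, addr - base)
        # Not inside any known module: keep the raw address.
        return (None, addr)


# Two boots of the same module at different (made-up) ASLR bases.
boot1 = ModuleList([("ntoskrnl.exe", 0xFFFFF80000000000, 0x800000)])
boot2 = ModuleList([("ntoskrnl.exe", 0xFFFFF80300000000, 0x800000)])
```

Here `boot1.resolve(0xFFFFF80000001234)` and `boot2.resolve(0xFFFFF80300001234)` both give `("ntoskrnl.exe", 0x1234)`, which is why module + offset keys stay comparable across fuzz cases even when the load addresses move.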
For Windows targets, symbols will be dynamically downloaded from the symbol store using your `_NT_SYMBOL_PATH` and `symchk`. Without `symchk` in the path it will silently fail. With symbols a nice human-readable version of coverage can be saved for viewing. Further, with private symbols the coverage can be converted to source:line such that source code can be colored.
Tests
Okay, there aren't really tests, but there's `bochservisor_test` which is a tiny OS that just verifies that everything boots with the hypervisor.
Then there's `bochservisor_test_real` which is a configuration I use for things like Windows/Linux. This is the one that will probably get updated most frequently.
Architecture
Basics
This codebase introduces a small amount of code to Bochs to allow modular access to CPU context, the backing memory for guest physical addresses, and stepping of both device and CPU state.
The main code you want to look at is in `lib.rs` in the `bochservisor` Rust project.
CPU Loop
In the main CPU loop of Bochs we instead call `LoadLibrary()` to load the `bochservisor` DLL. This DLL exports one routine which is the Rust CPU loop which will be invoked.
Bochs will pass a structure to this `bochs_cpu_loop` routine which will contain function pointers to get information from Bochs and to step the device and CPU state in it.
MMIO / I/O
When MMIO or I/O occurs, the hypervisor will exit with a memory fault or an I/O instruction fault. While WHVP does provide an emulation API it's really lacking and not sufficient.
Rather, we use Bochs, which is already there, and step through a few instructions. By keeping the hypervisor CPU state in sync with Bochs we can dynamically switch between hypervisor and emulation at any time (or at least we should be able to).
This means that the full hypervisor state is always in sync with Bochs and thus things like Bochs snapshots should work as normal and could be booted without the hypervisor (except maybe some CPUID state which needs to be stored in the snapshot info).
When MMIO or I/O occurs we run a certain number of instructions under emulation rather than just emulating one. Due to the API costs of entering and exiting the hypervisor, and the likelihood that similar MMIO operations occur next to others, we step a few instructions. This allows us to reduce the overhead of the API and reduces the VMEXIT frequency. This is a tunable number but what is in the codebase is likely there for a reason.
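To see why stepping several instructions per exit helps, consider a toy model in which a guest instruction stream is a list of booleans, `True` where the instruction touches MMIO or I/O. The `count_exits` function below is a hypothetical model written for this illustration; it does not reflect how the codebase measures or tunes the step count.

```python
def count_exits(trace, emulate_steps):
    """Count hypervisor exits for a trace of instructions.

    trace: list of bools, True where the instruction touches MMIO or I/O.
    emulate_steps: instructions run under emulation after each exit.
    """
    if emulate_steps < 1:
        raise ValueError("emulate_steps must be at least 1")
    exits = 0
    pc = 0
    while pc < len(trace):
        if trace[pc]:
            # The hypervisor exits here; emulation handles this
            # instruction and the ones that follow it.
            exits += 1
            pc += emulate_steps
        else:
            # Runs natively inside the hypervisor.
            pc += 1
    return exits


# Three adjacent device accesses, then one more a little later.
trace = [False, True, True, True, False, False, True, False]
```

With this trace, `count_exits(trace, 1)` is 4 while `count_exits(trace, 4)` is 2, because the adjacent accesses are absorbed by a single trip into emulation. The trade-off is that a larger step count also emulates instructions that could have run at native speed.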
Interrupts
We handle interrupts in a really interesting way. Rather than scheduling interrupts to be delivered to the hypervisor we handle all interrupts in Bochs emulation itself. Things like exceptions that happen entirely inside of the hypervisor are of course not handled by Bochs.
This also gives us features that WHVP doesn't support, like SMIs (for SMM). Bochs's BIOS uses SMM by default and without SMI support a custom BIOS needs to be built. I did this in my first iteration of this... do not recommend.
Future
This project is designed for fuzzing, however it's so new (only a few days old) that it has none of these features.
Some of the first things to come will be:
Evaluate threading
We could potentially have Bochs device stuff running in one thread in a loop in real-time, and another thread running the hypervisor. Async events would be communicated via IPC and would allow for the devices to be updated while execution is in the guest.
Currently everything happens in one thread which means the hypervisor must exit on an interval to make sure we can step devices. It's as if we wrote our own scheduler.
This might be a bit faster, but it also increases complexity and adds the potential for race issues. It's hard to say if this will ever happen.
Code coverage
I'm not sure which method I'll use to gather code coverage, but there will be at least a few options, spanning from accurate to fast. All these coverage mechanisms will be system level and will not require source or symbols of targets.
Guest enlightenment
Parsing of OS structures to get primitive information such as process listings, module lists, etc. This would then be used to query PDBs to get symbol information.
Crash reporting
Reporting crashes in some meaningful way. Ideally minidumps would be nice as they could be loaded up and processed in WinDbg. This might be fairly easy as DMPs are just physical memory and processor context, which we already have.
Crash deduping / root causing
I've got some fun techniques for root causing bugs which have been historically successful. I plan to bring those here.
Fast resets
By tracking dirty pages and restoring only modified things we should be able to reset VMs very quickly. This gives us the ability to fuzz at maximum speeds on all cores of a system target. This is similar to what I did in falkervisor so it's already thought out and designed. It just needs to be ported here.
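The dirty-page idea can be sketched as follows. This is a simplified model, assuming 4 KiB pages and a `write` hook that stands in for whatever dirty tracking the hypervisor provides; `DirtyTrackedMemory` is a hypothetical name and not falkervisor's or applepie's implementation.

```python
PAGE_SIZE = 4096


class DirtyTrackedMemory:
    """Guest memory that remembers which pages changed since the snapshot."""

    def __init__(self, snapshot: bytes):
        if len(snapshot) % PAGE_SIZE != 0:
            raise ValueError("snapshot must be a whole number of pages")
        self._snapshot = bytes(snapshot)
        self.memory = bytearray(snapshot)
        self.dirty = set()

    def write(self, addr, data):
        if addr < 0 or addr + len(data) > len(self.memory):
            raise IndexError("write outside of guest memory")
        if not data:
            return
        self.memory[addr:addr + len(data)] = data
        # Mark every page the write touched as dirty.
        first = addr // PAGE_SIZE
        last = (addr + len(data) - 1) // PAGE_SIZE
        self.dirty.update(range(first, last + 1))

    def reset(self):
        """Restore only the dirty pages; returns how many were restored."""
        restored = len(self.dirty)
        for page in self.dirty:
            start = page * PAGE_SIZE
            end = start + PAGE_SIZE
            self.memory[start:end] = self._snapshot[start:end]
        self.dirty.clear()
        return restored
```

The cost of `reset` scales with the number of pages a fuzz case touched rather than with the size of guest memory, which is what makes very frequent resets affordable.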
falkervisor mode
Extremely fast fuzzing that cancels execution when MMIO or I/O occurs. This allows all the CPU time to be spent in the hypervisor and none in emulation. This has a downside of not supporting things like disk I/O during a fuzz case, but it's nice.
Philosophy
One of the core concepts of this project is absolute minimum modifications to Bochs. This allows us to keep the Bochs portion of this repo up to date.
The goal is to also move as much code into Rust and DLLs as possible to make the system much more modular and safe. This will hopefully reduce the chances of making silly corruption bugs in the hypervisor itself, causing invalid fuzz results.
Currently the hypervisor is a DLL and can be swapped out without changes to Bochs (unless the FFI API changes).
Further changes to Bochs itself must be documented clearly, and I'll be making a document for that soon to track the changes to Bochs which must be ported and re-evaluated with Bochs updates.