Extending RegRipper (aka, "Forensic Scanner")

I'll be presenting on "Extending RegRipper" at Brian Carrier's Open Source Digital Forensics Conference on 14 June, along with Cory Altheide, and I wanted to provide a bit of background with regard to what my presentation will cover...

In '98-'99, I was working for Trident Data Systems, Inc. (TDS), conducting vulnerability assessments for organizations.  One of the things we did as part of this work was run ISS’s Internet Scanner (now owned by IBM) against the infrastructure; either a full, broad-brush scan or just very specific segments, depending upon the needs and wants of the organization.  I became very interested in how the scanner worked, and began to note differences in how the scanner would report its findings based on the level of access we had to the systems within the infrastructure.  Something else I noticed was that many of the checks that were run were the result of work by the ISS X-Force vulnerability discovery team.  In short, a couple of very smart folks would discover a vulnerability and add a means of scanning for it via the Internet Scanner framework.  In fairly short order, that check would be rolled out to thousands of customers and their analysts, none of whom had any prior knowledge of the vulnerability or had to invest the time to investigate it.  This became even clearer as I started to create an open-source (albeit proprietary) scanner to replace the use of Internet Scanner, due in large part to significant issues with inaccurate checks, and the need to adapt the output.  I could create a check and give it to an analyst going on-site; they wouldn't need any prior knowledge of the issue, nor would they have to invest time in discovery and analysis, but they could run the check and easily review and understand the results.

Other aspects of information security also benefit from the use of scanners.  Penetration testing and web application assessments benefit from scanners that include frameworks for providing new and updated checks to be run, and many of the analysts running the scanners have no prior knowledge of the checks that are being run.  Nessus (from Tenable) is a very good example of this sort of scanner; the plugins run by the scanner are text-based, providing instructions for the scanner.  These plugins are easy to open and read, and provide a great deal of information regarding how the checks are constructed and run.


Given all of the benefits derived from scanners in other disciplines within information security, it just stands to reason that digital forensic analysis would also benefit from a similar framework.

The forensic scanner is not intended to replace the analyst; rather, it is intended as a framework for documenting and retaining the institutional knowledge of all analysts on the team, and for removing the tedium of looking for the "low-hanging fruit" that likely exists in most, if not all, exams.

A number of commercially available forensic analysis applications (EnCase, ProDiscover) have scripting languages and scanner-like functionality; however, in most cases, this functionality is based on proprietary APIs and, in some cases, proprietary scripting languages (ProDiscover uses Perl as its scripting language, but the API for accessing the data is unique to the application).

A scanner framework is not meant to replace the use of commercial forensic analysis applications; rather, the scanner framework would augment and enhance the use of those applications, by providing an easy and efficient means for educating new analysts, as well as "sweeping up" the "low-hanging fruit", leaving the deeper analysis for the more experienced analysts.

This scanner framework would be based on easily available tools and techniques.  For example, the scanner would be designed to access acquired images mounted read-only via the operating system (Linux mount command) or via freely available applications (Windows - FTK Imager v3.0, ImDisk, vhd/vmdk, etc.); that way, the scanner can make use of currently available APIs (via Perl, Python, etc.) in order to access data within the acquired image, and do so in a "forensically sound manner" (i.e., not making any changes to the original data).
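To make the idea concrete, here is a minimal sketch of what such a framework might look like in Python.  Everything in it is an assumption made for illustration, not the actual scanner: the PLUGINS registry, the plugin decorator, the scan function, and the two example checks (listing Prefetch files and executables in Windows\Temp) are all hypothetical.  The sketch only reads from the mounted file system; keeping the image unmodified still depends on it being mounted read-only.

```python
import os
from typing import Callable, Dict, List

# Hypothetical registry: plugin name -> function that takes the root of a
# mounted image and returns a list of findings (plain strings here).
PLUGINS: Dict[str, Callable[[str], List[str]]] = {}


def plugin(name: str):
    """Decorator that registers a check under the given name."""
    def register(func):
        PLUGINS[name] = func
        return func
    return register


def list_matching(root: str, parts: List[str], suffix: str) -> List[str]:
    """List file names ending in `suffix` in root/parts..., or [] if absent."""
    target = os.path.join(root, *parts)
    if not os.path.isdir(target):
        return []
    return sorted(n for n in os.listdir(target) if n.lower().endswith(suffix))


@plugin("prefetch_listing")
def prefetch_listing(root: str) -> List[str]:
    """Illustrative check: Prefetch (.pf) files present in the image."""
    return list_matching(root, ["Windows", "Prefetch"], ".pf")


@plugin("temp_executables")
def temp_executables(root: str) -> List[str]:
    """Illustrative check: executables sitting in Windows\\Temp."""
    return list_matching(root, ["Windows", "Temp"], ".exe")


def scan(root: str) -> Dict[str, List[str]]:
    """Run every registered plugin against the image mounted at `root`."""
    report = {}
    for name, func in sorted(PLUGINS.items()):
        try:
            report[name] = func(root)
        except OSError as exc:
            # One failing check should not stop the rest of the scan.
            report[name] = ["error: %s" % exc]
    return report
```

With an image mounted at, say, /mnt/image (a made-up mount point), an analyst would call scan("/mnt/image") and review the returned report; adding a new check is a matter of writing one more decorated function.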

The scanner is not intended to run in isolation; rather, it is intended to be used with other tools (here, here) as part of an overall process.  The purpose of the scanner is to provide a means for retention, efficient deployment, and proliferation of institutional digital forensic knowledge.

Benefits
Some benefits of a forensic scanner framework such as this include, but are not limited to, the following:

1.  Knowledge Retention - None of us knows everything, and we all see new things during examinations.  When an analyst sees or discovers something new, a plugin can be written or updated.  Once this is done, that knowledge exists, regardless of the state of the analyst (she goes on vacation, leaves for another position, etc.).  Enforcing best practice documentation of the plugin ensures that as much knowledge as possible is retained along with the application, providing an excellent educational tool, as well as a ready means for adapting or improving the plugin.

2.  Career Progression - When new folks are brought aboard a team, they have to start somewhere.  In most cases, particularly with consulting organizations, skilled and experienced analysts are hired, but as the industry develops, this won't always be the case.  The forensic scanner provides an ancillary framework for developing "home grown" expertise where inexperienced analysts are hired.  Starting the new analysts off in a lab environment and having them begin learning the necessary procedures by acquiring and verifying media puts them in an excellent position to run the scanner.  For example, the analyst either goes on-site and conducts acquisition, or acquires media sent to the lab, and prepares the necessary documentation.  Then, they mount the acquired image and run the scanner, providing the more experienced analyst with the path to the acquired image and the report.

This framework also provides an objective means for personnel assessment; managers can easily track the plugins that are improved or developed by various analysts.

3.  Teamwork - In many environments, development of plugins likely will not occur in a vacuum or in isolation.  Plugins need to be reviewed, and can be improved based on the experience of other analysts.  For example, let's say an analyst runs across a Zeus infection and decides to write a plugin for the artifacts.  When the plugin is reviewed, another analyst mentions that Zeus will load differently based on the permissions of the user upon infection.  The plugin can then be documented and modified to include additional conditions.

New plugins can be introduced and discussed during team meetings or through virtual conferences and collaboration, but regardless of the method, it introduces a very important aspect of forensic analysis...peer review.

4.  Ease of modification - One size does not fit all.  There are times when analysts will not be working with full images, but instead will only have access to selected files from systems.  A properly constructed framework will provide the means necessary for accessing and scanning these limited data sets as well.  Also, the scanner's reporting can be modified according to the needs of the analyst or organization.

5.  Flexibility - A scanner framework is not limited to just acquired images.  For example, F-Response provides a means of access to live, remote systems in a manner that is similar to an acquired image (i.e., much of the same API can be used, as with RegRipper), so the framework used to access images can also be used against systems accessed via F-Response.  As the images themselves would be mounted read-only in order to be scanned, Volume Shadow Copies could also be mounted and scanned using the same scanner and same plugins.

Another means of flexibility comes about through the use of "idle" resources.  What I mean by that is that many times, analysts working on-site or actively engaged in analysis may be extremely busy, so running the scanner and providing the output to another, off-site analyst who is not actively engaged frees up the on-site team and provides answers/solutions in a timely and efficient manner.  Or, data can be provided and the off-site analyst can write a plugin based on that data, and that plugin can be run against all other systems/images.  In these instances, entire images do not have to be sent to the off-site analyst, as this takes considerable time and can expose sensitive data.  Instead, only very specific data is sent, making for a much smaller data set (KB as opposed to GB).
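
As a rough illustration of the "ease of modification" and "flexibility" points above, the following Python sketch runs one check against several kinds of sources (a full mounted image, a mounted Volume Shadow Copy, or just a folder of selected files) and renders the results in more than one format.  The function names, the candidate paths, and the choice of the SOFTWARE hive as the artifact are all assumptions made for the example, not part of RegRipper or any existing scanner.

```python
import csv
import io
import os

FIELDS = ["source", "artifact", "status", "size"]


def find_artifact(root, candidates):
    """Return the first candidate (relative, '/'-separated) found under root."""
    for rel in candidates:
        path = os.path.join(root, *rel.split("/"))
        if os.path.isfile(path):
            return path
    return None


def check_software_hive(root):
    """Hypothetical check: is a SOFTWARE hive available, and how big is it?

    The first candidate matches a full image; the second matches a folder
    holding only selected files pulled from a system.
    """
    path = find_artifact(root, ["Windows/System32/config/SOFTWARE", "SOFTWARE"])
    row = {"source": root, "artifact": "SOFTWARE hive",
           "status": "not found", "size": 0}
    if path is not None:
        row["status"] = "found"
        row["size"] = os.path.getsize(path)
    return row


def scan_sources(check, roots):
    """Run the same check against every source, whatever kind it is."""
    return [check(root) for root in roots]


def render(rows, style="text"):
    """Render results as plain text (default) or CSV, per the analyst's needs."""
    if style == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        return buf.getvalue()
    return "\n".join(
        "%(source)s: %(artifact)s %(status)s (%(size)d bytes)" % r for r in rows
    )
```

The check itself never needs to know whether it was handed an image, a shadow copy, or a few KB of files sent to an off-site analyst; only the list of roots passed to scan_sources changes.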