TimeLine Analysis
Note to the reader, and to self: This topic is more than likely going to be spread out over several posts, as the information develops...
As I've been working on the second edition of Windows Forensic Analysis, as well as working my own engagements and assisting with others, I've been doing some serious thinking about timeline analysis. Well, to be honest, I'm not so much bringing this up myself as sort of adding to what Michael Cloppert posted on the SANS Forensics Blog. Michael wrote the ex-tip tool and a really great paper about it (basic concept, design, etc.), and I've been doing some thinking about this very same subject over the past couple of weeks/months, particularly after exchanging some emails with Michael and Brian Carrier. Apparently, others have been thinking about this subject, as well, including our friend HogFly (see his excellent Footprints in the Snow post).
A lot of my thinking along these lines started with Brian Carrier's TSK tools (also available for Windows) and the fls tool he wrote, which creates a body file from the file system in an image. I had thought about writing a tool, or adding the capability to RegRipper, to report data in a format that could be easily appended to a body file and then included in the parsing process used by mactime or ex-tip. Mike included this sort of capability as a plugin to ex-tip, but it simply retrieves the LastWrite times of all of the keys and doesn't provide the level of context or data reduction available through something like RegRipper (or rip.exe). Like Mike, I also started looking at other sources of data, including parsing Event Logs with evt2xls and incorporating that information directly into timeline analysis.
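To make the "append to a body file" idea a bit more concrete, here is a rough sketch in Python. It assumes the 11-field, pipe-delimited body file layout used by TSK 3.x (MD5|name|inode|mode_as_string|UID|GID|size|atime|mtime|ctime|crtime). The function name, the "[Registry]" prefix, and the choice to put the key's LastWrite time in the mtime field are all my own assumptions for illustration; this is not how ex-tip or RegRipper actually do it, and whether mactime accepts such a line as-is is something to verify.

```python
def regkey_to_bodyfile_line(key_path: str, lastwrite_epoch: int) -> str:
    """Build one body-file-style line for a Registry key (hypothetical helper).

    Assumes the TSK 3.x layout:
    MD5|name|inode|mode_as_string|UID|GID|size|atime|mtime|ctime|crtime
    """
    fields = [
        "0",                        # MD5: not applicable to a Registry key
        f"[Registry] {key_path}",   # name: prefix is an assumed convention
        "0",                        # inode: not applicable
        "",                         # mode_as_string: left empty
        "0",                        # UID
        "0",                        # GID
        "0",                        # size
        "0",                        # atime: unknown
        str(lastwrite_epoch),       # mtime: the key's LastWrite time
        "0",                        # ctime: unknown
        "0",                        # crtime: unknown
    ]
    return "|".join(fields)


if __name__ == "__main__":
    # Example key path and time are made up for illustration.
    print(regkey_to_bodyfile_line(r"HKLM\Software\Example\Key", 1230768000))
```

A key only has one timestamp, which is exactly why a format built around four file system times is an awkward fit for other data sources.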
However, it occurs to me that a couple of modifications to the current processes for collecting and parsing the timeline data can lead to improvements in the overall process, including presentation and reporting of the collected data, taking it beyond mactime's text-based output.
For example, what defines an event in a timeline? At the least, if you're only looking at data from a single system, you need a point in time (say, normalized to the Unix epoch as a means of standardization) and a description of the event. If you move to incorporating additional sources (such as network traffic captures) as well as data from additional systems, you need to include source (e.g., Registry, Event Log, AV logs) and host (system NetBIOS or DNS name, IP address, MAC address, etc.) information, and possibly even user (username, email address, SID, etc.) information.
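Normalizing to the Unix epoch is mostly arithmetic. Many Windows timestamps are 64-bit FILETIME values, which count 100-nanosecond intervals since January 1, 1601 (UTC), so converting means dividing down to seconds and subtracting the 11,644,473,600 seconds between 1601 and 1970. A minimal sketch in Python follows; the function names are mine, and sub-second precision is simply discarded.

```python
# Seconds between 1601-01-01 and 1970-01-01 (both UTC).
EPOCH_DELTA_SECONDS = 11644473600
# FILETIME counts 100-nanosecond ticks, so 10 million ticks per second.
FILETIME_TICKS_PER_SECOND = 10_000_000


def filetime_from_parts(low: int, high: int) -> int:
    """Combine the low and high 32-bit DWORDs into one 64-bit FILETIME."""
    return (high << 32) | low


def filetime_to_epoch(filetime: int) -> int:
    """Convert a 64-bit FILETIME to whole seconds since the Unix epoch.

    Values earlier than 1970 come out negative; fractions of a second
    are dropped.
    """
    return filetime // FILETIME_TICKS_PER_SECOND - EPOCH_DELTA_SECONDS


if __name__ == "__main__":
    ft = 128752416000000000  # FILETIME for 2009-01-01 00:00:00 UTC
    print(filetime_to_epoch(ft))
```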
Taking this a step further, for the purposes of data reduction, you might want to define a span event, as opposed to a point in time...rather than listing a tremendous number of file access events due to an AV scan, why not simply define the AV scan as a span event, and remove the various points? Or, parsing the Event Logs, you can define a span as the time that a system was running or that a user was logged in.
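As a rough illustration of that kind of data reduction, the Python sketch below folds a burst of point events into spans using nothing more than the gap between consecutive times. This is only a heuristic I'm assuming for the example (the max_gap threshold and the function name are made up); in practice you'd more likely define the span from the AV log or the Event Log records themselves.

```python
from typing import List, Tuple


def collapse_to_spans(times: List[int], max_gap: int = 2) -> List[Tuple[int, int]]:
    """Group epoch times into (start, end) spans.

    Consecutive times no more than max_gap seconds apart are merged into
    the same span. An isolated time becomes a span whose start equals
    its end.
    """
    spans: List[Tuple[int, int]] = []
    for t in sorted(times):
        if spans and t - spans[-1][1] <= max_gap:
            # Close enough to the previous event: extend the current span.
            spans[-1] = (spans[-1][0], t)
        else:
            # Gap too large (or first event): start a new span.
            spans.append((t, t))
    return spans


if __name__ == "__main__":
    access_times = [100, 101, 102, 103, 500, 900, 901]
    print(collapse_to_spans(access_times))
```

Seven point events become three entries here; with the thousands of file accesses an AV scan generates, the reduction is far more dramatic.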
So at this point, an event structure might look something like this:
Type - Point or span; can be represented as a binary (1 or 0) value
Time - MS systems use 64-bit FILETIME objects in many cases; however, for the purposes of normalization, 32-bit Unix epoch times will work just fine
Source - fixed-length field for the source of the data (e.g., file system, Registry, EVT/EVTX file, AV or application log file) - (may require a key or legend)
Host - The host system, defined by IP or MAC address, NetBIOS or DNS name, etc. (may require a key or legend)
User - User, defined by user name, SID, email address, IM screenname, etc. (may require a key or legend)
Description - The description of what happened; this is where context comes in...
I think you can see how this opens things up a bit to allow for other sources of data. Not all of the fields in the structure need be present; again, a time and a description are enough to define an "event" for a timeline.
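Here is one way the structure might be sketched in Python, purely as an illustration. The class name, field names, and the pipe-delimited output are my own assumptions and not a defined format. Rather than storing the type separately, this sketch derives it from whether an end time is present, and it writes 1 for a span and 0 for a point, which is an arbitrary choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimelineEvent:
    """One timeline event; only time and description are required."""
    time: int                       # Unix epoch seconds (start time for a span)
    description: str                # what happened; this is where context goes
    end_time: Optional[int] = None  # set only for span events
    source: Optional[str] = None    # e.g. "REG", "EVT", "FILE", "AV"
    host: Optional[str] = None      # NetBIOS/DNS name, IP or MAC address
    user: Optional[str] = None      # user name, SID, email address, etc.

    @property
    def is_span(self) -> bool:
        """A span is any event that carries an end time."""
        return self.end_time is not None

    def to_line(self) -> str:
        """Serialize as type|time|end|source|host|user|description.

        Missing fields are written as empty strings. The description goes
        last so that a stray '|' inside it is easier to cope with.
        """
        fields = [
            "1" if self.is_span else "0",
            str(self.time),
            "" if self.end_time is None else str(self.end_time),
            self.source or "",
            self.host or "",
            self.user or "",
            self.description,
        ]
        return "|".join(fields)


if __name__ == "__main__":
    # Host and user names below are invented for the example.
    point = TimelineEvent(time=1230768000, description="User logged in")
    scan = TimelineEvent(time=1230768100, end_time=1230769000,
                         description="AV scan", source="AV", host="WKSTN01")
    print(point.to_line())
    print(scan.to_line())
```

Keeping every field except time and description optional means a file system entry, a Registry key, and a packet capture record can all sit in the same list and be sorted on one value.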
What about representation and reporting of the data? Text-based output or even a spreadsheet might be fine for some data, but something graphical may be more appropriate when working with larger data sets. Tools such as Zeitline, EasyTimeline, and Simile Timeline (Jerome has some additional information on manually adding events to Simile) are available, each with its own strengths and weaknesses. I've found that for both analysis and presentation to an end user (e.g., a customer), a graphical approach can be very useful.