Timeline Analysis, and Program Execution
I mentioned previously that I've been preparing for an upcoming Timeline Analysis course offered through my employer. As part of that preparation, I've been using the tools to walk through the course materials, and in particular one of the hands-on exercises that we will be doing in the course.
One of the things I'd mentioned in my previous post is that Rob Lee has done a great deal of work for SANS, particularly in providing an Excel macro to add color-coding of different events to log2timeline output files. I've had a number of conversations and exchanges with Corey Harrell and others (but mostly Corey) regarding event categorization, and the value of adding these categories to a timeline in order to facilitate analysis. This can be particularly useful when working with Windows Event Log data, as there are a good number of events recorded by default, and all of that information can be confusing if you don't have a quick visual reference.
As I was running through the exercises, I noticed something very interesting in the timeline with respect to the use of the Autoruns tool from SysInternals; specifically, that there were a good number of artifacts associated with both the download and use of the tool. I wanted to extract just those artifacts directly associated with Autoruns from the timeline events file, in order to demonstrate how a timeline can illustrate indications of program execution. To do so, I ran the following command:
type events.txt | find "autoruns" /i > autoruns_events.txt
...and then to get my timeline...
parse -f autoruns_events.txt > autoruns_tln.txt
...and got the following:
Tue May 29 12:56:02 2012 Z
FILE - ..C. [195166] C:/Windows/Prefetch/AUTORUNS.EXE-1CF578DD.pf
FILE - ..C. [44056] C:/Windows/Prefetch/AUTORUNSC.EXE-C5802224.pf
Tue May 15 21:14:55 2012 Z
REG johns-pc john - M... HKCU/Software/Sysinternals/AutoRuns
REG johns-pc john - [Program Execution] Software\SysInternals\AutoRuns (EulaAccepted)
Tue May 15 21:14:07 2012 Z
FILE - MA.B [195166] C:/Windows/Prefetch/AUTORUNS.EXE-1CF578DD.pf
Tue May 15 21:13:57 2012 Z
PREF johns-PC - [Program Execution] AUTORUNS.EXE-1CF578DD.pf last run (1)
REG johns-pc john - [Program Execution] UserAssist - C:\tools\autoruns.exe (1)
Tue May 15 21:13:53 2012 Z
FILE - M.C. [640632] C:/tools/autoruns.exe
FILE - M.C. [26] C:/tools/autoruns.exe:Zone.Identifier
REG johns-pc - M... [Program Execution] AppCompatCache - C:\tools\autoruns.exe
Tue May 15 21:13:42 2012 Z
FILE - MAC. [877] C:/Users/john/AppData/Roaming/Microsoft/Windows/Recent/Autoruns.lnk
JumpList johns-pc john - C:\Users\john\Downloads\Autoruns.zip
Tue May 15 21:13:32 2012 Z
FILE - MA.B [44056] C:/Windows/Prefetch/AUTORUNSC.EXE-C5802224.pf
Tue May 15 21:13:28 2012 Z
PREF johns-PC - [Program Execution] AUTORUNSC.EXE-C5802224.pf last run (1)
REG johns-pc john - [Program Execution] UserAssist - C:\tools\autorunsc.exe (1)
Tue May 15 21:13:23 2012 Z
FILE - M.C. [49648] C:/tools/autoruns.chm
FILE - M.C. [26] C:/tools/autoruns.chm:Zone.Identifier
FILE - M.C. [559736] C:/tools/autorunsc.exe
FILE - M.C. [26] C:/tools/autorunsc.exe:Zone.Identifier
REG johns-pc - M... [Program Execution] AppCompatCache - C:\tools\autorunsc.exe
Tue May 15 21:12:10 2012 Z
FILE - ...B [877] C:/Users/john/AppData/Roaming/Microsoft/Windows/Recent/Autoruns.lnk
FILE - ..C. [535772] C:/Users/john/Downloads/Autoruns.zip
FILE - ..C. [26] C:/Users/john/Downloads/Autoruns.zip:Zone.Identifier
Tue May 15 21:11:59 2012 Z
FILE - MA.B [535772] C:/Users/john/Downloads/Autoruns.zip
FILE - MA.B [26] C:/Users/john/Downloads/Autoruns.zip:Zone.Identifier
Wed May 9 15:08:16 2012 Z
FILE - .A.B [640632] C:/tools/autoruns.exe
FILE - .A.B [26] C:/tools/autoruns.exe:Zone.Identifier
FILE - .A.B [559736] C:/tools/autorunsc.exe
FILE - .A.B [26] C:/tools/autorunsc.exe:Zone.Identifier
Sat Nov 5 17:52:32 2011 Z
FILE - .A.B [49648] C:/tools/autoruns.chm
FILE - .A.B [26] C:/tools/autoruns.chm:Zone.Identifier
What I find most interesting about this timeline excerpt is that it illustrates a good deal of interaction with respect to the download and launch of the tool within its ecosystem, clearly demonstrating Locard's Exchange Principle. Now, there are also a number of things that you don't see...for example, this timeline is composed solely of those lines that included the word "autoruns" (irrespective of case) somewhere in the line; as such, we won't see things such as the query to the "Image File Execution Options" key, to determine if there's been a debugger assigned to the tool, nor will we see ancillary events or those that might be encoded. However, what we do see will clearly allow us to "zoom in" on a specific time window within the overall timeline, and see what other events may be listed there.
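For those who'd rather script it, here is a rough Python sketch of the case-insensitive filter, followed by the "zoom in" step. It assumes the events file uses a five-field, pipe-delimited layout (epoch time|source|system|user|description), which isn't spelled out in this post; the function names and the sample events are made up for illustration.

```python
def filter_events(lines, keyword):
    """Keep lines that contain keyword, ignoring case (like find /i)."""
    needle = keyword.lower()
    return [ln for ln in lines if needle in ln.lower()]


def events_in_window(lines, center, seconds):
    """Return every event within +/- seconds of center (epoch time).

    Assumes each line starts with an epoch timestamp followed by '|'.
    """
    hits = []
    for ln in lines:
        stamp = ln.split("|", 1)[0]
        if stamp.isdigit() and abs(int(stamp) - center) <= seconds:
            hits.append(ln)
    return hits


# Hypothetical sample events, not taken from the exercise image.
sample = [
    "1337116437|REG|johns-pc|john|UserAssist - C:\\tools\\autoruns.exe (1)",
    "1337116440|EVT|johns-pc|-|Some unrelated event in the same window",
    "1337200000|FILE|johns-pc|-|C:/Windows/notepad.exe",
]

matches = filter_events(sample, "AUTORUNS")         # keyword filter
nearby = events_in_window(sample, 1337116437, 60)   # one minute either side
```

The second function is the point of the exercise: the keyword filter finds the anchor events, and the window query pulls back everything else that happened around them, including lines that never mention the tool by name.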
The timeline is clearly very illustrative. We can see the download of the tool (in this case, via Chrome to a Windows 7 platform), and the assignment of the ":Zone.Identifier" ADSs, something that with XP SP2 was done only via IE and Outlook. Beyond the file system metadata, we start to see even more context, simply by adding additional data sources such as the Registry AppCompatCache value data, UserAssist value data, information derived from the SysInternals key in the user's Registry hive, Jump Lists, etc. In this case, the Jump List info in the timeline was extracted from the DestList stream found in the Jump List for the Windows Explorer shell, as zipped archives will often be treated as if they were folders.
Another valuable aspect of this sort of timeline data is that it is very useful in the face of the use of counter-forensics techniques, even those that may be unintentional (i.e., performed by an administrator, not to hide data, but to "clean up" the system). Let's say that this tool had been run, and then deleted; remove all of the "FILE" entries that point to C:/tools from the above timeline, and what do you have left? You have those artifacts that persist beyond the deletion of files and programs, and provide clear indicators that the tools had been used. We can apply this same sort of analysis to other situations where tools had been run (programs executed) on a system, and then some steps taken to obviate or hide the data.
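As a quick illustration of that thought experiment, the following Python sketch drops the "FILE" entries under C:/tools from a handful of lines copied from the timeline above and keeps whatever is left. The function name is mine, and the matching is deliberately simple (a substring test on the rendered line), so treat it as a sketch rather than a finished tool.

```python
# A few lines copied from the timeline excerpt above.
timeline = [
    "FILE - M.C. [640632] C:/tools/autoruns.exe",
    "FILE - M.C. [26] C:/tools/autoruns.exe:Zone.Identifier",
    "REG johns-pc - M... [Program Execution] AppCompatCache - C:\\tools\\autoruns.exe",
    "PREF johns-PC - [Program Execution] AUTORUNS.EXE-1CF578DD.pf last run (1)",
    "REG johns-pc john - [Program Execution] UserAssist - C:\\tools\\autoruns.exe (1)",
    "FILE - MA.B [195166] C:/Windows/Prefetch/AUTORUNS.EXE-1CF578DD.pf",
]


def survives_deletion(line, deleted_dir="C:/tools/"):
    """True unless the line is a FILE entry located under deleted_dir."""
    is_file_entry = line.startswith("FILE ")
    return not (is_file_entry and deleted_dir.lower() in line.lower())


remaining = [ln for ln in timeline if survives_deletion(ln)]
for ln in remaining:
    print(ln)
```

Note that the Registry and Prefetch entries remain, as does the file system entry for the prefetch file itself, since it lives outside of the deleted folder.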
M... [Program Execution] AppCompatCache - C:\tools\autorunsc.exe
The "M..." refers to the fact that, as pointed out by Mandiant, when the tool is run, the file modification time for the tool is recorded in the data structure within the AppCompatCache value. The "[Program Execution]" category identifier, in this case, indicates that the CSRSS flag was set (you'll need to read Mandiant's white paper). The existence of the application prefetch file for the tool, as well as the UserAssist entry, helps illustrate that the program had been executed.
One of the unique things about the SysInternals tools is that after they were taken over by Microsoft, they began to have EULA acceptance dialogs added to them. Now, there is a command line switch that you can use to run the CLI versions of the tools and accept the EULA, but either way the tools will create their own subkey beneath the Software\SysInternals key in the user's hive, and set the "EulaAccepted" value. Even if the tool is renamed, these same artifacts will be left on a system.
File system metadata was extracted from the acquired image using TSK fls.exe. As such, we know that the MACB times are from the $STANDARD_INFORMATION attribute within the MFT, which are highly mutable; that is to say, easily modified to arbitrary values. We can see from the timeline that Autoruns.zip was downloaded on 15 May, and according to the SysInternals web site, an updated version of the tool was posted on 14 May. The files were extracted from the zipped archive, carrying with them some of their original file times, which is why we see ".A.B" times prior to the date that the archive was downloaded. Had the file times been modified to arbitrary values (i.e., "stomped"), rather than the files being deleted, we would still see the other artifacts listed in the timeline, in that order. In essence, we'd have a "signature" for program execution.
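One quick check that follows from this is to look for files whose "born" time predates the archive they came from. The sketch below does that in Python, using the times shown in the timeline above; the dictionary layout and names are my own, and the timestamp format string assumes an English locale.

```python
from datetime import datetime

FMT = "%a %b %d %H:%M:%S %Y"  # e.g. "Tue May 15 21:11:59 2012"


def parse(ts):
    """Parse a timestamp as rendered in the timeline (all times are UTC)."""
    return datetime.strptime(ts, FMT)


# B time of the downloaded archive, from the timeline above.
archive_born = parse("Tue May 15 21:11:59 2012")

# B times of the extracted files, from the ".A.B" entries above.
born = {
    "C:/tools/autoruns.exe": parse("Wed May 9 15:08:16 2012"),
    "C:/tools/autorunsc.exe": parse("Wed May 9 15:08:16 2012"),
    "C:/tools/autoruns.chm": parse("Sat Nov 5 17:52:32 2011"),
}

# Files that claim to have been "born" before the archive arrived.
predates = sorted(path for path, b in born.items() if b < archive_born)
```

On its own, a hit here doesn't indicate time stomping; as described above, it's what you'd expect when files carry their times out of an archive. It's simply a flag that the file times need to be read alongside the other artifacts.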
Other sources of data that would not appear in a timeline can include, for example, the user's MUICache key. This key simply holds a list of values, and in a number of exams, I've found references to malware that was run on the system, even after the actual files had been removed. Also, if the AutoRuns files had been deleted, I could parse the AutoRuns.lnk Windows shortcut file to get the path to, as well as the MA.B times for, the target file. In order to illustrate that, what follows is the raw output of an LNK file/stream parser:
atime Tue May 15 21:11:59 2012
basepath C:\Users\
birth_obj_id_node 08:00:27:dd:64:d1
birth_obj_id_seq 9270
birth_obj_id_time Tue May 15 21:09:27 2012
birth_vol_id 2C645C57D81C5047B7DDE13C2834AAD2
commonpathsuffix john\Downloads\Autoruns.zip
ctime Tue May 15 21:11:59 2012
filesize 535772
machineID john-pc
mtime Tue May 15 21:11:59 2012
netname \\JOHN-PC\Users
new_obj_id_node 08:00:27:dd:64:d1
new_obj_id_seq 9270
new_obj_id_time Tue May 15 21:09:27 2012
new_vol_id 2C645C57D81C5047B7DDE13C2834AAD2
relativepath ..\..\..\..\..\Downloads\Autoruns.zip
vol_sn F405-DAC1
vol_type Fixed Disk
The "mtime", "atime", and "ctime" values correspond to the M, A, and B times, respectively (the LNK "ctime" is the creation time), of the target file, which in this case is the Autoruns.zip archive. As such, I could either go back and add the LNK info to my timeline, or automatically have that information added during the initial process of collecting data for the timeline. In this case, what I would expect to see would be MA.B times from both the file system and the LNK file metadata at exactly the same time. Remember, the absence of an artifact where we expect to find one is itself an artifact, and as such, if the Autoruns.zip file system metadata were not available, that would tell me something and perhaps take my analysis in another direction.
[Note: I know you're looking at the above output and thinking, "wow, that looks like a MAC address in the output!" You're right, it is. In this case, looking up the OUI leads us to Cadmus Systems, and yes, the system was a VM running in VirtualBox. Also, there's a good deal of additional information available in the LNK file metadata, to include the fact that the target file was on a fixed disk, as opposed to a removable or network drive.]
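That comparison is easy enough to sketch in Python. The LNK values below are copied from the parser output above and the file system values from the Autoruns.zip "MA.B" entry in the timeline; the mapping table and function name are mine, and the timestamp format assumes an English locale.

```python
from datetime import datetime

FMT = "%a %b %d %H:%M:%S %Y"

# LNK target times, from the parser output above.
lnk = {
    "mtime": "Tue May 15 21:11:59 2012",
    "atime": "Tue May 15 21:11:59 2012",
    "ctime": "Tue May 15 21:11:59 2012",
}

# File system times for Autoruns.zip, from the "MA.B" timeline entry.
fs = {
    "M": "Tue May 15 21:11:59 2012",
    "A": "Tue May 15 21:11:59 2012",
    "B": "Tue May 15 21:11:59 2012",
}

# The LNK "ctime" is the target's creation time, i.e. the "B" time.
LNK_TO_MACB = {"mtime": "M", "atime": "A", "ctime": "B"}


def compare_times(lnk_times, fs_times):
    """Report, per M/A/B flag, whether the LNK and file system times agree."""
    report = {}
    for field, flag in LNK_TO_MACB.items():
        fs_value = fs_times.get(flag)
        if fs_value is None:
            report[flag] = "missing from file system"
        elif datetime.strptime(lnk_times[field], FMT) == datetime.strptime(fs_value, FMT):
            report[flag] = "match"
        else:
            report[flag] = "differs"
    return report


report = compare_times(lnk, fs)
```

The "missing from file system" case is there on purpose; that's the situation where the LNK metadata may be the only remaining record of the target file's times.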
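Pulling the OUI out of the node value takes only a few lines of Python. In this sketch the one-entry lookup table simply repeats the vendor name given above; a real lookup would go against the full IEEE registry, and the function name is my own.

```python
def oui(mac):
    """Return the first three octets of a MAC address, e.g. '08:00:27'."""
    octets = mac.replace("-", ":").split(":")
    if len(octets) != 6:
        raise ValueError("expected six octets, got %r" % mac)
    return ":".join(o.upper() for o in octets[:3])


# Single illustrative entry, using the vendor name from the note above.
KNOWN_OUIS = {"08:00:27": "Cadmus Systems"}

node = "08:00:27:dd:64:d1"  # birth_obj_id_node from the LNK output
vendor = KNOWN_OUIS.get(oui(node), "unknown")
```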
The Value of Multiple Data Sources
Regarding the value of data from multiple sources (even additional locations within the same source), in a comment to his post regarding a RegRipper plugin that he'd written, Jason Hale points out, quite correctly:
I didn't think there was a whole lot of value in the information from the TypedURLsTime key itself (other than knowing that computer activity was occurring at that time) without correlating it with the values in TypedURLs.
Jason actually wrote more than one plugin to extract the TypedURLsTime value data (this key is specific to Windows 8 systems). I've looked at the plugin that outputs in TLN format, for inclusion in a timeline...I use a different source identifier in the version I wrote (I use "REG", for consistency...Jason uses "NTUSER.DAT"). However, we both reached point B, albeit via different routes. This will definitely be something I'll be including in my Windows 8 exams.
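If you end up mixing output from plugins that use different source identifiers, a small normalization pass keeps the timeline consistent. This Python sketch assumes the five-field, pipe-delimited TLN layout (time|source|system|user|description); the function name, alias table, and sample line are hypothetical.

```python
def normalize_source(line, aliases):
    """Rewrite the source field (second field) of a TLN line using aliases.

    Lines that don't have five pipe-delimited fields are returned unchanged.
    """
    fields = line.split("|", 4)
    if len(fields) == 5:
        fields[1] = aliases.get(fields[1], fields[1])
    return "|".join(fields)


# Map the alternate identifier onto the one used in the rest of the timeline.
ALIASES = {"NTUSER.DAT": "REG"}

# Hypothetical event line, for illustration only.
sample = "1337116437|NTUSER.DAT|johns-pc|john|TypedURLsTime - url1"
normalized = normalize_source(sample, ALIASES)
```

Splitting on at most four delimiters leaves any pipe characters in the description field alone.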
Key Concepts
1. Employing multiple data sources to develop a timeline of system activity provides context, as well as increases our relative confidence in the data itself.
2. Employing multiple data sources can demonstrate program execution.
3. Employing multiple data sources can illustrate and overcome the use of counter-forensics activities, however unintentional those activities may be.