Stuff
Timelines
Corey Harrell recently posted to his blog about using Calc (the spreadsheet program associated with OpenOffice) to perform timeline analysis. Corey's post was very revealing and thorough, and clearly demonstrates that here's a guy who actually does timeline analysis. Corey looks at regular expressions for searching, etc., and really does a good job of covering a lot of the different analysis aspects that you'd otherwise perform in Excel. Great job, Corey.
However, there are two things that came up in the post that might be good points for discussion. First, Corey says that his exploration originated from an anonymous question asking if Calc could be used in place of Excel. I applaud Corey's efforts, but this question demonstrates how some analysts will continue to ask these kinds of questions, rather than doing their own research and posting the results. This is one of those things that could easily have been stated as "Calc can also be used...", rather than posed as an anonymous question.
The other issue is the concern that Corey expressed with respect to the spreadsheet program's ability to handle massive numbers (100K or more) of rows. This is definitely a concern (pre-Office 2007 versions of Excel, for example, were limited to 65,536 rows per worksheet), but it also demonstrates how timelines, which are meant to some degree to be a data reduction technique, can actually become very cumbersome and even slow the analyst down, simply through the sheer volume of data. Yes, the data extracted to generate a timeline is less than a full image (i.e., a directory listing of a 160GB hard drive is much less than 160GB...), but so much data can be added to a timeline that doing so could easily overwhelm an analyst.
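Just as an illustration (this isn't anything Corey covered in his post), here's a minimal Python sketch of one way to trim a timeline file down before it ever reaches Excel or Calc, so the row count stays manageable. The file name, the pipe delimiter, and the Unix epoch time in the first field are all assumptions for the example:

import sys
from calendar import timegm
from time import strptime

def in_window(path, start, end, out=sys.stdout):
    """Write only the events whose timestamps fall between start and end."""
    lo = timegm(strptime(start, "%Y-%m-%d"))
    hi = timegm(strptime(end, "%Y-%m-%d"))
    with open(path, "r", errors="replace") as f:
        for line in f:
            try:
                # assumes a pipe-delimited line with an epoch time up front
                ts = int(line.split("|", 1)[0])
            except ValueError:
                continue  # skip headers or malformed lines
            if lo <= ts <= hi:
                out.write(line)

if __name__ == "__main__":
    # e.g., keep just the two weeks around the reported incident
    in_window("events.txt", "2010-10-01", "2010-10-14")

Even a simple pre-filter like this can take a timeline from hundreds of thousands of rows down to something a spreadsheet (and an analyst) can comfortably handle.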
Timelines and Data Reduction
One solution to this that I highly recommend is an educated, knowledgeable approach to timeline development and analysis, which is something Chris points to in his Sniper Forensics presentation. Rather than throwing everything (even the kitchen sink, if it has a time stamp) into your timeline and sorting it out from there, why not instead start with the goals of your examination, what you're trying to show, and go on from there? Your goals will show you what you need to look at, and what you need to show.
After all, one of the benefits and goals of timelines is...and should be...data reduction. While a timeline compiled from an abundance of data sources is indeed a reduction of data volume, is it really a reduction of data to be analyzed? In many cases, it may not be.
Consider this...let's say that while driving your car, you hear an odd noise, maybe a squeal of some kind, coming from the front left tire every time you brake. What do you do? Do you disassemble the entire car in hopes of finding something bad, or something that might be responsible for the noise? Or do you do some testing to ensure that the noise is really coming from where you think it is, and under what specific conditions?
No, I'm not saying that we throw everything out and start over. Rather, what I'm suggesting is that rather than throwing everything into a timeline, assess the relative value of the data prior to adding it. One excellent example of this is a SQL injection analysis I did...I started with the file system metadata, and added just the web server logs that contained the pertinent SQL injection statements. There was no need to add Event Log data, nor all of the LastWrite times from all of the Registry keys from the hive files on the system. There was simply no value in doing this; in fact, doing so would have complicated the analysis, simply through sheer volume. Is this an extreme example? No, I don't think that it is.
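For illustration only...this isn't the exact process from that engagement, but a minimal Python sketch of the idea: pull just the web server log entries that contain likely SQL injection strings, so that only the pertinent lines get considered for the timeline. The log file name and the indicator patterns are illustrative assumptions:

import re

# A few common SQL injection indicators; tune these to the case at hand.
SQLI_PATTERNS = re.compile(
    r"(xp_cmdshell|union\s+select|declare\s+@|varchar\(|cast\(|exec\s*\()",
    re.IGNORECASE,
)

def pertinent_entries(log_path):
    # Yield only the log lines containing one of the indicator strings.
    with open(log_path, "r", errors="replace") as f:
        for line in f:
            if SQLI_PATTERNS.search(line):
                yield line.rstrip("\n")

if __name__ == "__main__":
    for entry in pertinent_entries("ex101025.log"):
        print(entry)

The point isn't the specific patterns; it's that a few minutes spent filtering up front keeps thousands of irrelevant log entries out of the timeline.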
By reducing the data that we need to analyze, even if doing so is part of the analysis itself (determining what we can remove from the timeline as "noise", etc.), we get to a point where relational analysis...that is, analyzing different systems in relation to each other...gives us a better view of what may have occurred (and when) in multi-system incidents/breaches. Remember, the five field timeline format that I've recommended using includes a field for the system name, as well as one for the user name. Using these fields, you can observe the use of domain credentials across multiple systems, for example.
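Here's a quick sketch of what that can look like, assuming the timeline events are stored one per line as pipe-delimited time|source|system|user|description fields (the delimiter and field order are assumptions for the example): collect the systems each user account appears on, and flag any account seen on more than one.

from collections import defaultdict

def systems_per_user(timeline_path):
    # Map each user account to the set of systems it appears on.
    systems = defaultdict(set)
    with open(timeline_path, "r", errors="replace") as f:
        for line in f:
            fields = line.rstrip("\n").split("|")
            if len(fields) < 5:
                continue  # skip blank or malformed lines
            _time, _source, system, user, _desc = fields[:5]
            if user:
                systems[user].add(system)
    return systems

if __name__ == "__main__":
    for user, hosts in sorted(systems_per_user("events.txt").items()):
        if len(hosts) > 1:
            print("%s: seen on %s" % (user, ", ".join(sorted(hosts))))

An account showing up on systems its owner has no business touching is exactly the kind of lead that makes the system and user fields worth populating.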
Goals
I know that many folks are going to say, "What if you don't know what you're looking for?", and the answer to that is, "If you don't know what you're looking for, why are you doing analysis?" Seriously. If you don't know what the goals of your analysis are, why are you doing it?
Sometimes folks will say, "my goals are to find all bad stuff". Well...what constitutes "bad"? What if you find nmap and Metasploit installed on a system? Is it bad? What if the user is a network engineer tasked with vulnerability scanning and assessment? Then, is this find "bad"?
From the questions I receive, a lot of times I think that there is difficulty in defining goals. Does anyone really want to "find all malware"? Really? So you want to know about every BHO and bit of spyware? Given some home user systems, imagine how long it would take to locate, document, and categorize all malware on a system. Usually what I have found is that "find all malware" really means "find the malware that may have been associated with a particular incident or event". Then, getting the person you're working with to describe the event can help you narrow down the goals of your analysis. After all, they're talking to you for a reason, right? If they could do the analysis themselves, there would be no reason to talk to you. Developing a concise set of goals allows you to define and set expectations, as well as deliver something tangible and useful.
Timeline as a Tool
Timeline analysis is an extremely useful and valuable tool...but like any other tool, it's just a tool. The actual analysis is up to the analyst. There may be times when it simply doesn't make sense to create a timeline...if that's the case, then don't bother. However, if it does make sense to develop a timeline, then do so intelligently.
Volatility Updates
Gleeda recently tweeted about documentation for Volatility being available. If you do any memory forensics, this is an excellent resource that walks you through getting all set up and running with the full capabilities of Volatility.