BinMode: Parsing Java *.idx files
One of the Windows artifacts that I talk about in my training courses is application log files, and I tend to gloss over this topic simply because there are so many different kinds of log files produced by applications. Some applications, AV in particular, will write their logs to the Application Event Log as well as to a text file. I find this very useful because the Application Event Log will "roll over" as it gathers more events, overwriting older records, while the application will most often continue appending to its text logs, so the text logs can retain older entries. I talk about these logs in general because it's important for analysts to be aware of them, but I don't spend a great deal of time discussing them because we could be there all week.
With the recent (Jan, 2013) issues regarding a Java 0-day vulnerability, my interest in artifacts of compromise was piqued yet again when I found that someone had released some Python code for parsing Java deployment cache *.idx files. I located the *.idx files on my own system, opened a couple of them up in a hex editor and began conducting pattern analysis to see if I could identify a repeatable structure. I found enough information to create a pretty decent parser for the *.idx files to which I have access.
Okay, so the big question is...so what? Who cares? Well, Corey Harrell wrote an excellent blog post on Finding (the) Initial Infection Vector, which I think is something that folks don't do often enough. Using timeline analysis, Corey identified artifacts that required closer examination; using the right tools and techniques, this information can also be incorporated directly into the timeline (see the Sploited blog post listed in the Resources section below) to provide more context to the timeline activity.
The testing I've been able to do with the code I wrote has been somewhat limited, as I haven't had a system that might be infected come across my desk in a bit, and I don't have access to an *.idx file like what Corey illustrated in his blog post (notice that it includes "pragma" and "cache control" statements). However, what I really like about the code is that I have access to the data itself, and I can modify the code to meet my analysis needs, much the way I did with the Prefetch file analysis code that I wrote. For example, I can perform frequency analysis of IP addresses or URLs, server types, etc. I can perform searches for various specific data elements, or simply run the output of the tool through the find command, just to see if something specific exists. Or, I can have the code output information in TLN format for inclusion in a timeline.
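To make the frequency analysis and TLN output ideas a bit more concrete, here is a minimal Python sketch. It is not the parser discussed in this post, and it does not read *.idx files at all; it assumes a parser has already produced a list of records. The field names (path, url, ip, server, last_modified), the sample values, the JAVA_IDX source tag and the HOSTNAME placeholder are all made up for illustration. The only thing taken as given is the five-field, pipe-delimited TLN layout (time|source|host|user|description), with the time as a Unix epoch value.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical records, standing in for whatever an *.idx parser returns.
# Every field name and value here is an assumption made for illustration.
records = [
    {"path": "cache/6.0/12/aaaa-1111.idx",
     "url": "http://example.com/applet.jar",
     "ip": "192.0.2.10", "server": "Apache",
     "last_modified": datetime(2013, 1, 10, 14, 30, 0, tzinfo=timezone.utc)},
    {"path": "cache/6.0/27/bbbb-2222.idx",
     "url": "http://example.com/other.jar",
     "ip": "192.0.2.10", "server": "Apache",
     "last_modified": datetime(2013, 1, 11, 9, 0, 0, tzinfo=timezone.utc)},
    {"path": "cache/6.0/40/cccc-3333.idx",
     "url": "http://example.org/tool.jar",
     "ip": "198.51.100.7", "server": "nginx",
     "last_modified": datetime(2013, 1, 12, 18, 45, 0, tzinfo=timezone.utc)},
]


def frequency(recs, field):
    """Count how often each value of one field appears across the records."""
    return Counter(r[field] for r in recs if r.get(field))


def to_tln(rec, host="HOSTNAME", user=""):
    """Format one record as a TLN line: time|source|host|user|description."""
    epoch = int(rec["last_modified"].timestamp())
    desc = "{} ({}) [{}]".format(rec["url"], rec["ip"], rec["path"])
    # A stray pipe in the description would break the five-field layout.
    desc = desc.replace("|", "/")
    return "{}|JAVA_IDX|{}|{}|{}".format(epoch, host, user, desc)


if __name__ == "__main__":
    # Least frequent values are often the interesting ones, so list them all.
    for field in ("ip", "server"):
        print(field, frequency(records, field).most_common())
    for rec in records:
        print(to_tln(rec))
```

Which timestamp from the *.idx file goes into the time field is an analysis decision; the sketch simply uses the hypothetical last_modified value. Keeping the counting and the formatting in separate functions means the same parsed records can feed a quick search, a frequency count, or a timeline without being parsed again.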
Regardless of what I do with the code itself, I now have automatic access to the data, and I have references included in the script itself; as such, the headers of the script serve as documentation, as well as a reminder of what's being examined, and why. This bridges the gap between having something I need to check listed in a spreadsheet, and actually checking or analyzing those artifacts.
Resources
ForensicsWiki Page: Java
Sploited blog post: Java Forensics Using TLN Timelines
jIIr: Almost Cooked Up Some Java, Finding Initial Infection Vector
Interested in Windows DF training? Check it out: Timeline Analysis, 4-5 Feb; Windows Forensic Analysis, 11-12 Mar.