Malware RE - IR Disconnect
Not long ago, I'd conducted some analysis that I had found to be...well, pretty fascinating...and shared some of the various aspects of the analysis that were most fruitful. In particular, I wanted to share how various tools had been used to achieve the findings and complete the analysis.
Part of that analysis involved malware known as PlugX, and as such, a tweet that pointed to this blog post recently caught my attention. While the blog post, as well as some of the links in the post, contains some pretty fascinating information, I found that in some ways, it illustrates a disconnect between the DFIR and malware RE analysis communities.
I'd like to use the Cassidian blog post as an example and walk through what I, as a host-based analysis guy, see as some of the disconnects. I'm not doing this to highlight the post and say that something was done wrong or incorrectly...not at all. In fact, I greatly appreciate the information that was provided; however, I think that we can all agree that there are disconnects between the various infosec sub-communities, and my goal here is to see if we can't get folks from the RE and IR communities to come together just a bit more. So what I'll do is discuss/address the content from some of the sections of the Cassidian post.
Evolution
Seeing the evolution of malware, in general, is pretty fascinating, but to be honest, it really doesn't help DFIR analysts understand the malware to the point where they can locate it on systems and answer the questions that the customer may have. However, again...it is useful information and is part of the overall intelligence picture that can be developed of the malware and its use, and (along with other information) it may even contribute to attribution.
Network Communications
Whenever an analyst identifies network traffic, that information is valuable to SOC analysts and folks looking at network traffic. However, if you're doing DFIR work, many times you're handed a hard drive or an image and asked to locate the malware. As such, whenever I see a malware RE analyst give specifics regarding network traffic, particularly HTTP requests, I immediately want to know which API was used by the malware to send that traffic. I want to know this because it helps me understand what artifacts I can look for within the image. If the malware uses the WinInet API, I know to look in index.dat files (for IE versions 5 through 9), and depending upon how soon after some network communications I'm able to obtain an image of the system, I may be able to find some server responses in the pagefile. If raw sockets are used, then I'd need to look for different artifacts.
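As a rough illustration of the "which API was used?" question, here is a minimal sketch of a first-pass triage check. It simply searches a binary's raw bytes for the names of a few networking DLLs. The function name and the DLL-to-API table are my own assumptions for illustration; this is a crude string scan, not a parse of the PE import table, so a packed sample or one that resolves its imports at runtime would not be caught by it.

```python
# Crude triage: guess which networking APIs a binary may use by looking for
# well-known DLL names in its raw bytes (ASCII and UTF-16LE).
# Assumption: this is a plain string scan, NOT an import table parse, so it
# can miss packed samples and can flag strings that are never actually used.
NETWORK_DLLS = {
    b"wininet.dll": "WinInet",
    b"winhttp.dll": "WinHTTP",
    b"ws2_32.dll": "Winsock",
}


def guess_network_apis(data: bytes) -> list[str]:
    """Return the API labels whose DLL names appear in the given bytes."""
    lowered = data.lower()  # lowercases ASCII letters only; other bytes are untouched
    found = []
    for dll_name, api in NETWORK_DLLS.items():
        wide_name = dll_name.decode("ascii").encode("utf-16-le")
        if dll_name in lowered or wide_name in lowered:
            found.append(api)
    return found


if __name__ == "__main__":
    import sys

    # Usage: python guess_apis.py <path to sample>
    with open(sys.argv[1], "rb") as handle:
        print(guess_network_apis(handle.read()))
```

A hit on WinInet would be my cue to go looking at index.dat files; a hit on Winsock alone would tell me not to expect much there. Either way, it's a hint to follow up on, not a finding.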
Where network communications information has proved to be very useful during host-based analysis is during memory analysis, such as locating open network connections in a memory capture or hibernation file. Also, sharing information between malware RE and DFIR analysts has really pushed an examination to new levels, as in the case where I was looking at an instance where Win32/Crimea had been used by a bad guy. That case, in particular, illustrated to me how things could have taken longer or possibly even been missed had the malware RE analyst or I worked in isolation, whereas working together and sharing information provided a much better view of what had happened.
Configuration
The information described in the post is pretty fascinating, and can be used by analysts to determine or confirm other findings; for example, given the timetable, this might line up with something seen in network or proxy logs. There's enough information in the blog post that would allow an accomplished programmer to write a parser...if there were some detailed information about where the blob (as described in the post) was located.
Persistence
The blog post describes a data structure used to identify the persistence mechanism of the malware; in this case, that can be very valuable information, specifically if the malware creates a Windows service for persistence. This tells me where to look for artifacts of the malware, and even gives me a means for determining specific artifacts in order to nail down when the malware was first introduced onto the system. For example, if the malware uses the WinInet API (as mentioned above), knowing that it runs as a service (and therefore under which account) would tell me where to look for the index.dat file, based on the version of Windows I'm examining.
Also, as the malware uses a Windows service for persistence, I know where to look for other artifacts associated with the malware (Registry keys, Windows Event Log records, etc.), again, based on the version of Windows I'm examining.
Unused Strings
In this case, the authors found two unused strings, set to "1234", in the malware configuration. I had seen a sample where that string was used as a file name.
Other Artifacts
The blog post makes little mention of other (specifically, host-based) artifacts associated with the malware; however, this resource describes a Registry key created as part of the malware installation, and in an instance I'd seen, the LastWrite time for that key corresponded to the first time the malware was run on the system.
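Registry key LastWrite times are stored as 64-bit FILETIME values (100-nanosecond intervals since January 1, 1601 UTC), so lining one up with other artifacts means converting it to something readable first. The following is a minimal sketch of that conversion; the function name is my own, and it assumes you've already pulled the raw integer value out of the hive with whatever tool you prefer.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a raw 64-bit FILETIME value to a timezone-aware UTC datetime."""
    # Ten 100-ns intervals make one microsecond; sub-microsecond detail is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


if __name__ == "__main__":
    # 116444736000000000 is the FILETIME value of the Unix epoch.
    print(filetime_to_datetime(116444736000000000).isoformat())
    # prints 1970-01-01T00:00:00+00:00
```

With the LastWrite time in UTC, it can be dropped into a timeline alongside file system and Event Log timestamps to see whether they cluster around the same moment.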
In the case of the Cassidian post, it would be interesting to hear if the FAST key was found in the Registry; if so, this might be good validation, and if not, this might indicate either a past version of the malware, or a branch taken by another author.
Something else that I saw that really helped me nail down the first time that the malware was executed on the system was the existence of a subkey beneath the Tracing key in the Software hive. This was pretty fascinating and allowed me to correlate multiple artifacts in order to develop a greater level of confidence in what I was seeing.
Not specifically related to the Cassidian blog post, I've seen tweets that talk about the use of Windows shortcut/LNK files in a user's Startup folder as a persistence mechanism. This may not be particularly interesting to an RE analyst, but for someone like me, that's pretty fascinating, particularly if the LNK file does not contain a LinkInfo block.
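Checking whether an LNK file has a LinkInfo block comes down to reading one flag in the shell link header. Below is a minimal sketch based on the header layout in Microsoft's MS-SHLLINK specification (a 0x4C-byte header, the link CLSID, then the LinkFlags field at offset 0x14, where bit 1 is HasLinkInfo). The function name is my own, and the sketch looks only at the header flag; it does not parse or validate the LinkInfo structure itself.

```python
import struct

HEADER_SIZE = 0x4C  # fixed size of the ShellLinkHeader
# On-disk byte layout of CLSID 00021401-0000-0000-C000-000000000046
LINK_CLSID = bytes.fromhex("0114020000000000c000000000000046")
HAS_LINK_INFO = 0x00000002  # LinkFlags bit 1


def lnk_has_linkinfo(data: bytes) -> bool:
    """Return True if the shell link header says a LinkInfo block is present."""
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to be a shell link file")
    (header_size,) = struct.unpack_from("<I", data, 0)
    if header_size != HEADER_SIZE or data[4:20] != LINK_CLSID:
        raise ValueError("shell link header not found")
    (link_flags,) = struct.unpack_from("<I", data, 0x14)
    return bool(link_flags & HAS_LINK_INFO)


if __name__ == "__main__":
    import sys

    # Usage: python lnk_check.py <path to .lnk file>
    with open(sys.argv[1], "rb") as handle:
        print("LinkInfo present:", lnk_has_linkinfo(handle.read()))
```

Run across the LNK files in a user's Startup folder, a check like this gives a quick way to pick out the ones worth a closer look.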
Once again, my goal here is not to suggest that the Cassidian folks have done anything wrong...not at all. The information in their post is pretty interesting. Rather, what I wanted to do is see if we, as a community, can't agree that there is a disconnect, and then begin working together more closely. I've worked with a number of RE analysts, and each time, I've found that in doing so, the resulting analysis is more complete, more thorough, and provides more value to the customer. Further, future analysis is also more complete and thorough, in less time, and when dealing with sophisticated threat actors, time is of the essence.
Caveat
I've noticed this disconnect for quite some time, going back as far as at least this post...however, I'm also fully aware that AV companies are not in the business of making the job of DFIR analysts any easier. They have their own business model, and even if they actually do run malware (i.e., perform dynamic analysis), there is no benefit to them (the AV companies) if they engage in the detailed analysis of host-based artifacts. The simple fact and the inescapable truth is that an AV vendor's goals are different from those of a DFIR analyst. The AV vendor wants to roll out an updated .dat file across the enterprise in order to detect and remove all instances of the malware, whereas a DFIR analyst is usually tasked with answering such questions as "...when did the malware first infect the system/infrastructure?", "...how did it get in?", and "...what data was taken?"
These are very different questions that need to be addressed, and as such, have very different models for the businesses/services that address them. This is not unlike the differences between the PCI assessors and the PCI forensic analysts.
Specifically, what some folks on one side find to be valuable and interesting may not be useful to folks on the other side. As such, what's left is two incomplete pictures of the overall threat to the customer, with little (if any) overlap between them. In the end, both sides have an incomplete view of what happened, and the customer...the one with questions that need to be answered...isn't provided the value that could potentially be there.