HowTo: Detecting Persistence Mechanisms
This post is about actually detecting persistence mechanisms...not querying them, but detecting them. There's a difference between querying known persistence mechanisms, and detecting previously unknown persistence mechanisms used by malware; the former we can do with tools such as AutoRuns and RegRipper, but the latter requires a bit more work.
Detecting the persistence mechanism used by malware can be a critical component of an investigation; for one, it helps us determine the window of compromise, or how long it's been since the system was infected (or compromised). For PCI exams in particular, this is important because many organizations know approximately how many credit card transactions they process on a daily or weekly basis, and combining this information with the window of compromise can help them estimate their exposure. If malware infects a system in a user context but does not escalate it's privileges, then it will mostly likely start back up after a reboot only after that user logs back into the system. If the system is rebooted and another user logs in (or in the case of a server, no user logs in...), then the malware will remain dormant.
Detecting Persistence Mechanisms
Most often, we can determine a malware persistence mechanism by querying the system with tools such as those mentioned previously in this post. However, neither of these tools is comprehensive enough to cover other possible persistence mechanisms, and as such, we need to seek other processes or methods of analysis and detection.
One process that I've found to be very useful is timeline analysis. Timelines provide us with context and an increased relative confidence in our data, and depending upon which data we include in our timeline, an unparalleled level of granularity.
Several years ago, I determined the existence of a variant of W32/Crimea on a system (used in online banking fraud) by creating a timeline of system activity. I had started by reviewing the AV logs from the installed application, and then moved on to scanning the image (mounted as a volume) with several licensed commercial AV scanners, none of which located any malware. I finally used an AV scanner called "a-squared" (now apparently owned by Emsisoft), and it found a malicious DLL. Using that DLL name as a pivot point within my timeline, I saw that relatively close to the creation date of the malicious DLL, the file C:\Windows\system32\imm32.dll was modified; parsing the file with a PE analysis tool, I could see that the PE Import Table had been modified to point to the malicious DLL. The persistence mechanism employed by the malware was to 'attach' to a DLL that is loaded by user processes that interact with the keyboard, in particular web browsers. It appeared to be a keystroke logger that was only interested in information entered into form fields in web pages for online banking sites.
Interestingly enough, this particular malware was very well designed, in that it did not write the information it collected to a file on the system. Instead, it immediately sent the information off of the system to a waiting server, and the only artifacts that we could find of that communication were web server responses embedded in the pagefile.
Something else to consider is the DLL Search Order "issue", often referred to as hijacking. This has been discussed at length, and likely still remains an issue because it's not so much a specific vulnerability that can be patched or fixed, but more a matter of functionality provided by the architecture of the operating system.
In the case of ntshrui.dll (discussed here by Nick Harbour, while he was still with Mandiant), this is how it worked...ntshrui.dll is listed in the Windows Registry as an approved shell extension for Windows Explorer. In the Registry, many of the approved shell extensions have explicit paths listed...that is, the value is C:\Windows\system32\some_dll.dll, and Windows knows to go load that file. Other shell extensions, however, are listed with implicit paths; that is, only the name of the DLL is provided, and when the executable (explorer.exe) loads, it has to go search for that DLL. In the case of ntshrui.dll, the legitimate copy of the DLL is located in the system32 folder, but another file of the same name had been created in the C:\Windows folder, right next to the explorer.exe file. As explorer.exe starts searching for the DLL in it's own directory, it happily loaded the malicious DLL without any sort of checking, and therefore, no errors were thrown.
Around the time that Nick was writing up his blog post, I'd run across a Windows 2003 system that had been compromised, and fortunately for me, the sysadmins had a policy for a bit more extensive logging enabled on systems. As I was examining the timeline, starting from the most recent events to occur, I marveled at how the additional logging really added a great deal of granularity to thing such as a user logging in; I could see where the system assigned a token to the user, and then transferred the security context of the login to that user. I then saw a number of DLLs being accessed (that is, their last accessed times were modified) from the system32 folder...and then I saw one (ntshrui.dll) from the C:\Windows folder. This stood out to me as strange, particularly when I ran a search across the timeline for that file name, and found another file of the same name in the system32 folder. I began researching the issue, and was able to determine that the persistence mechanism of the malware was indeed the use of the DLL search order "vulnerability".
Creating Timelines
Several years ago, I was asked to write a Perl script that would list all Registry keys within a hive file, along with their LastWrite times, in bodyfile format. Seeing the utility of this information, I also wrote a version that would output to TLN format, for inclusion in the timelines I create and use for analysis. This allows for significant information that I might not otherwise see to be included in the timeline; once suspicious activity has been found, or a pivot point located, finding unusual Registry keys (such as those beneath the CLSID subkey) can lead to identification of a persistence mechanism.
Additional levels of granularity can be achieved in timelines through the incorporation of intelligence into the tools used to create timelines, something that I started adding to RegRipper with the release of version 2.8. One of the drawbacks to timelines is that they will show the creation, last accessed, and last modification times of files, but not incorporate any sort of information regarding the contents of that file into the timeline. For example, a timeline will show a file with a ".tmp" extension in the user's Temp folder, but little beyond that; incorporating additional functionality for accessing such files would allow us to include intelligence from previous analyses into our parsing routines, and hence, into our timelines. As such, we may want to generate an alert for that ".tmp" file, specifically if the binary contents indicate that it is an executable file, or a PDF, or some other form of document.
Another example of how this functionality can be incorporated into timelines and assist us in detecting persistence mechanisms might be to add grep() statements to RegRipper plugins that parse file paths from values. For example, your timeline would include the LastWrite time for a user's Run key as an event, but because the values for this key are not maintained in any MRU order, there's really nothing else to add. However, if your experience were to show that file paths that include "AppData", "Application Data", or "Temp" might be suspicious, why not add checks for these to the RegRipper plugin, and generate an alert if one is found? Would you normally expect to see a program being automatically launched from the user's "Temporary Internet Files" folder, or is that something that you'd like to be alerted on. The same sort of thing applies to values listed in the InProcServer keys beneath the CLSID key in the Software hive.
Adding this alerting functionality to tools that parse data into timeline formats can significantly increase the level of granularity in our timelines, and help us to detect previously unknown persistence mechanisms.
Resources
Mandiant: Malware Persistence without the Windows Registry
Mandiant: What the fxsst?
jIIR: Finding Malware like Iron Man
jIIR: Tracking down persistence mechanisms
Detecting the persistence mechanism used by malware can be a critical component of an investigation; for one, it helps us determine the window of compromise, or how long it's been since the system was infected (or compromised). For PCI exams in particular, this is important because many organizations know approximately how many credit card transactions they process on a daily or weekly basis, and combining this information with the window of compromise can help them estimate their exposure. If malware infects a system in a user context but does not escalate it's privileges, then it will mostly likely start back up after a reboot only after that user logs back into the system. If the system is rebooted and another user logs in (or in the case of a server, no user logs in...), then the malware will remain dormant.
Detecting Persistence Mechanisms
Most often, we can determine a malware persistence mechanism by querying the system with tools such as those mentioned previously in this post. However, neither of these tools is comprehensive enough to cover other possible persistence mechanisms, and as such, we need to seek other processes or methods of analysis and detection.
One process that I've found to be very useful is timeline analysis. Timelines provide us with context and an increased relative confidence in our data, and depending upon which data we include in our timeline, an unparalleled level of granularity.
Several years ago, I determined the existence of a variant of W32/Crimea on a system (used in online banking fraud) by creating a timeline of system activity. I had started by reviewing the AV logs from the installed application, and then moved on to scanning the image (mounted as a volume) with several licensed commercial AV scanners, none of which located any malware. I finally used an AV scanner called "a-squared" (now apparently owned by Emsisoft), and it found a malicious DLL. Using that DLL name as a pivot point within my timeline, I saw that relatively close to the creation date of the malicious DLL, the file C:\Windows\system32\imm32.dll was modified; parsing the file with a PE analysis tool, I could see that the PE Import Table had been modified to point to the malicious DLL. The persistence mechanism employed by the malware was to 'attach' to a DLL that is loaded by user processes that interact with the keyboard, in particular web browsers. It appeared to be a keystroke logger that was only interested in information entered into form fields in web pages for online banking sites.
Interestingly enough, this particular malware was very well designed, in that it did not write the information it collected to a file on the system. Instead, it immediately sent the information off of the system to a waiting server, and the only artifacts that we could find of that communication were web server responses embedded in the pagefile.
Something else to consider is the DLL Search Order "issue", often referred to as hijacking. This has been discussed at length, and likely still remains an issue because it's not so much a specific vulnerability that can be patched or fixed, but more a matter of functionality provided by the architecture of the operating system.
In the case of ntshrui.dll (discussed here by Nick Harbour, while he was still with Mandiant), this is how it worked...ntshrui.dll is listed in the Windows Registry as an approved shell extension for Windows Explorer. In the Registry, many of the approved shell extensions have explicit paths listed...that is, the value is C:\Windows\system32\some_dll.dll, and Windows knows to go load that file. Other shell extensions, however, are listed with implicit paths; that is, only the name of the DLL is provided, and when the executable (explorer.exe) loads, it has to go search for that DLL. In the case of ntshrui.dll, the legitimate copy of the DLL is located in the system32 folder, but another file of the same name had been created in the C:\Windows folder, right next to the explorer.exe file. As explorer.exe starts searching for the DLL in it's own directory, it happily loaded the malicious DLL without any sort of checking, and therefore, no errors were thrown.
Around the time that Nick was writing up his blog post, I'd run across a Windows 2003 system that had been compromised, and fortunately for me, the sysadmins had a policy for a bit more extensive logging enabled on systems. As I was examining the timeline, starting from the most recent events to occur, I marveled at how the additional logging really added a great deal of granularity to thing such as a user logging in; I could see where the system assigned a token to the user, and then transferred the security context of the login to that user. I then saw a number of DLLs being accessed (that is, their last accessed times were modified) from the system32 folder...and then I saw one (ntshrui.dll) from the C:\Windows folder. This stood out to me as strange, particularly when I ran a search across the timeline for that file name, and found another file of the same name in the system32 folder. I began researching the issue, and was able to determine that the persistence mechanism of the malware was indeed the use of the DLL search order "vulnerability".
Creating Timelines
Several years ago, I was asked to write a Perl script that would list all Registry keys within a hive file, along with their LastWrite times, in bodyfile format. Seeing the utility of this information, I also wrote a version that would output to TLN format, for inclusion in the timelines I create and use for analysis. This allows for significant information that I might not otherwise see to be included in the timeline; once suspicious activity has been found, or a pivot point located, finding unusual Registry keys (such as those beneath the CLSID subkey) can lead to identification of a persistence mechanism.
Additional levels of granularity can be achieved in timelines through the incorporation of intelligence into the tools used to create timelines, something that I started adding to RegRipper with the release of version 2.8. One of the drawbacks to timelines is that they will show the creation, last accessed, and last modification times of files, but not incorporate any sort of information regarding the contents of that file into the timeline. For example, a timeline will show a file with a ".tmp" extension in the user's Temp folder, but little beyond that; incorporating additional functionality for accessing such files would allow us to include intelligence from previous analyses into our parsing routines, and hence, into our timelines. As such, we may want to generate an alert for that ".tmp" file, specifically if the binary contents indicate that it is an executable file, or a PDF, or some other form of document.
Another example of how this functionality can be incorporated into timelines and assist us in detecting persistence mechanisms might be to add grep() statements to RegRipper plugins that parse file paths from values. For example, your timeline would include the LastWrite time for a user's Run key as an event, but because the values for this key are not maintained in any MRU order, there's really nothing else to add. However, if your experience were to show that file paths that include "AppData", "Application Data", or "Temp" might be suspicious, why not add checks for these to the RegRipper plugin, and generate an alert if one is found? Would you normally expect to see a program being automatically launched from the user's "Temporary Internet Files" folder, or is that something that you'd like to be alerted on. The same sort of thing applies to values listed in the InProcServer keys beneath the CLSID key in the Software hive.
Adding this alerting functionality to tools that parse data into timeline formats can significantly increase the level of granularity in our timelines, and help us to detect previously unknown persistence mechanisms.
Resources
Mandiant: Malware Persistence without the Windows Registry
Mandiant: What the fxsst?
jIIR: Finding Malware like Iron Man
jIIR: Tracking down persistence mechanisms