A Practical Approach to Webmail Forensics Techniques | Lucideus Research

“Introduction to the manual procedures and techniques involved in investigating webmail/cloud-based email storage services”

Cloud-based email service providers such as Google, Yahoo, and Outlook collect a huge amount of data from their users. From a forensic point of view, the collected data consists of email communications, personal data storage, and data exchange.


According to usage trends, users prefer to access email services on mobile devices, organizational devices, and workstations. However, the primary copy of the data is stored in the cloud, where it can be acquired and analyzed in an investigation.


An email forensic investigation has the potential to find the smoking gun amongst a stockpile of artefacts. It is therefore worthwhile for an investigator to analyze email communications and dig out unusual clues.


Data Acquisition

To begin an email investigation, the primary step is to acquire data for analysis. The artefacts available for analysis depend on how much information is acquired. Here’s how you can acquire email evidence for analysis:


  1. Google Takeout (Creating an offline archive of Google data)

  2. Using an email client to synchronize the email account over POP3/IMAP

Google Takeout

Google Takeout creates an offline backup of a Google account, which consists of all the Google application data stored in that user account.
Google Takeout - Download your data

Downloading Gmail Email Data

  • The investigator can download the data of a specific application such as Gmail, which Google provides in MBOX file format.

    Google - Download MBOX data
  • Archive from Google including emails
Takeout Data in MBOX file format
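
Once the Takeout archive is extracted, the MBOX file can be listed with Python's standard mailbox module. The following is a minimal sketch, assuming the export carries the Gmail headers X-GM-THRID and X-Gmail-Labels discussed later in this article; the function name summarize_mbox is hypothetical. It is advisable to run it against a working copy rather than the original evidence file.

```python
import mailbox


def summarize_mbox(path: str) -> list:
    """Return basic header fields for every message in an MBOX file."""
    summaries = []
    # create=False: fail instead of creating an empty file if the path is wrong
    box = mailbox.mbox(path, create=False)
    try:
        for msg in box:
            summaries.append({
                "from_line": msg.get_from(),           # content of the "From " separator line
                "thread_id": msg.get("X-GM-THRID"),    # Gmail thread ID, if present
                "labels": msg.get("X-Gmail-Labels"),   # Gmail labels, if present
                "subject": msg.get("Subject"),
                "date": msg.get("Date"),
            })
    finally:
        box.close()
    return summaries
```

Each returned dictionary can then be written to a report or filtered further.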

Email Client Synchronization

Email clients such as Outlook, Thunderbird & Apple Mail can be used to download the email data. Investigators need to ensure that the downloaded data is not modified or tampered with during or after synchronization with the email client.
Gmail provides two protocols for an email client to synchronize the data: IMAP and POP3.
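
One common way to show that acquired data has not changed is to record a cryptographic hash right after acquisition and recompute it before analysis. The following is a minimal sketch of that idea using SHA-256; the choice of algorithm and the function name sha256_of_file are assumptions, not something prescribed by a particular tool.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large archives do not need to fit in memory
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the digest computed before analysis equals the one recorded at acquisition, the file content is unchanged.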

Which protocol to use for creating an archive?

  • IMAP keeps the mailbox on the server and synchronizes it across multiple devices, so changes made on one device are reflected on the others in near real time.
    • Archive Format : OST (Offline Storage Table)
      • Outlook creates an OST file when an account is synchronized over IMAP.
      • An OST file can be converted to PST format for convenient email extraction & analysis.
      • Storage location : %LocalAppData%\Microsoft\Outlook
OST file
  • POP3 (Post Office Protocol version 3), in contrast, downloads messages to a single device. It does not synchronize mailbox state across multiple devices.
    • Archive Format : PST (Personal Storage Table)
      • Outlook creates a PST file when Gmail or other email accounts are synchronized over POP3.
      • The PST file can be used for analysis as it contains email data, meetings, calendar, contacts and people information.
      • Storage Location : %UserProfile%\Documents\Outlook Files
PST file
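
To locate these archives on an acquired image or an exported user profile, the directory tree can be searched by file extension. The following is a minimal sketch; the function name find_outlook_stores is hypothetical, and the root directory is whatever folder the investigator has mounted or copied.

```python
from pathlib import Path


def find_outlook_stores(root: str) -> list:
    """Return all .ost and .pst files found under root, sorted by path."""
    wanted = {".ost", ".pst"}
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in wanted  # case-insensitive extension match
    )
```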

Email Forensic Analysis

While investigating emails, the primary goal is to identify, collect and categorize evidence. To identify evidence, investigators can use the following search procedures:
  • Keyword based searching
  • Event Date/Time wise search
  • Suspected sender/receiver based searching
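
The three search procedures above can be combined into a single filter over parsed messages. The following is a minimal sketch using Python's standard email package; the function names matches and body_text are hypothetical, only text/plain parts are searched, and the start and end arguments are assumed to be timezone-aware datetimes.

```python
from datetime import datetime, timezone
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def body_text(msg) -> str:
    """Concatenate the decoded text/plain parts of a message."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            if payload is not None:
                charset = part.get_content_charset() or "utf-8"
                try:
                    parts.append(payload.decode(charset, errors="replace"))
                except LookupError:  # unknown charset name in the header
                    parts.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(parts)


def matches(msg, keyword=None, sender=None, start=None, end=None) -> bool:
    """Return True if the message satisfies every criterion that was given."""
    if sender is not None:
        address = parseaddr(msg.get("From", ""))[1].lower()
        if address != sender.lower():
            return False
    if start is not None or end is not None:
        try:
            sent = parsedate_to_datetime(msg.get("Date"))
        except (TypeError, ValueError):
            return False  # missing or unparsable Date header
        if sent.tzinfo is None:
            sent = sent.replace(tzinfo=timezone.utc)  # assumption: treat as UTC
        if start is not None and sent < start:
            return False
        if end is not None and sent > end:
            return False
    if keyword is not None:
        haystack = (msg.get("Subject", "") + "\n" + body_text(msg)).lower()
        if keyword.lower() not in haystack:
            return False
    return True
```

For example, matches(msg, keyword="invoice", sender="someone@example.com") keeps only messages from that address that mention the keyword in the subject or body.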

Email Header Analysis

Investigating email headers is a crucial part of an investigation, because the email metadata is stored within them.
Email header analysis can reveal the source, destination, email client and sender IP address, and can help determine whether an email is spoofed or authentic.

Google Takeout MBOX file analysis:



An MBOX file stores information in plain text format and can be easily traversed using a hex editor or a text editor such as Notepad.
MBOX file analysis
  1. The start of each email is denoted by a line beginning with “From ” (the word From followed by a space)
  2. 1589818349288168488@xxx is a Gmail-assigned identifier; in this sample it matches the X-GM-THRID header value
  3. Followed by the received timestamp in the form DDD MMM DD HH:MM:SS (receiving server time zone offset) YYYY
  4. X-GM-THRID is a thread ID that groups messages into conversations in the same manner as the web interface
  5. X-Gmail-Labels defines where the message was stored and in which category, e.g. Important, Sent, Inbox
  6. Return-Path records the envelope sender address, which is where bounce (non-delivery) notifications are sent. It also helps determine whether an email is disguised: a Return-Path that differs from the address shown in the From header deserves a closer look, although bulk mailers legitimately use a different one.
  7. Received: Records one hop in the delivery path. Every server that handles the message adds its own Received header at the top, so the headers read newest first.
    1. from: Name and IP address of the host that handed the message over
    2. by: The receiving server (for example smtp.gmail.com), followed by "with ESMTPSA id"
      1. The id is the queue or transaction identifier assigned by the receiving server. It helps identify the mail transfer agent and correlate the message with that server's logs.
      2. ESMTPSA indicates that the message was submitted over Extended SMTP secured with TLS (S) after the sender authenticated (A)
    3. for: The recipient address the message was accepted for
    4. version defines the SSL/TLS protocol version used. For example, version=TLS1 means TLS version 1.0 was used.
    5. cipher=ECDHE-RSA-AES128-SHA bits=128/128 defines the following:
      1. ECDHE: ephemeral elliptic-curve Diffie-Hellman was used for key exchange
        • It is a key agreement protocol that allows two parties, each having an elliptic-curve public–private key pair, to establish a shared secret over an insecure channel
      2. RSA was used to authenticate the server, and AES with a 128-bit key (AES128) was used to encrypt the session
      3. SHA-1 was used for message integrity (as an HMAC); bits=128/128 refers to the 128-bit AES key
    6. From: "Sender name"
    7. To:
    8. Subject: email subject
    9. Date: DDD, DD MMM YYYY HH:MM:SS (Time Zone)
  8. Message-ID:
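    1. Message-ID is a globally unique identifier assigned to the message by the originating mail system
    2. If forgery is suspected, the investigator can compare the Message-ID and its domain against other copies of the message and against the claimed sending system to check whether a similar message was modified, forged and resent.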
  9. MIME-Version: 1.0 declares the MIME version of the email; the accompanying Content-Type header defines the content type, for example:
    1. text/plain
    2. text/html
    3. image/jpeg
    4. image/png
    5. audio/mpeg
    6. audio/ogg
    7. audio/*
    8. video/mp4
    9. application/octet-stream
    10. multipart/mixed
  10. MIME parts are separated by boundaries; a multipart/mixed email has a starting boundary for every MIME part present in the message.
  11. X-Mailer: Identifies the email client used to compose and send the email, for example Microsoft Outlook 15.0
  12. ARC Headers: ARC (Authenticated Received Chain) preserves email authentication results across the intermediaries that forward a message on to its final recipient, and lets each intermediary's handling be verified. It also helps to establish whether an email has been spoofed or modified in transit. There are three key components to ARC:
ARC headers present in the MBOX file
    1. ARC-Authentication-Results header: Contains the email authentication results in the form of the following records:
ARC Authentication Results


      1. SPF (Sender Policy Framework)
        • spf=pass (domain address: domain of XXX designates XXX.XXX.XXX.XXX as permitted sender)
        • SPF states whether the sending IP address is permitted to send mail for the domain
        • The SPF record contains the result, such as pass or fail
      2. DKIM (DomainKeys Identified Mail)
        • DKIM lets the sending organization take responsibility for a message by signing it, so that changes in transit can be detected.
        • dkim=pass header.i=@domain address header.s=XXXXXX header.b=XXXXXX
        • header.b = the digital signature of the mail message (shown truncated in the results header); it covers the signed headers and a hash of the body
        • header.s = the selector used to look up the signer's public key in DNS
      3. DMARC (Domain-based Message Authentication, Reporting and Conformance)
        • DMARC provides email protection and prevents fraudulent use of legitimate brands for email forging or attacks
        • dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com
        • p=REJECT is the policy published by the domain and sp=REJECT is the policy for its subdomains. Both instruct the receiver to reject email that fails DMARC, that is, email for which neither SPF nor DKIM passes in alignment with the From domain.
        • dis=NONE is the disposition Google actually applied; NONE means no policy action (quarantine or reject) was taken on the message
    2. ARC-Message-Signature: Contains a DKIM-like signature that takes a snapshot of the message, covering header fields such as To, From and Subject, and the body
ARC Message Signatures
    3. ARC-Seal: Contains a signature that covers the ARC header fields themselves, including the ARC-Message-Signature and the ARC-Authentication-Results header information
ARC Seal
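
The SPF, DKIM and DMARC verdicts described above can be pulled out of a message programmatically. The following is a minimal sketch that scans the Authentication-Results and ARC-Authentication-Results headers with a regular expression; the function name auth_results is hypothetical, and the pattern only captures the verdict keyword, not the surrounding detail.

```python
import re
from email import message_from_string

# Matches "spf=pass", "dkim=fail", "dmarc=pass" and similar verdicts
_VERDICT = re.compile(r"\b(spf|dkim|dmarc)=([a-z]+)", re.IGNORECASE)


def auth_results(msg) -> dict:
    """Return the first SPF, DKIM and DMARC verdict found in the headers."""
    results = {}
    for name in ("Authentication-Results", "ARC-Authentication-Results"):
        for value in msg.get_all(name, []):
            for method, verdict in _VERDICT.findall(value):
                # Keep the first verdict seen for each method
                results.setdefault(method.lower(), verdict.lower())
    return results
```

A result such as {"spf": "pass", "dkim": "fail"} is a pointer for manual review, not a conclusion by itself.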

Observation:

  • The email headers were available in plain text format when we downloaded the email data from Google Takeout.
  • This yielded many pointers that can be used to identify the source, agent and server, to detect phishing or spoofing activity, and to authenticate the legitimacy of email evidence.
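
To identify the source and the servers involved, the Received headers can be parsed into a list of hops. The following is a minimal sketch based on regular expressions; the function names parse_received and received_chain are hypothetical, and real-world Received headers vary enough between mail servers that some fields will come back as None.

```python
import re
from email import message_from_string
from email.utils import parsedate_to_datetime


def parse_received(value: str) -> dict:
    """Extract the common clauses from one Received header value."""
    flat = " ".join(value.split())           # undo header folding
    info, _, stamp = flat.rpartition(";")    # the timestamp follows the last ';'

    def grab(pattern):
        found = re.search(pattern, info, re.IGNORECASE)
        return found.group(1) if found else None

    hop = {
        "from": grab(r"\bfrom\s+(\S+)"),
        "ip": grab(r"\[(?:IPv6:)?([0-9a-fA-F.:]+)\]"),
        "by": grab(r"\bby\s+(\S+)"),
        "with": grab(r"\bwith\s+(\S+)"),
        "id": grab(r"\bid\s+(\S+)"),
        "for": grab(r"\bfor\s+<?([^>\s;]+)>?"),
    }
    try:
        hop["time"] = parsedate_to_datetime(stamp.strip())
    except (TypeError, ValueError):
        hop["time"] = None
    return hop


def received_chain(msg) -> list:
    """Return the hops in delivery order (oldest first)."""
    # Servers prepend their header, so the list is reversed to read chronologically
    return [parse_received(v) for v in reversed(msg.get_all("Received", []))]
```

The first hop in the returned list is the one closest to the sender, which is usually where the originating host and IP address appear.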

Additional Sources of Investigation:

Analyzing Emails within Gmail web interface:
  1. Analyzing suspected emails
    1. The screenshot below shows an example of a spam email received in the inbox, which could be used with malicious intent.

Sample Spoofed Email
    1. The investigator should look out for the sender email address and the corresponding display name, which might deceive the user
    2. Images may carry tracking or malicious content that can lead to information leakage, hence they should not be previewed
    3. The third pointer shows that the user had subscribed to the website, which is why the spam emails are being received
    4. The Show original option lets you investigate the raw contents of the message along with the email header information
Indicators of Spoofing - At a Glance
    1. The original message contains the email header information along with the complete email presented in raw format. The Download original option allows the investigator to perform offline analysis of the email
    2. The email header reveals the Message-ID and the creation date/time; the message was delivered 50 seconds after it was created
    3. The From name and the email address do not match the identity the message presents, hence information masking was attempted
    4. SPF and DKIM passed because the message was sent by servers authorized for the sending domain and carried a valid signature. A pass authenticates the sending domain only; it does not prove that the displayed sender identity is genuine
Original Message Header
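
The delivery delay shown in the header view can be reproduced from the raw message by comparing the Date header with the timestamp of the newest Received header. The following is a minimal sketch under that assumption; the function name delivery_delay is hypothetical, and the result is only as reliable as the clocks of the systems that wrote the two headers.

```python
from datetime import timedelta
from email import message_from_string
from email.utils import parsedate_to_datetime
from typing import Optional


def delivery_delay(msg) -> Optional[timedelta]:
    """Return the time between the Date header and the newest Received stamp."""
    date_raw = msg.get("Date")
    received = msg.get_all("Received", [])
    if not date_raw or not received:
        return None
    # The newest Received header is the first one; its timestamp follows the last ';'
    stamp = " ".join(received[0].rpartition(";")[2].split())
    try:
        created = parsedate_to_datetime(date_raw)
        delivered = parsedate_to_datetime(stamp)
    except (TypeError, ValueError):
        return None
    if created.tzinfo is None or delivered.tzinfo is None:
        return None  # cannot compare without time zone information
    return delivered - created
```

An unusually long or negative delay can point to queued bulk mail or to a manipulated Date header and is worth noting in the timeline.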


Observations

  • The sender name and email address do not match the identity the message presents
  • The email header contains Precedence: bulk, which indicates that the message was sent by a bulk mailing system.
  • The DKIM signature covers the headers From:To:Reply-To:List-ID:List-Unsubscribe. The presence of List-ID and List-Unsubscribe confirms that this is a bulk message, that the recipient is part of a mailing list, and that the user can unsubscribe
Indications of Spoofing - Detailed Analysis
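
The indicators listed in these observations can be checked automatically as a first triage step. The following is a minimal sketch; the function name spoofing_indicators and the wording of the findings are hypothetical, and a finding is a prompt for manual review rather than proof of spoofing, since legitimate bulk mail triggers several of these checks.

```python
from email import message_from_string
from email.utils import parseaddr


def domain_of(address: str) -> str:
    """Return the lower-cased domain part of an email address."""
    return address.rpartition("@")[2].lower()


def spoofing_indicators(msg) -> list:
    """Return a list of header observations that deserve manual review."""
    findings = []
    display, from_addr = parseaddr(msg.get("From", ""))
    return_path = parseaddr(msg.get("Return-Path", ""))[1]
    reply_to = parseaddr(msg.get("Reply-To", ""))[1]

    # A display name that shows a different address can mislead the reader
    if "@" in display and from_addr.lower() not in display.lower():
        findings.append("display name shows a different email address")
    if return_path and domain_of(return_path) != domain_of(from_addr):
        findings.append("Return-Path domain differs from From domain")
    if reply_to and domain_of(reply_to) != domain_of(from_addr):
        findings.append("Reply-To domain differs from From domain")
    if msg.get("Precedence", "").strip().lower() == "bulk":
        findings.append("Precedence: bulk")
    if msg.get("List-ID") or msg.get("List-Unsubscribe"):
        findings.append("mailing-list headers present")
    return findings
```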

Investigating Account Activity

  • Google stores account activity information such as search queries, voice instructions, last logins, and device & operating system usage.
  • This information helps the investigator build a timeline of how an event took place.
  • Correlating these activities with other evidence reconstructs how an event occurred and which activities were performed to accomplish it.
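
Building such a timeline amounts to merging events from several sources and ordering them by time. The following is a minimal sketch; the Event class and build_timeline function are hypothetical, and every timestamp is assumed to be timezone-aware so that events recorded in different time zones sort correctly.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    when: datetime   # timezone-aware timestamp of the event
    source: str      # where the event came from, e.g. "mbox" or "account activity"
    detail: str      # short description of what happened


def build_timeline(*sources) -> list:
    """Merge events from any number of sources into one chronological list."""
    merged = [event for source in sources for event in source]
    # Normalize to UTC for comparison so mixed time zones order correctly
    return sorted(merged, key=lambda event: event.when.astimezone(timezone.utc))
```

Email events extracted from an MBOX file and login events taken from the account activity pages can be passed in as separate lists.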

Google Account Activity

Account Sessions & Activity

  • Account activity information also includes a list of current and previous login sessions.
  • Information such as the login IP address and the GeoIP location corresponding to it can be determined easily.
  • Access type information helps determine the agent used to access Google services, such as a mobile device, browser, or email client.

Account Activity

Archived PST/OST File Analysis:

  • Archives created using the Microsoft Outlook email client store email data in the form of a PST or OST file.
  • A PST file can be opened using the Microsoft Outlook email client. An OST file, however, is tied to the Outlook profile that created it, so once it is detached from that profile it cannot simply be reopened for analysis.
  • Investigators need to convert the OST file to PST format for better analysis.

OST File Format

OST File - Hex View
  • An OST file stores information in a proprietary binary format (by default obfuscated with Outlook's compressible encryption), so a specific file viewer is required to preview its contents.

PST File Format

PST File - Hex View
  • PST files also store information in a proprietary binary format (by default obfuscated in the same way), so a file viewer is required to preview their contents.
  • A PST file is a container for all the email messages, calendar, people & appointment data. It can be attached to the Microsoft Outlook email client and analyzed further

Analyzing Emails in Microsoft Outlook Email Client

Keyword based searching

  • Specific keyword-based searches, including field-scoped queries such as from: or subject:, can be performed using Outlook.
  • Emails can also be filtered by date and time
Keyword Searching using Microsoft Outlook

Email Header Analysis

  • Outlook can preview an email message in rich text format together with its attachments, and also allows users to inspect the email headers.
  • The email header can be analyzed by opening the email's properties, which show the following information
Email Header Analysis using Microsoft Outlook

Conclusion

The acquisition and analysis process described above is meant for manual investigations. Forensic utilities such as Magnet Forensics AXIOM, Oxygen cloud analyzer and other cloud acquisition tools perform a similar procedure through API integration. Investigators are advised to rely on automated utilities where possible, as indexing & searching become faster and easier. Email forensics is not limited to the emails themselves: artifacts such as account activities, cloud data & Google app data should be correlated with the email evidence for proper event reconstruction.
Investigations involving data leakage, phishing attacks, malware analysis, hacking & other malicious activity require an investigator to perform an email forensic investigation. For webmail, it is advised to perform a complete cloud account acquisition to uncover artifacts.