Gmail History Records in Forensic Email Investigations

agungor · July 7, 2020, 5:14pm

Just published a new blog post on Gmail History Records:

We added support for Gmail History Record acquisition in FEC’s last update, but haven’t had a chance to go over what history records are and how we can use them. This post aims to provide some information as well as a couple of use cases.

andrisreinman · July 8, 2020, 10:20am

I’ve been following the Gmail hidden details posts and found them very interesting. I also have one thing to share about dates in Gmail, not sure if it has been covered so far: it appears that Gmail uses a timestamp as part of its proprietary X-GM-MSGID and X-GM-THRID message properties for IMAP. It is mostly redundant, as you get the same date from INTERNALDATE. The only use I can think of is treating X-GM-THRID as the date marker for the first message in a thread when that message has already been deleted.

E.g., you receive a message and a new X-GM-THRID is created using the current time as the value. You reply to the message, and the reply gets the same X-GM-THRID as the original. You now delete the first message. Even though that message is gone, you can still tell from the X-GM-THRID value of the reply in your Sent Mail folder that an earlier message existed and when it arrived. You can then match this date with the Message-ID found in the References or In-Reply-To header.

Anyway, the X-GM-MSGID/THRID values look like this: 1671369405424894052
You can convert it into a timestamp by first converting this value to hex: 1731e59e72ce8064
Then take the first 11 hex digits and convert the result into a decimal number: 1731e59e72c -> 1593942075180
Remove the last 3 digits (the milliseconds) and you end up with a Unix timestamp: 1593942075 -> Sunday, 5. July 2020 9:41:15 (UTC)
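
The conversion above can be sketched in a few lines of Python. This is only an illustration of the steps described in this post, not an official Gmail algorithm; the function name `gmail_id_to_datetime` is made up for the example. Shifting the 64-bit ID right by 20 bits is the same as keeping the first 11 of its 16 hex digits.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def gmail_id_to_datetime(gmail_id: int) -> datetime:
    """Decode the timestamp that appears to be embedded in an
    X-GM-MSGID / X-GM-THRID value (hypothetical helper)."""
    # A 64-bit ID has 16 hex digits; dropping the low 20 bits (5 hex digits)
    # leaves the first 11 hex digits, read here as milliseconds since the epoch.
    millis = gmail_id >> 20
    return UNIX_EPOCH + timedelta(milliseconds=millis)

print(gmail_id_to_datetime(1671369405424894052).isoformat())
# 2020-07-05T09:41:15.180000+00:00
```

If the ID is only available in hex form, `int(hex_string, 16)` turns it into the integer this function expects.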

agungor · July 8, 2020, 5:41pm

Hi Andris,

This is extremely helpful, thank you! Although some of these timestamps might be duplicated elsewhere, it is important to check all available timestamps during a forensic investigation to make sure they all agree with each other. Otherwise, there is a chance that the message was altered and some of the timestamps had been overlooked. So, this is awesome!

I would like to bring some clarity for anyone who is not familiar with the data points Andris mentioned above. These are actually the internal IDs (or service IDs)—not to be confused with Message-ID—and Thread IDs—not to be confused with Thread-Index or Conversation Index—of messages within Gmail. So, they are applicable regardless of the use of IMAP and they can be queried via Gmail API. Gmail also provides some IMAP protocol extensions that allow these values, among others, to be queried over IMAP as well—X-GM-MSGID and X-GM-THRID.

If you are familiar with FEC, you would see these values in our Gmail API Downloaded Items logs in the Service ID and Thread ID columns in hexadecimal form.

As Andris explained brilliantly, taking the first eleven digits of the hex values and converting them to decimal gives us an Epoch timestamp in milliseconds. Here is an example from a recent Downloaded Items log (I took out some columns for brevity):

|Service ID|MIME Hash|Gmail Labels|Thread ID|
|---|---|---|---|
|1732f1fc6ee3a193|8FA55CD0…|IMPORTANT;CATEGORY_PERSONAL;INBOX|1732f1fc6ee3a193|
|173106a253a37c64|06917114…|UNREAD;CATEGORY_UPDATES;INBOX|173106a253a37c64|

Let’s take the Service ID of the first message:

Service ID: 1732f1fc6ee3a193

First 11 digits converted to decimal: 1732f1fc6ee ➫ 1594223478510

Conversion from Epoch to Human Readable: 1594223478510 ➫ July 8, 2020 3:51:18.510 PM (UTC)

The July 8, 2020 3:51:18.510 PM (UTC) timestamp matches the time Google’s server received the message after it was submitted.

I didn’t remove the last three decimal digits; just treated them as the millisecond component of the Epoch timestamp.

As Andris mentioned, the Thread ID is even more valuable because it is carried over from previous messages in the same conversation thread!

Note that I also had a Thread-Index header field in this message as part of the message headers. This was created by Outlook and looked as follows:

Thread-Index: AdZVP5FZ1lMVdyWGSQmgf4yUt3eB7w==

Decoding this gives a header timestamp of July 8, 2020 3:50:56.3837952 PM (UTC), about 22 seconds before the timestamp we decoded from the Service ID. This makes sense because the Thread-Index was created on the client-side before the message hit the server. We covered this before here:

Thanks to Andris, I think another update to the Dates in Hiding blog post is in order!
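
For anyone who wants to reproduce the Thread-Index decoding mentioned above, here is a minimal Python sketch. It assumes the commonly described layout in which the first six bytes of the Base64-decoded value hold the high-order bytes of a Windows FILETIME (100-nanosecond ticks since January 1, 1601 UTC); the function name `thread_index_to_datetime` is invented for this example. Python's `datetime` only keeps microseconds, so the last digit of the fractional second is dropped.

```python
import base64
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def thread_index_to_datetime(thread_index: str) -> datetime:
    """Decode the header timestamp of a Thread-Index value (hypothetical helper)."""
    raw = base64.b64decode(thread_index)
    # Assumption: bytes 0-5 are the top six bytes of a big-endian FILETIME;
    # the two missing low-order bytes are filled with zeros.
    filetime = int.from_bytes(raw[:6] + b"\x00\x00", "big")
    # FILETIME counts 100 ns ticks, so integer-divide by 10 to get microseconds.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)

print(thread_index_to_datetime("AdZVP5FZ1lMVdyWGSQmgf4yUt3eB7w==").isoformat())
# 2020-07-08T15:50:56.383795+00:00
```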

andrisreinman · July 8, 2020, 6:17pm

Thanks! I was actually looking for a way to get the “real” timestamp of when a message was added to a mailbox. Unfortunately, the timestamp in the MSGID/THRID values reflects the INTERNALDATE value, which can be set to anything with an APPEND command. This also means that while UIDs and sequence numbers in IMAP are always incremental (a message added later has a larger UID), MSGID/THRID values are not (a message added later may have a lower MSGID than existing messages).
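
A rough way to check that relationship on a given message is to compare the timestamp decoded from the ID with the INTERNALDATE returned by the server. The sketch below is only an illustration: the helper name `id_matches_internaldate`, the one-second tolerance, and the INTERNALDATE string in the usage line are assumptions, not values taken from a real mailbox.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def id_matches_internaldate(gmail_id: int, internaldate: str) -> bool:
    """Return True if the timestamp decoded from an X-GM-MSGID / X-GM-THRID
    value agrees with an IMAP INTERNALDATE string (hypothetical helper)."""
    # First 11 hex digits of the 64-bit ID, read as milliseconds since the epoch.
    id_time = UNIX_EPOCH + timedelta(milliseconds=gmail_id >> 20)
    # IMAP date-time looks like "05-Jul-2020 09:41:15 +0000" (English month names).
    internal = datetime.strptime(internaldate.strip(), "%d-%b-%Y %H:%M:%S %z")
    # INTERNALDATE has one-second resolution, so allow for the millisecond part.
    return abs(id_time - internal) < timedelta(seconds=1)

# Illustrative INTERNALDATE value, assumed for the example ID used earlier.
print(id_matches_internaldate(1671369405424894052, "05-Jul-2020 09:41:15 +0000"))
# True
```

A mismatch here would not prove tampering on its own, but it would be a reason to look more closely at how the message got into the mailbox.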