Thread-Index Header Field

Jtylor · January 17, 2020, 6:52pm

Hello,

I’m examining a few emails and came across a header field named “Thread-Index” in some of them right above the “Date” header.

The field is populated similarly to the example below:

Thread-Index: AdQc2DN7rLoS3hgnE/O76rpFzxN/EwAddF4A

Any idea what this represents? Could it contain information that can help us to find out if the emails are legitimate? Thanks in advance.

agungor · January 18, 2020, 5:15pm

This is a conversation index value (PR_CONVERSATION_INDEX MAPI property) in Base64 encoded form. I have some information on the format and a free tool to help decode it here: E-mail Conversation Index Analysis for Computer Forensics

Decoding this one results in the following output:

Conversation Index: AdQc2DN7rLoS3hgnE/O76rpFzxN/EwAddF4A
Header Timestamp: 07/16/2018 07:40:00.2799616 (UTC)
GUID: acba12de-1827-13f3-bbea-ba45cf137f13
Number of Children: 1

Child No: 1
Time Difference: 14:03:22.5476096
Mode: 0
Random No: 0
Sequence Count: 0
Calculated Timestamp: 07/16/2018 21:43:22.8275712 (UTC)
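The header part of this output can be reproduced with a short Python sketch. It assumes the layout this particular value appears to use: the first 6 bytes are the high-order bytes of a FILETIME (100 ns intervals since 1601-01-01 UTC), followed by a 16-byte GUID. The function name `decode_header` is my own, and Python's `datetime` only keeps microseconds, so the last digit of the timestamp is dropped.

```python
import base64
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def decode_header(thread_index: str):
    """Decode the 22-byte header, assuming a 6-byte timestamp + 16-byte GUID."""
    raw = base64.b64decode(thread_index)
    # Pad the 6 high-order bytes with two zero bytes to get a 64-bit FILETIME.
    ticks = int.from_bytes(raw[:6] + b"\x00\x00", "big")
    # FILETIME counts 100 ns intervals; datetime keeps microseconds only.
    timestamp = FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
    g = raw[6:22].hex()
    guid = f"{g[:8]}-{g[8:12]}-{g[12:16]}-{g[16:20]}-{g[20:]}"
    return ticks, timestamp, guid

ticks, timestamp, guid = decode_header("AdQc2DN7rLoS3hgnE/O76rpFzxN/EwAddF4A")
print(ticks)      # 131762004002799616
print(timestamp)  # 2018-07-16 07:40:00.279961+00:00
print(guid)       # acba12de-1827-13f3-bbea-ba45cf137f13
```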

How Can This Help?

This might help in your investigation depending on what you are looking for. A few ideas:

1. Is the GUID Really Unique? The GUID part of the header block is designed to be unique. If you find the same GUID in multiple messages that seem completely disconnected (i.e., different participants, thread, etc.), then this might be a red flag.

2. Origination Date of First Message: The header timestamp reflects the submission time of the initial message in the thread. You can compare this to PR_CLIENT_SUBMIT_TIME to corroborate the evidence. A major difference here can be a red flag.

3. Structure of The Thread: As new messages are added to a thread (e.g., replies, forwards, etc.), the conversation index is expanded in 5-byte chunks. If the message you are looking at doesn’t match the thread structure reflected in the conversation index, this is a data point to consider. Keep in mind that the sender can change the quoted message body as they wish to alter the appearance of the message. So, some variations between the message structure and conversation index are to be expected.

4. Composition Time of Children: In my testing, I’ve found that Outlook sets the time difference of child blocks based on when the child message is created rather than when it is actually submitted. In some cases, this might give you a clue as to how long the person took between creating the message (i.e., hitting reply, forward, etc.) and actually submitting the message (i.e., finishing composition and hitting “Send”).
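For idea 3, the number of replies or forwards recorded in a conversation index can be read off its length alone, assuming a 22-byte header followed by 5-byte child blocks. A minimal sketch (the helper name `child_count` is hypothetical):

```python
import base64

HEADER_LEN = 22  # header block: timestamp + GUID
CHILD_LEN = 5    # each reply/forward appends one 5-byte child block

def child_count(thread_index: str) -> int:
    """Return how many child blocks follow the header of a Base64 conversation index."""
    raw = base64.b64decode(thread_index)
    extra = len(raw) - HEADER_LEN
    if extra < 0 or extra % CHILD_LEN != 0:
        raise ValueError(f"unexpected conversation index length: {len(raw)} bytes")
    return extra // CHILD_LEN

print(child_count("AdQc2DN7rLoS3hgnE/O76rpFzxN/EwAddF4A"))  # 1
```

A count that disagrees with the number of quoted messages is only a data point, for the reasons given in idea 3.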

I should add that I personally take conversation index evidence with a grain of salt and use it as a corroborating data point. This is mainly because the specification from Microsoft is not entirely clear, there are multiple, varying implementations, some of which do not adhere to the specification, and there are many nuances due to local time being used and the effect of time inaccuracies on the devices that participate in an email thread. Still, a great tool to have in your toolbox when examining emails.

Jtylor · January 23, 2020, 6:20pm

Thank you! I have another thread index. I decoded the Base64 value and ended up with the hex representation below.

0101D449C3177E1347ABC447FA1327DB1C16FA410B21

This looks slightly different than the examples. Any tips on decoding it?

agungor · January 23, 2020, 11:13pm

This one actually seems to be more in line with Microsoft’s official documentation.

01 One reserved byte
01D449C317 5 bytes for the header date in FILETIME format
7E1347ABC447FA1327DB1C16FA410B21 GUID

Here is what I get:

Header Timestamp: 09/11/2018 11:32:15.3913344 (UTC)
GUID: 7e1347ab-c447-fa13-27db-1c16fa410b21
Number of Children: 0

thinktoday · May 27, 2020, 6:52am

Hi, I have been trying to parse the Thread-Index in Java. Your inputs give the correct date, but when I use a Thread-Index from my own mailbox, I get a date in the year 1830.

Here is the result for AdQc2DN7rLoS3hgnE/O76rpFzxN/EwAddF4A, mentioned in the question:

Hex :01D41CD8337BACBA12DE182713F3BBEABA45CF137F13001D745E00
FILETIME: 01D41CD8337B
guid: ACBA12DE182713F3BBEABA45CF137F13
childs: [001D745E00]
zero paded file time: 01D41CD8337B0000
nano seconds: 131762004002799616
Result Time:2018-07-16T07:40:00.279961600Z

Here is my sample conversation index: AQHWLRNo4NaOjvXU8EODe0ZotrA8B6itzaxf

Hex :0101D62D1368E0D68E8EF5D4F043837B4668B6B03C07A8ADCDAC5F
FILETIME: 0101D62D1368
guid: E0D68E8EF5D4F043837B4668B6B03C07
childs: [A8ADCDAC5F]
zero paded file time: 0101D62D13680000
nano seconds: 72574558102093824
Result Time: 1830-12-25T07:56:50.209382400Z

I also got the wrong output for 0101D449C3177E1347ABC447FA1327DB1C16FA410B21, mentioned in one of the comments:

Hex :0101D449C3177E1347ABC447FA1327DB1C16FA410B21
FILETIME: 0101D449C317
guid: 7E1347ABC447FA1327DB1C16FA410B21
childs: []
zero paded file time: 0101D449C3170000
nano seconds: 72572482285404160
Result Time: 1830-12-22T22:17:08.540416Z

I followed this link to build the conversion code.

agungor · May 27, 2020, 10:33pm

Unfortunately, not all conversation index values are created equally. You will have to make exceptions in your code to account for the differences.

The header timestamp in each of your examples is the five bytes right after the leading 01 byte:

0101D62D1368E0D68E8EF5D4F043837B4668B6B03C07A8ADCDAC5F
Header timestamp bytes: 01D62D1368. This decodes to 2020/05/18 12:54:02.6467328 (UTC)

0101D449C3177E1347ABC447FA1327DB1C16FA410B21
Header timestamp bytes: 01D449C317. This decodes to 2018/09/11 11:32:15.3913344 (UTC)

It looks like you are parsing for the header timestamp one byte too soon. This is necessary in some cases, and isn’t in others.
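The 1830 dates can be reproduced by reading the timestamp at the wrong offset. The Python sketch below, with a hypothetical helper `filetime_at`, reads the same hex string both ways: 6 bytes from offset 0 (which works for the value in the original question) and 5 bytes from offset 1 (after the reserved byte).

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_at(hex_str: str, offset: int, length: int) -> datetime:
    """Read `length` high-order FILETIME bytes at `offset`, zero-padded to 8 bytes."""
    raw = bytes.fromhex(hex_str)
    ticks = int.from_bytes(raw[offset:offset + length].ljust(8, b"\x00"), "big")
    # FILETIME counts 100 ns intervals; datetime keeps microseconds only.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)

sample = "0101D62D1368E0D68E8EF5D4F043837B4668B6B03C07A8ADCDAC5F"
print(filetime_at(sample, 0, 6))  # 1830-12-25 07:56:50.209382+00:00
print(filetime_at(sample, 1, 5))  # 2020-05-18 12:54:02.646732+00:00
```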

thinktoday · May 28, 2020, 5:46am

Thank you @agungor, it works. There is another problem, with parsing the time difference from the child blocks.
Consider the example mentioned in the question: AdQc2DN7rLoS3hgnE/O76rpFzxN/EwAddF4A. I got the same time difference as you mentioned above, and here is the flow:

Child Block Hex:001D745E00
Child Block Binary:0000000000011101011101000101111000000000	 Length:40
Child Block Segerated:0	0000000000111010111010001011110	0000	0000
Since the 1st bit is '0', I padded with 15 high zero bits and 18 low zero bits:
0000000000000000000000000111010111010001011110000000000000000000
Difference Time in NanoSeconds:506025476096
Difference Time in milliseconds:50602547
difference Time is : 14:03:22

But for Conversation Index: AQHWLRNo4NaOjvXU8EODe0ZotrA8B6itzaxf

zero paded file time  : 1D62D1368000000
nano seconds          : 132342800426467328
Result Header Time :2020-05-18T12:54:02.646732800Z(Got this correct) 
Hex:A8ADCDAC5F
Child Block Binary    : 1010100010101101110011011010110001011111	 Length:40
Child Block Segerated:1	0101000101011011100110110101100	0101	1111
Since the 1st bit is '1', I padded with 10 high zero bits and 23 low zero bits: 0000000000010100010101101110011011010110000000000000000000000000
Difference Time in NanoSeconds:5725048967004160
Difference Time in milliseconds:572504896700
difference Time is : 159029:08:16
Sum of header and child nanosecond:138067849393471488
Child Block Time   : 2038-07-09T18:02:19.347148800Z

When the child block's 1st bit is 0 it works fine, but I suspect I have made a mistake in the case where the 1st bit is 1.
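The arithmetic in the two walkthroughs above can be written as a short Python sketch. It follows the layout used in this post (1 mode bit, 31 bits of time difference, 4 random bits, 4 sequence bits) and shifts the 31 bits left by 18 when the mode bit is 0 and by 23 when it is 1, which is my reading of the Microsoft documentation. Note that the values labelled "nanoseconds" in the output are really FILETIME ticks of 100 ns each. The function name `decode_child` is hypothetical. The sketch only reproduces the numbers; it does not explain why the mode 1 result lands in 2038.

```python
from datetime import timedelta

def decode_child(block_hex: str):
    """Split a 5-byte child block into (mode, delta_ticks, random, sequence)."""
    value = int(block_hex, 16)            # 40-bit integer
    mode = value >> 39                    # first bit
    diff31 = (value >> 8) & 0x7FFFFFFF    # next 31 bits
    random_no = (value >> 4) & 0xF        # next 4 bits
    sequence = value & 0xF                # last 4 bits
    # Mode 0 restores 18 dropped low bits of the FILETIME delta; mode 1 restores 23.
    delta_ticks = diff31 << (23 if mode else 18)
    return mode, delta_ticks, random_no, sequence

for block in ("001D745E00", "A8ADCDAC5F"):
    mode, ticks, rnd, seq = decode_child(block)
    # ticks // 10 converts 100 ns ticks to microseconds for display.
    print(block, mode, ticks, timedelta(microseconds=ticks // 10), rnd, seq)
```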

thinktoday · May 29, 2020, 5:10am

I have gone through this article on decoding child blocks.
If the 1st bit is ‘0’, I pad with the high 15 bits and the low 18 bits, as mentioned in the article. But it does not say what to do when the 1st bit is ‘1’.
Referring to the Microsoft documentation, I tried padding with the high 10 bits and the low 23 bits when the 1st bit is ‘1’, but the results are incorrect.

thinktoday · May 30, 2020, 3:33pm

For this conversation index, the actual time difference of the reply is less than 1.7 years, so according to the Microsoft documentation the child block should start with 0, but it does not. Any ideas? I have searched many forums, and most of them use this article as a reference.

I have also tried parsing it with the free tool suggested in the article, but it shows an error message.

Email_World · May 28, 2021, 4:12pm

How do I determine whether to skip the first byte or not? I have done a few experiments and found that some thread index values need the 1st byte and some do not.

agungor · May 28, 2021, 5:28pm

Once you have it in hex, you want to locate the FILETIME structure. We have a table below that shows what a hex FILETIME value looks like for different points in time.

Dates in Hiding—Uncovering Timestamps in Forensic Email Examination (metaspike.com)

So, if your hex string starts with something between 01A - 01F, there is often no need to drop any bytes.
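As a rough Python sketch of that rule of thumb: if the hex string starts with 01 followed by a digit in the range A to F, treat the FILETIME as starting at the first byte; otherwise assume a leading reserved byte and skip it. The function name `filetime_offset` is hypothetical, and the fallback branch is an assumption based on the examples in this thread, not a documented rule.

```python
def filetime_offset(hex_str: str) -> int:
    """Guess the byte offset of the FILETIME within a hex conversation index."""
    h = hex_str.upper()
    if h.startswith("01") and h[2:3] in tuple("ABCDEF"):
        return 0  # e.g. 01D41C...: the timestamp starts right away
    return 1      # ASSUMPTION: e.g. 0101D6...: skip the leading reserved byte

print(filetime_offset("01D41CD8337BACBA12DE182713F3BBEABA45CF137F13001D745E00"))  # 0
print(filetime_offset("0101D62D1368E0D68E8EF5D4F043837B4668B6B03C07A8ADCDAC5F"))  # 1
```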

Email_World · June 4, 2021, 5:58pm

Message-ID: <0a6f01d5c4cc$b9145e70$2b3d1b50$@gmail.com>

In the Message-ID above, the highlighted characters show the time (UTC) in FILETIME format. Does anyone have an idea what the other values are?

I would like to validate a few Message-IDs and am wondering whether the characters that are not highlighted have any significance.

@agungor - I would appreciate any ideas you can give.

Thank You
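For reference, a Python sketch of how a FILETIME could be pulled out of a Message-ID shaped like the one above. It assumes the timestamp is the 16 hex digits made up of the last 8 characters of the first `$`-separated group plus the whole second group (`01d5c4cc` + `b9145e70`). That assumption and the helper name `message_id_filetime` are mine; the meaning of the remaining characters is not covered here.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def message_id_filetime(message_id: str) -> datetime:
    """Extract an assumed 64-bit FILETIME from a '$'-delimited Message-ID."""
    local_part = message_id.strip().strip("<>").split("@")[0]
    groups = local_part.split("$")
    # ASSUMPTION: last 8 hex digits of group 0 + all of group 1 form the FILETIME.
    ticks = int(groups[0][-8:] + groups[1], 16)
    # FILETIME counts 100 ns intervals; datetime keeps microseconds only.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)

print(message_id_filetime("<0a6f01d5c4cc$b9145e70$2b3d1b50$@gmail.com>"))
```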