01/11/2011 – Please see the update about message IDs below.
In this post I will describe some of the methodology used in the message tracking script posted earlier.
We’re going to gather information on how much email users are sending and receiving. Measuring email can be an ambiguous proposition. When a user sends an email with a 10MB attachment to a distribution list of 10 people, was that 1 email or 10? 10 megabytes of email, or 100? The answer is “it depends”.
To a department manager who wants to know how much time their employees spend sending email, that counts as 1 email and 10MB. To the Exchange admin it means bandwidth, disk space, and CPU, and it’s 10 emails and 100MB. To make the report useful to both, we need figures for both.
So the first thing we do is lay out the counters we want to accumulate.
For the purpose of this report, I’m considering a user composing and sending an email a unique event. For each user, we want to know the count and total bytes of:
The unique emails sent – internally, externally, and total
The emails sent after DL expansion internally
The total emails sent
The emails received – internally, externally, and total
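As a rough illustration of that counter layout, here is a minimal Python sketch. It is not the script from the earlier post; the counter names, the `stats` dictionary, and the `bump` helper are all hypothetical names chosen for this example.

```python
from collections import defaultdict

# Hypothetical counter names covering the list above.
COUNTER_NAMES = (
    "sent_unique_internal",
    "sent_unique_external",
    "sent_unique_total",
    "sent_internal_expanded",  # internal sends counted after DL expansion
    "sent_total",
    "received_internal",
    "received_external",
    "received_total",
)


def new_user_counters():
    """Return a zeroed count/bytes pair for every counter."""
    return {name: {"count": 0, "bytes": 0} for name in COUNTER_NAMES}


# Keyed by lower-cased SMTP address; entries are created on first use.
stats = defaultdict(new_user_counters)


def bump(user, name, size_bytes):
    """Add one message of size_bytes to the named counter for user."""
    counter = stats[user.lower()][name]
    counter["count"] += 1
    counter["bytes"] += size_bytes


# One composed 10MB email is a single unique send event for the sender.
bump("Alice@example.com", "sent_unique_internal", 10 * 1024 * 1024)
bump("Alice@example.com", "sent_unique_total", 10 * 1024 * 1024)
```

Keeping a count and a byte total side by side for every counter means the same structure can answer both the manager’s question and the admin’s.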
Now that we know what we’re after, we need to figure out where it is, and how we’re going to get it. First, where is it?
One email can generate many events in the message tracking logs, depending on the email and your environment.
All of these counters can be gathered from just two types of events:
RECEIVE events whose source is STOREDRIVER, and DELIVER events.
For unique emails sent, we’re going to rely on the RECEIVE events that have a Source of STOREDRIVER. This is the first event in the transport log for a given email, when it was initially submitted by the Mailbox Server STOREDRIVER to the Hub Transport server for processing. It represents one instance of the user composing an email and hitting “Send”, and is the natural choice for gathering stats on unique send events.
This event tells us whether it was sent to internal recipients, external recipients, or both, and how big the email was. It will also tell us how many external recipients it was sent to. Because no Exchange address list expansion has been done yet, it will not tell us how many internal recipients it was sent to.
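To make that concrete, here is a small Python sketch of what might be pulled from one such event. The event is modeled as a plain dictionary whose keys mirror the EventId, Source, Sender, Recipients, and TotalBytes fields of a message tracking log entry. How a recipient is judged external is an assumption of this sketch: it compares the recipient’s domain to a hypothetical `ACCEPTED_DOMAINS` set, which the post does not specify.

```python
# Hypothetical accepted domains; substitute your organization's own list.
ACCEPTED_DOMAINS = {"example.com"}


def summarize_submission(event):
    """Summarize a RECEIVE/STOREDRIVER event, or return None for any other event."""
    if event["EventId"] != "RECEIVE" or event["Source"] != "STOREDRIVER":
        return None
    recipients = event["Recipients"]
    # Assumption: a recipient is external if its domain is not an accepted domain.
    external = [
        r for r in recipients
        if r.rsplit("@", 1)[-1].lower() not in ACCEPTED_DOMAINS
    ]
    return {
        "sender": event["Sender"].lower(),
        "bytes": event["TotalBytes"],
        "to_internal": len(external) < len(recipients),
        "to_external": bool(external),
        # Internal recipients are deliberately not counted here: a DL has
        # not been expanded yet, so one address may stand for many mailboxes.
        "external_recipient_count": len(external),
    }
```

A message sent to one distribution list and two outside addresses would come back flagged as both internal and external, with an external recipient count of 2.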
From the DELIVER events we gather the count and bytes of email each user received, both internal and external, and the count and bytes of internal email each user sent. DELIVER events happen after distribution list expansion and address resolution, so they are an accurate record of exactly how many delivered emails and bytes the original email became. All the counts will be associated with the users’ primary SMTP address, even if the email came from outside and was sent to a secondary address.
One other thing we need to determine about the DELIVER events is whether the email being delivered is an internal email or an external email. We’re gathering the totals of the internal emails our Exchange users are sending from these events, but there will also be DELIVER events from external emails, so we need to differentiate the emails that were sent internally from the ones that were received from outside, external to Exchange.
We could look at the domain of the sender, and compare that to our accepted domain list, but that’s not necessarily a good test. Many organizations have systems outside of Exchange that will send email to Exchange recipients using addresses that are in Exchange’s accepted domains list. Fortunately, there’s a more reliable way to test for internal email.
Every email has a messageid. Normally, it will be a unique character string, followed by @ and the name of the mail server that originated the email. Exchange emails conform to this standard.
That means that every internal email will have a messageid that ends in the FQDN of one of the Mailbox Servers in the organization, and any email whose messageid does not must have originated outside of Exchange.
Update: in a perfect world it would. I’ve just found and tracked down some anomalies in the reports from the script. I had one instance from Gmail, and one from Yahoo that I cannot recreate, but it appears that ATT will send any reply to an email sent through their SMS gateway (txt.att.net) with the messageid of the original message that was replied to. This causes messages to appear in the logs from external (@txt.att.net) senders, with messageids that appear to have originated from the Exchange mailbox servers. Grrrr….
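The messageid test itself can be sketched in a few lines of Python. The server names in `MAILBOX_SERVER_FQDNS` are hypothetical placeholders, and the function is only an illustration of the idea, not the code from the script. Note that it inherits the weakness described in the update: a reply that reuses the messageid of the original email will still look internal.

```python
# Hypothetical Mailbox Server FQDNs; replace with the servers in your organization.
MAILBOX_SERVER_FQDNS = {"mbx01.corp.example.com", "mbx02.corp.example.com"}


def originated_internally(message_id):
    """Return True if the messageid ends in @<FQDN of a known Mailbox Server>."""
    # Strip whitespace and the angle brackets that usually wrap a messageid,
    # then split on the last "@" to isolate the originating host name.
    _, at_sign, host = message_id.strip().strip("<>").rpartition("@")
    return bool(at_sign) and host.lower() in MAILBOX_SERVER_FQDNS
```

The comparison is done on the host part only, and in lower case, so differences in capitalization between servers and log entries do not matter.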
So, we need to check that messageid, and if it’s an internal message, each delivery gets attributed to the sender as an internal email sent, along with being attributed to the recipient as an email received.
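Put together, the handling of a DELIVER event might look roughly like the Python sketch below. As before, the events are plain dictionaries with keys modeled on the message tracking log fields, and the server name, dictionary names, and helper functions are hypothetical. It is a sketch of the attribution logic described above, not the script’s actual implementation.

```python
from collections import defaultdict

# Hypothetical Mailbox Server FQDN used for the messageid test.
MAILBOX_SERVER_FQDNS = {"mbx01.corp.example.com"}


def blank():
    return {"count": 0, "bytes": 0}


sent_internal = defaultdict(blank)      # keyed by sender address
received_internal = defaultdict(blank)  # keyed by recipient address
received_external = defaultdict(blank)  # keyed by recipient address


def add(counter, size_bytes):
    counter["count"] += 1
    counter["bytes"] += size_bytes


def is_internal(message_id):
    """True if the messageid ends in the FQDN of a known Mailbox Server."""
    _, at_sign, host = message_id.strip().strip("<>").rpartition("@")
    return bool(at_sign) and host.lower() in MAILBOX_SERVER_FQDNS


def record_delivery(event):
    """Attribute each recipient of a DELIVER event to the right counters."""
    if event["EventId"] != "DELIVER":
        return
    internal = is_internal(event["MessageId"])
    size = event["TotalBytes"]
    for recipient in event["Recipients"]:
        if internal:
            # One delivered copy: received by the recipient, sent by the sender.
            add(received_internal[recipient.lower()], size)
            add(sent_internal[event["Sender"].lower()], size)
        else:
            add(received_external[recipient.lower()], size)
```

With this layout, the 10MB email sent to a 10-person distribution list shows up as 10 internal emails and 100MB against the sender, which is the Exchange admin’s view of the same message.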
Now we know where it is, so the only thing left is how to get it.
I’ll leave that for another day, when I’ll dig into the code that makes it happen.