Microsoft has published this article to give you an update about Exchange 2010 SP1 Rollup 4.
————————————————————————————————–
The Exchange Sustained Engineering team recently made the decision to recall
the June 22, 2011 release of Exchange 2010 SP1 Rollup 4. This was not an action
we took lightly and we understand how disruptive this was to customers. We would
like to provide you with some details that will give you a deeper understanding
of what actually happened and, more importantly, what improvements we are making
to prevent this in the future.
Q: What actually triggered the recall?
A: While fixing a bug that prevented deleted public folders from being
recovered, we exposed an untested set of conditions with the Outlook client.
When moving or copying a folder, Outlook passes a flag on a remote procedure
call that instructs the Information Store to open deleted items which haven’t
been purged. Our fix inadvertently caused the RPC to skip all content that
wasn’t marked for deletion, because we did not expect Outlook to pass this flag
on copy and move operations.
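To make the failure mode concrete, here is a minimal Python sketch of the kind of flag handling described above. It is a toy model, not Exchange or Outlook code: `OpenFlags`, `INCLUDE_SOFT_DELETED`, `Item`, and both `select_items_*` functions are hypothetical names invented for illustration, and the "expected" variant reflects only an assumption about the intended behavior (live content is never dropped).

```python
from dataclasses import dataclass
from enum import Flag, auto


class OpenFlags(Flag):
    NONE = 0
    # Hypothetical stand-in for the flag Outlook passes on move/copy.
    INCLUDE_SOFT_DELETED = auto()


@dataclass(frozen=True)
class Item:
    name: str
    soft_deleted: bool = False  # deleted but not yet purged


def select_items_regressed(items, flags):
    """Toy version of the regression: the flag is read as 'only deleted items'."""
    if OpenFlags.INCLUDE_SOFT_DELETED in flags:
        return [i for i in items if i.soft_deleted]
    return [i for i in items if not i.soft_deleted]


def select_items_expected(items, flags):
    """Assumed intent: the flag adds deleted items without dropping live ones."""
    if OpenFlags.INCLUDE_SOFT_DELETED in flags:
        return list(items)
    return [i for i in items if not i.soft_deleted]


folder = [Item("budget.xlsx"), Item("old-draft.docx", soft_deleted=True)]
flags = OpenFlags.INCLUDE_SOFT_DELETED

regressed = [i.name for i in select_items_regressed(folder, flags)]
expected = [i.name for i in select_items_expected(folder, flags)]
```

In this model the regressed selection loses `budget.xlsx` whenever the flag is set, while a call without the flag behaves identically in both versions. That mirrors why the problem only appeared when a client sent the extra flag.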
Q: Why didn’t you test this scenario?
A: The short answer is we thought we did. We didn’t realize we missed a key
interaction between Exchange and Outlook. The Exchange team has well over
100,000 automated tests that we use to validate our product before we ship it.
With the richness and number of scenarios and behaviors that Exchange supports,
automated testing is the only scalable solution. We execute these tests in
varying scenarios and conditions repeatedly before we release the software to
our customers. We also supplement these tests with manual validation where
necessary. The downside of our tests is that they primarily exercise the
interfaces we expose and are designed around our specifications. They do test
positive and negative conditions to catch unexpected behavior, and we did execute
numerous folder copy and move tests against the modified code, all of which passed.
What we did not realize is that our tests were not emulating the procedure call
as executed by Outlook.
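The gap between specification-driven tests and client-driven tests can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than our actual test suite: `copy_folder` is a toy server containing the regression, and `SPEC_CALLS` and `CLIENT_CALLS` are hypothetical "call shapes" standing in for how a test harness and a real client might invoke the same operation.

```python
def copy_folder(items, include_soft_deleted=False):
    """Toy server with the regression: setting the flag drops live items."""
    if include_soft_deleted:
        return [name for name, deleted in items if deleted]
    return [name for name, deleted in items if not deleted]


# (name, soft_deleted) pairs in a toy public folder.
FOLDER = [("inbox-rules", False), ("old-notes", True)]

# Call shapes derived from the interface specification (no extra flag).
SPEC_CALLS = {"spec_default": {}}
# Call shapes modeled on a client; the flag value is an assumption.
CLIENT_CALLS = {"outlook_like": {"include_soft_deleted": True}}


def failing_cases(call_shapes):
    """Return the labels of call shapes under which live content is lost."""
    live = {name for name, deleted in FOLDER if not deleted}
    return [
        label
        for label, kwargs in call_shapes.items()
        if not live <= set(copy_folder(FOLDER, **kwargs))
    ]


spec_failures = failing_cases(SPEC_CALLS)
client_failures = failing_cases(CLIENT_CALLS)
```

The same check passes for every specification-derived call and fails only for the client-shaped one, which is the pattern described above: the tests were sound, but they never issued the call the way the client does.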
Q: Exchange has been around a while. Why did this happen now?
A: In Exchange 2010 we introduced a feature called RPC Client
Access. This functionality is responsible for serving as the MAPI endpoint for
Outlook clients. It allowed us to abstract client connections away from the
Information Store (on Mailbox servers) by having all Outlook clients connect
to the RPC Client Access service instead.
As part of our investigation, we discovered that there was some specific code
added to the Exchange 2003 Information Store to handle the procedure call from
Outlook using the extra flag. This code was also carried forward into Exchange
2007. But when the Exchange team added the RPC Client Access service to Exchange
2010, that code was not incorporated into the RPC Client Access service because
it was mistakenly believed to be legacy Outlook behavior that was no longer
required. That, unfortunately, turned out not to be the case. The fact that we
were not allowing a deleted public folder to be recovered was masking this new
bug completely.
Q: Are there other similar issues lurking in RPC Client Access?
A: We do not believe so. The RPC Client Access functionality has been
well tested at scale and has proven reliable for the millions of mailboxes
hosted in on-premises deployments and in our own Office 365 and Live@EDU
services.
Q: What are you doing to prevent similar things from happening in the future?
A: We have conducted a top-to-bottom review of the process we use to triage,
develop and validate changes for Rollups and Service Packs and are making
several improvements. We have changed the way we evaluate a customer-requested
fix to ensure that we more accurately identify the risk and usage scenarios that
must be validated for a given fix. Recognizing the diversity of clients used to
connect to Exchange, we are increasing our client-driven test coverage to
broaden the usage patterns validated prior to release. Most notably, we are
working even more closely with our counterparts in Outlook to run their
automated test coverage against each of our releases. We are also looking to
increase coverage for other clients.
————————————————————————————————–


