Last week, a bug was discovered in the Windows 7 RTM (build 7600) that was distributed to TechNet and MSDN subscribers. The bug occurs when a user with administrator privileges runs the “chkdsk /r” command on a non-system drive, and causes the following:
- “chkdsk.exe” consumes a lot of memory (over 90%)
- the computer becomes non-responsive and crashes
Unsurprisingly, the news spread very quickly across forums and blogs. Some described the bug as a “showstopper”, while others claimed that it could derail the Windows 7 launch. Several screenshots have also been posted online.
Microsoft has responded to the issue on the Engineering Windows 7 blog, in an article about how the company deals with bug reports that focuses on the “chkdsk.exe” bug. Microsoft said that it found no reported crashes of “chkdsk” after looking through crash telemetry and existing bug reports:
We first looked through our crash telemetry (both at the user level and “blue screen” level) and found no reported crashes of chkdsk. We of course look through our existing reports of issues that came up during the development of Windows 7, but we didn’t see anything at all there. We queried the call stacks of existing reported crashes (of all kinds, since this was reported) and we did not find any crashes with chkdsk.exe running while crashing.
Microsoft was also unable to reproduce the bug, despite all of the tests it ran in an attempt to do so:
We then began automated test runs on a broad set of machines—these ran overnight and continued for 2 days. We also saw reports related to a specific hardware configuration, so we set up over 40 machines based on variants of that chipset, driver, and firmware and ran those tests. We were not hitting any crashes (as mentioned, the memory usage was already understood). Because some were saying the machines were non-responsive we also looked for that in manual tests and didn’t see anything. We also broadened this to request globally to Microsoft folks to try things out (we have quite a few unique configs when you think of all of our offices around the world) and so we had several hundred more test runs going. We also had reports of the crash happening when running without any virtual memory—that could be the case, but that would not be an issue with this utility as any program that requests more memory than physically available would cause things to tip over and this configuration is not recommended for general purpose use (and this appears to be the common thread on the small number of non-reproducible crashes).
Meanwhile, they searched forums and external blogs for more information about the bug, but failed to find any technical details or a crash dump. Microsoft says it will keep working on the issue until it is satisfied that it has either systematically ruled out the crash or defined the circumstances in which it can happen.
I’ve tried running the command on my F drive (using “chkdsk /r F:”) and did notice a surge in the memory used by “chkdsk.exe”, but I didn’t witness a crash. Do you think this is really a “critical” problem? Or is it just a storm in a teacup (in other words, something getting a lot of unnecessary attention)?
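If you want to put numbers on the memory surge rather than eyeball it in Task Manager, here is a rough Python sketch that asks the built-in `tasklist` command how much memory “chkdsk.exe” is using. The function names are my own, and the parsing assumes the usual five-column CSV output with a “Mem Usage” value such as `1,234,567 K`; the exact formatting can vary with locale, so treat it as a starting point rather than a finished tool.

```python
import csv
import io
import re
import subprocess


def parse_tasklist_csv(output):
    """Return {pid: memory_in_kb} parsed from `tasklist /FO CSV /NH` output."""
    usage = {}
    for row in csv.reader(io.StringIO(output)):
        # Assumed columns: Image Name, PID, Session Name, Session#, Mem Usage
        if len(row) < 5 or not row[1].strip().isdigit():
            continue  # skips blank lines and the "INFO: No tasks..." message
        digits = re.sub(r"\D", "", row[4])  # "1,234,567 K" -> "1234567"
        if digits:
            usage[int(row[1])] = int(digits)
    return usage


def chkdsk_memory_kb():
    """Ask tasklist for running chkdsk.exe processes (Windows only)."""
    result = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq chkdsk.exe", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_tasklist_csv(result.stdout)
```

Calling `chkdsk_memory_kb()` every few seconds from a second console while “chkdsk /r” is running, and logging the results, would give a simple record of how memory use grows over time. That is the sort of technical detail that is missing from most of the reports so far.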