464 - Ksplice » Attack of the Cosmic Rays! - System administration and software blog

It’s a well-documented fact that RAM in modern computers is susceptible to occasional random bit flips due to various sources of noise, most commonly high-energy cosmic rays. By some estimates, you can even expect error rates as high as one error per 4GB of RAM per day! Many servers these days have ECC RAM, which uses extra bits to store error-correcting codes that let them correct most bit errors, but ECC RAM is still fairly rare in desktops, and unheard-of in laptops. For me, bitflips due to cosmic rays are one of those problems I always assumed happen to “other people”. I also assumed that even if I saw random cosmic-ray bitflips, my computer would probably just crash, and I’d never really be able to tell the difference from some random kernel bug. A few weeks ago, though, I encountered some bizarre behavior on my desktop, that honestly just didn’t make sense. I spent about half an hour digging to discover what had gone wrong, and eventually determined, conclusively, that my problem was a single undetected flipped bit in RAM. I can’t prove whether the problem was due to cosmic rays, bad RAM, or something else, but in any case, I hope you find this story interesting and informative.

2010-06-28 09:42:04

http://blog.ksplice.com/2010/06/attack-of-the-cosmic-rays/


SileX