Sep 07, 2009

ECC Memory

ECC stands for Error-Correcting Code, often also expanded as Error Checking and Correction. ECC memory is widely used in workstation and server computers.

What Is ECC and How Does It Work?

As the name “Error Checking and Correction” suggests, ECC is a technology that lets a computer detect and correct memory errors. The most popular type of ECC used in memory modules is single-bit error correction. It detects and corrects any single-bit error within a 64-bit word of data. It also detects two-bit errors and some multiple-bit errors, but it cannot correct them.

How does ECC work? Take the most common single-bit error correction scheme as an example. For each 64-bit word sent across the memory bus, 8 check bits are generated, each by applying an Exclusive OR (XOR) to a different subset of the data bits. A single parity bit per byte could only detect an error; it is the overlapping subsets that let the system work out which bit flipped. The check bits are stored in a separate memory chip. That is why memory modules with ECC capabilities typically sport 9 memory chips on each side, rather than the 8 chips per side we often see on non-ECC memory modules.

The system uses the check bits to verify the data when it is read, and corrects the single-bit error if there is one. The check bits are transferred together with the original data word. Therefore, the ECC memory bus is 72 bits wide, as opposed to 64 bits for non-ECC memory. Remember that only 64 of the 72 bits count toward bandwidth and application usage; the other 8 bits are all check bits, so the effective bandwidth of ECC and non-ECC memory is identical.
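To make the check-bit idea concrete, here is a minimal Python sketch of a textbook extended Hamming code, which corrects single-bit errors and detects double-bit errors. It is an illustration under assumptions, not the circuit any particular memory controller uses: the names `encode` and `decode` and the bit layout are made up for this example. Fed 64 data bits, it needs 8 check bits, which lines up with the 72-bit bus described above.

```python
def encode(data_bits):
    """Return a codeword: index 0 is an overall parity bit, powers of two
    (1, 2, 4, ...) hold Hamming check bits, all other slots hold data."""
    m = len(data_bits)
    r = 0
    while (1 << r) < m + r + 1:       # number of Hamming check bits needed
        r += 1
    n = m + r
    code = [0] * (n + 1)
    bits = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):           # not a power of two: data position
            code[pos] = next(bits)
    for i in range(r):
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p:               # check bit p covers positions with bit p set
                parity ^= code[pos]
        code[p] = parity
    overall = 0
    for b in code[1:]:
        overall ^= b
    code[0] = overall                 # extra bit that enables double-error detection
    return code


def decode(code):
    """Return (status, data_bits); status is 'ok', 'corrected' or 'uncorrectable'."""
    n = len(code) - 1
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos           # XOR of the positions of all set bits
    overall = 0
    for b in code:
        overall ^= b
    fixed = list(code)
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1 and syndrome <= n:
        fixed[syndrome] ^= 1          # the syndrome names the flipped position
        status = "corrected"
    else:
        status = "uncorrectable"      # e.g. two flipped bits
    data = [fixed[pos] for pos in range(1, n + 1) if pos & (pos - 1)]
    return status, data


word = [(i * i + i // 3) % 2 for i in range(64)]   # arbitrary 64-bit test word
codeword = encode(word)
print(len(codeword))                  # 72 = 64 data bits + 8 check bits

damaged = list(codeword)
damaged[37] ^= 1                      # simulate one soft error
print(decode(damaged)[0])             # corrected

damaged[50] ^= 1                      # a second error in the same word
print(decode(damaged)[0])             # uncorrectable
```

Each check bit is just an XOR over a subset of positions; because the subsets overlap, the pattern of failed checks spells out the position of the flipped bit.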

Do I Need to Get ECC Memory?

To answer that question, we first have to figure out where memory errors come from. There are two major causes of the so-called “soft” errors:

  • naturally occurring radioactive isotopes (which emit alpha particles), and
  • high energy cosmic rays from supernovas

Both of these events can change the value of data stored in a memory chip. These errors are called “soft” errors because they can be repaired by correcting the value of the memory bit, which is exactly what ECC does.

As a rough estimate, a single-bit soft error occurs about once per 1GB of memory per month of uninterrupted operation. Since most desktop computers do not run 24 hours a day, the chances are not actually that high. For example, if your computer (with 1GB of memory) runs 4 hours a day, you can expect a single-bit soft error (while the system is running) about once every six months. Even if an error does occur, it won’t be a big issue for most users, as the affected bit may never be accessed. And if the system does access the bad bit, this little error won’t result in a disaster either: the system may crash, but a restart will fix that. That’s why ECC memory is not a necessity for most home users.
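The six-month figure is plain proportional arithmetic, sketched below in Python. The function name and parameters are made up for illustration, and the default rate of one error per gigabyte per month of continuous operation is the article's rough estimate, not a measured value.

```python
def months_between_errors(memory_gb, hours_per_day, errors_per_gb_month=1.0):
    """Average calendar months between single-bit soft errors.

    Assumes errors scale linearly with memory size and powered-on time.
    """
    return 24 / (errors_per_gb_month * memory_gb * hours_per_day)


print(months_between_errors(memory_gb=1, hours_per_day=4))   # 6.0
```

Doubling either the memory size or the daily running time halves the expected interval.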

Things are very different when it comes to workstations and servers. To begin with, these systems often use many gigabytes of memory, and they usually run 24/7 as well. Both factors increase the probability of a soft error. More importantly, an unnoticed error is not tolerable in a mission-critical workstation or server, where a system crash is the least of the worries. What really matters is the erroneous data itself: you can imagine the issues a soft error could cause in a bank system or a flight control computer. Therefore, ECC memory is definitely required for mission-critical applications.
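To compare a home PC with a server, the sketch below turns the same rough rate into an expected error count and a probability of seeing at least one error. Treating soft errors as independent random events (a Poisson process) is an assumption made for this example, as are the function names and the hypothetical 16GB server.

```python
import math


def expected_errors(memory_gb, months, hours_per_day=24, errors_per_gb_month=1.0):
    """Expected number of single-bit soft errors over the given period."""
    return errors_per_gb_month * memory_gb * months * hours_per_day / 24


def prob_at_least_one(memory_gb, months, hours_per_day=24):
    """Chance of one or more errors, assuming errors arrive as a Poisson process."""
    return 1 - math.exp(-expected_errors(memory_gb, months, hours_per_day))


# Home PC: 1GB, 4 hours a day, over one month
print(expected_errors(1, 1, hours_per_day=4), prob_at_least_one(1, 1, hours_per_day=4))

# Hypothetical server: 16GB, running 24/7, over one month
print(expected_errors(16, 1), prob_at_least_one(16, 1))
```

Under these assumptions the home PC sees an error in a given month only occasionally, while the server is all but certain to see one.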

Finally, if you do need ECC memory, you’ll have to buy a motherboard that supports ECC memory modules in addition to the ECC memory modules themselves. Without motherboard support (or memory controller support, to be more accurate), the ECC memory module is effectively the same as non-ECC memory.

Registered/Unbuffered Memory

For starters, registered memory is the counterpart of unbuffered memory. There is no such thing as “unregistered” memory, nor will you find “buffered” memory in contemporary computer systems. Registered memory modules often come with ECC as well, because registered memory is usually used in servers or high-performance workstations where ECC is definitely a necessity.

What’s the Difference between Registered Memory and Unbuffered Memory?

Let’s start from the top. Registers are logic components rather than memory. In a registered memory module, they buffer the address and command signals going to the module. The difference between registered memory and unbuffered memory is simply whether there are registers on the memory module. With unbuffered memory, the memory controller directly addresses every memory chip on every module in the system. With registered memory, the memory controller sees only the registers, of which there is one per physical bank of memory.
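The practical effect is on how many devices hang off the controller's address and command lines. A minimal sketch of that count, assuming the simple model described above (every chip is a load on an unbuffered module, one register per physical bank on a registered module); the function name and the example module configuration are hypothetical:

```python
def address_line_loads(modules, chips_per_module, banks_per_module, registered):
    """Count the devices the memory controller drives on its address/command lines."""
    if registered:
        return modules * banks_per_module    # controller sees only the registers
    return modules * chips_per_module        # controller drives every chip directly


# Hypothetical system: 4 double-sided ECC modules, 18 chips and 2 banks each
print(address_line_loads(4, 18, 2, registered=False))   # 72
print(address_line_loads(4, 18, 2, registered=True))    # 8
```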

Why is “unbuffered” the counterpart of “registered”? Buffers are known as “asynchronous” components, which is to say signals on the input pins appear directly on the output pins. By contrast, registers are known as “synchronous” components: new signals on the input pins do not show up immediately on the output pins. Instead, they wait for the next tick of the system clock. “Buffered” memory modules did exist in the days of the old EDO and Fast Page Mode modules, which were both asynchronous DRAMs.
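The difference in timing can be shown with a toy Python model. This is only a behavioural sketch with invented class names, not a description of real module hardware: the buffer passes its input straight through, while the register holds the new value until the clock ticks.

```python
class Buffer:
    """Asynchronous: the output follows the input immediately."""

    def __init__(self):
        self.output = 0

    def drive(self, value):
        self.output = value


class Register:
    """Synchronous: the output changes only on the next clock tick."""

    def __init__(self):
        self._pending = 0
        self.output = 0

    def drive(self, value):
        self._pending = value        # latched internally, not yet visible

    def tick(self):
        self.output = self._pending  # becomes visible on the clock edge


buf, reg = Buffer(), Register()
buf.drive(1)
reg.drive(1)
print(buf.output, reg.output)   # 1 0  (register has not been clocked yet)
reg.tick()
print(buf.output, reg.output)   # 1 1
```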

Who Needs Registered Memory?

Almost all system memory in today’s PCs is unbuffered memory. As the amount of system memory grows, stability and performance inevitably suffer: as mentioned above, the memory controller has to address each memory chip on all modules directly, which results in high electrical loads. To solve this problem, higher-density systems use registered memory instead. Registered memory modules contain registers that act as a buffer, holding data (address and command data only) for one clock cycle before it is passed on. This increases the reliability of high-speed data access to high-density memory but sacrifices some performance, since there is one additional clock cycle between the Chip Select and the Bank Activate command.
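To put the extra clock cycle into perspective, the small sketch below converts it into nanoseconds. The function names and the 200 MHz clock are hypothetical values chosen for illustration; substitute the actual memory clock of the system in question.

```python
def clock_period_ns(clock_mhz):
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def registered_penalty_ns(clock_mhz, extra_cycles=1):
    """Added command latency from the register, which holds signals for one cycle."""
    return extra_cycles * clock_period_ns(clock_mhz)


print(registered_penalty_ns(200))   # 5.0 ns at a hypothetical 200 MHz memory clock
```

The delay applies to address and command signals only, which is why the overall performance cost is small.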

For a home user, registered memory may not be useful at all; in fact, it comes with a small performance drop. But for those who need more than 4GB of memory in a system, registered memory is a must-have. Of course, you’ll also need a motherboard that supports registered memory modules. In addition, some server and workstation motherboards require registered memory, in which case you don’t really have a choice.
