Hello fellow selfhosters, I was wondering how important it is to have ECC memory. I want a server that is really reliable, and ECC memory pops up as one of the must-haves for reliability. But from my research it seems quite expensive to get a setup with ECC memory. How important is ECC memory for a server I rely on?
So far I have been rocking a Raspberry Pi 4, which has ECC memory.
If you’re using memory for storage operations, especially for something like ZFS cache, then you ideally want ECC so errors are caught and corrected before they corrupt your data, as a best practice.
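For anyone curious how the "caught and corrected" part works, here is a toy sketch in Python using a Hamming(7,4) code. It is only an illustration: real ECC setups do this in hardware and typically protect each 64-bit word with 8 extra check bits, and the function names here are made up for the example. The principle of locating a single flipped bit and flipping it back is the same, though.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (toy example, not a DIMM's real code)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data_bits, error_position).

    error_position is the 1-based position that was repaired, or 0 if
    no error was detected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # the syndrome points at the bad bit
    if error_position:
        c[error_position - 1] ^= 1  # flip it back
    return [c[2], c[4], c[5], c[6]], error_position


# Simulate a single bit flip "in RAM" and let the code repair it.
word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # flip the bit at position 5
data, error_pos = hamming74_decode(word)
# error_pos is 5 here, and data comes back as [1, 0, 1, 1]
```

Without the parity bits, that flipped bit would just be silently wrong data, which is the case non-ECC memory cannot notice.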
In the real world, unless you’re buying old servers off eBay that already have it installed, the economics don’t make sense for self-hosting. The issues are so rare, and you should have good backups anyway. I’ve never run into a problem from not using ECC; I’ve been self-hosting since 2010 and have some ZFS pools nearly that old. I run exclusively on consumer hardware, with the exception of HBAs and networking, and have never had ECC.
My understanding is that as the amount and speed of memory increases, the usefulness of ECC in detecting and preventing the types of errors that can cause a crash or corrupt a file goes up.
But for home use it’s probably more useful to focus on storage redundancy and backups, or a UPS to keep things running during power blips/outages.
Think of it this way: if a cosmic ray happens to land on a silicon cell in your RAM and flip a random bit from 1 to 0, how screwed will you be? If the answer is “meh, I’ll just restart the computer/restore corrupted data from backup” then you probably don’t need it.
For large storage, ECC helps a lot in avoiding storage corruption. In combination with a redundant architecture in ZFS it is almost bulletproof. (Make no mistake, redundant storage is no substitute for backups! You still need those.)
One option is to use comparatively old server hardware. I have some pretty old stuff (around 10 years) that uses DDR3 RAM, which is dirt cheap, even with ECC (somewhere around 1 €/GB). And it will be fast enough by far for most applications. The downside is higher power consumption for the same performance. The Dell T320 I have with eight 3.5" SAS disks and 32 GB RAM uses some 140 W of power, to give you a ballpark figure.
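To put the 140 W figure in perspective, here is a quick back-of-the-envelope calculation in Python. The electricity price is purely an assumption for illustration (plug in your own tariff), and the helper names are made up.

```python
def annual_energy_kwh(watts, hours_per_day=24.0):
    """Energy used in a year by a device drawing `watts` continuously."""
    return watts * hours_per_day * 365 / 1000


def annual_cost(watts, price_per_kwh):
    """Yearly running cost for a given electricity price."""
    return annual_energy_kwh(watts) * price_per_kwh


T320_WATTS = 140                 # figure from the post above
ASSUMED_PRICE_EUR_PER_KWH = 0.30  # hypothetical tariff, not from the thread

energy = annual_energy_kwh(T320_WATTS)                  # about 1226 kWh per year
cost = annual_cost(T320_WATTS, ASSUMED_PRICE_EUR_PER_KWH)
print(f"{energy:.0f} kWh/year, roughly {cost:.0f} EUR/year at the assumed price")
```

At that assumed price, running 24/7 costs far more per year than the cheap DDR3 ECC RAM did, which is the trade-off with old server hardware.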
Yea I have been trying to avoid high power consumption as power is quite expensive here. I think for my case non ECC + ZFS + backup will suffice. Thanks!
I don’t believe ECC uses noticeably more power.
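DDR5 has built-in on-die ECC, which corrects errors inside the memory chips themselves but, unlike full ECC, does not protect data on its way to the CPU or report errors to the system. It might still be worthwhile depending on your setup.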
I believe the ECC on your Pi isn’t for the memory chip but for the on-die cache of the ARM CPU.
For me personally, if my racked server supports it, I get ECC. If it doesn’t, I don’t sweat it. Redundancy in drives, power, and networking is much more important to me, and those are orders of magnitude more likely to fail in my anecdotal experience. If I can save those dollars for a more probable failure, I do that.
DNS is a lynchpin of my network (and wife approval factor), so I splurge a bit on physical redundancy there: an identical mini computer runs it and takes over the same IP if the first box fails. Those considerations come way before whether the server has ECC. Just my $0.02.
Thanks for the feedback! Yeah, I think ZFS redundancy + backups will do for my application then. From what I am reading here, ECC is less common than I imagined.
It’s extremely common in enterprise, where the cost of a 100k+ server isn’t the most expensive part of running, maintaining, and servicing it. If your home lab isn’t practicing 3-2-1 backups yet (at least three copies of your data, two local (on-site) but on different media/devices, and at least one copy off-site), I’d spend money on that before ECC.
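If it helps to make the 3-2-1 rule concrete, here is a small Python sketch that checks a list of copies against the definition above. The `BackupCopy` class, the field names, and the example entries are all hypothetical; it is just a way to sanity-check your own setup on paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    medium: str    # which device or medium holds it, e.g. "server-zfs-pool"
    offsite: bool  # True if the copy lives somewhere other than your home


def satisfies_3_2_1(copies):
    """Check: >= 3 copies, >= 2 on-site on different media, >= 1 off-site."""
    onsite_media = {c.medium for c in copies if not c.offsite}
    offsite_count = sum(1 for c in copies if c.offsite)
    return len(copies) >= 3 and len(onsite_media) >= 2 and offsite_count >= 1


# Hypothetical home lab: live data, a local backup disk, and a remote copy.
my_copies = [
    BackupCopy("live data", "server-zfs-pool", offsite=False),
    BackupCopy("nightly backup", "usb-hdd", offsite=False),
    BackupCopy("weekly backup", "remote-storage", offsite=True),
]
print(satisfies_3_2_1(my_copies))
```

Note that a second ZFS dataset or snapshot on the same pool counts as the same medium here, so it would not satisfy the "different media" part on its own.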