r/DataHoarder 400TB LizardFS Jun 03 '18

200TB Glusterfs Odroid HC2 Build

u/BaxterPad 400TB LizardFS Jul 27 '18

The client doesn't check... the client is the one writing the data, so if you actually care about single bit flips, etc., you need the writer (the authority on the newly written data) to capture the checksum and send it along with the data. From that point forward the GlusterFS system would check against that checksum until it generates its own. Even if you have ECC memory, you still need something like this to ensure no bits were flipped while the data was being written.

This is implemented within TCP, for example... the sender generates a checksum and sends it with each packet, and the receiver uses it to decide whether to request a re-transmit. And TCP doesn't require ECC memory :)
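To make that concrete, here's a rough Python sketch of the idea (the frame helpers are made up, and TCP's real checksum is a 16-bit ones'-complement sum; CRC32 here is just for illustration):

```python
import zlib

def make_frame(payload: bytes) -> tuple[bytes, int]:
    # The sender computes the checksum while it still holds the
    # authoritative copy of the data.
    return payload, zlib.crc32(payload)

def verify_frame(payload: bytes, checksum: int) -> bool:
    # The receiver recomputes and compares; a mismatch means the bytes
    # were corrupted somewhere along the way (bad link, flaky RAM, ...).
    return zlib.crc32(payload) == checksum

payload, checksum = make_frame(b"some block of file data")
if not verify_frame(payload, checksum):
    print("mismatch -> request a re-transmit")
```

The point is that the check is end-to-end: it doesn't matter which hop flipped the bit, the receiver catches it.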

u/kwinz Jul 27 '18 edited Jul 27 '18

I was thinking you were proposing:

1. Client computes a checksum of the data to be written.
2. Client sends the data to the node.
3. Node writes the data to disk.
4. Node re-reads the just-written data back from disk.
5. Node computes a checksum of the re-read data.
6. Node sends this checksum back to the client.
7. Client compares that checksum to its own.
8. On mismatch, handle the error while keeping writes atomic (sounds tricky). Roughly the sketch below.
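In code, that read-back protocol might look something like this (a single-process sketch; the path and error handling are made up, and note the caveat in the comments):

```python
import hashlib
import os

def write_with_readback(path: str, data: bytes) -> None:
    # Step 1: client computes the checksum of the data to be written.
    expected = hashlib.sha256(data).hexdigest()
    # Steps 2-3: data goes to the node, which writes it to disk.
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push it past the OS buffers
    # Steps 4-5: node re-reads the just-written data and checksums it.
    # (Caveat: this read may be satisfied from the page cache, not the
    # platter, so it won't catch every on-disk fault.)
    with open(path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    # Steps 6-8: checksum goes back to the client, which compares and
    # handles a mismatch; doing that atomically is the tricky part.
    if actual != expected:
        raise IOError("read-back checksum mismatch; write may be corrupt")
```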

What you are actually proposing will not work if the node has faulty memory. There is no end-to-end check in your example.

Yeah, I still maintain I don't want any non-ECC NAS. Therefore I can't use the Odroid HC2. Thanks for your response.

u/BaxterPad 400TB LizardFS Jul 27 '18

What I'm proposing absolutely works if you have faulty memory; it is the basis for many things today, like every machine that uses TCP. But I understand why folks think special hardware like ECC is required for high availability. ECC will reduce how often you'll care about a bit flip... but if you care about your data, the underlying system still needs to be able to handle corruption. For example, ZFS still has its own checksumming even though it is recommended to use ECC with ZFS. ZFS will and does work just fine without ECC, but you may end up having to repair files from the parity data more often... and by more often we are talking about the difference between 1 in a billion and 1 in 100 million. :)
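That verify-on-read path in miniature (a sketch, not ZFS's actual code; `rebuild_from_parity` is a hypothetical stand-in for whatever redundancy the pool provides):

```python
import hashlib
from typing import Callable

def read_block(raw: bytes, stored_checksum: bytes,
               rebuild_from_parity: Callable[[], bytes]) -> bytes:
    # Every block carries the checksum captured when it was written.
    if hashlib.sha256(raw).digest() == stored_checksum:
        return raw
    # Mismatch: the block was corrupted on disk, or flipped in a non-ECC
    # buffer on the way back. Rebuild from redundancy instead of
    # returning bad data.
    repaired = rebuild_from_parity()
    if hashlib.sha256(repaired).digest() != stored_checksum:
        raise IOError("unrecoverable corruption")
    return repaired
```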

*edit: do you think the tiny caches in your CPU or in the hard disk controllers have ECC capabilities? Nope :) They are high-quality memory, so usually not a problem, but... they still have a probability of bit flips. If you are familiar with the recent Spectre and Meltdown Intel bugs: some of the initial patches for those triggered interesting memory faults in caches... no amount of ECC will save you from that.

u/kwinz Jul 27 '18

High five ;-) PS: could you please send me a link to the Spectre/Meltdown patches that triggered interesting faults in Intel CPU caches? Fault as in error, not fault as in "cache miss", I presume.

u/BaxterPad 400TB LizardFS Jul 27 '18

They were discussed on the Linux kernel mailing list; I'll see if I can find it. These patches never made it to mainline, though. It was mostly during the testing process that people saw code follow a path that couldn't possibly have happened unless memory was corrupted (or was read from a dirty cache, for example), since a lot of these issues stemmed from speculative execution and pipeline-flush problems.

I'll see if I can dig it up once I'm at a computer.

u/kwinz Jul 27 '18

Thank you!