Protection Against Silent Data Corruption / Bit Rot


ChrisH


Inspired by http://forum.bittorrent.com/topic/30920-active-distribuited-data-integrity-protection-raid-and-other-backup-obsolete/ but the author of that thread explicitly stated there that this was not what he wanted:

 

BTSync could periodically re-hash the files in the shared folders. If the current hash differs from the hash in the database, but the file change date is still the original, this indicates a corrupted file (i.e. a bit flipped somewhere in the storage subsystem). Files found corrupted should then be re-synced from any online peer, provided that their hash still matches the original.
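A minimal sketch of the check proposed above, assuming the sync database keeps a content hash and a modification time per file. The `Record` type, the function names, and the choice of SHA-256 are illustrative assumptions, not anything BTSync is known to use:

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical database entry for one synced file."""
    sha256: str
    mtime_ns: int


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def classify(path, record):
    """Return 'ok', 'corrupt' or 'modified' for a file during a re-hash pass."""
    current_mtime = os.stat(path).st_mtime_ns
    if file_sha256(path) == record.sha256:
        return "ok"
    # Content differs but the change date is still the original one:
    # treat this as silent corruption rather than a user edit.
    if current_mtime == record.mtime_ns:
        return "corrupt"
    return "modified"


def repair_sources(record, peer_hashes):
    """Peers whose copy still matches the original hash (peer id -> hash)."""
    return [peer for peer, h in peer_hashes.items() if h == record.sha256]
```

A file classified as `corrupt` would be re-downloaded from one of the peers returned by `repair_sources`, while `modified` files would go through the normal sync path.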

 

This should be a per-folder option. The re-hash interval can be configured and set quite high, like maybe once a month or so.
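As a rough illustration of what such a per-folder setting might look like, here is a sketch that decides which folders are due for a re-hash. The `FolderPolicy` fields and the 30-day default are assumptions made up for this example, not existing Sync options:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class FolderPolicy:
    """Hypothetical per-folder integrity settings."""
    path: str
    rehash_enabled: bool = False           # opt-in per folder
    rehash_interval_days: int = 30         # "once a month or so"
    last_rehash: Optional[datetime] = None


def folders_due(policies, now):
    """Return the paths of folders whose re-hash interval has elapsed."""
    due = []
    for policy in policies:
        if not policy.rehash_enabled:
            continue
        interval = timedelta(days=policy.rehash_interval_days)
        if policy.last_rehash is None or now - policy.last_rehash >= interval:
            due.append(policy.path)
    return due
```

Keeping the interval per folder means a large archive share can be checked rarely while a small, important one is checked more often.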

Use case: Archived files that are almost exclusively read-only (photos, movies, MP3s, ISOs) but are kept on disk for months and years, thus having a bigger risk of bit rot.

 

http://en.wikipedia.org/wiki/Data_rot

Link to comment
Share on other sites

I also think this would be a really cool feature, especially since many of the items I've synced are exactly that - photos, movies, music, disk images, etc. - that rarely change. Because of the CPU usage, I'd prefer a user-settable interval, and I'd probably choose to re-check every 2-3 months (possibly up to 6 months, depending on the exact content).

 

I would also agree that it would be best per-folder, so that you could re-check important files more often, as well as not having to suddenly re-hash several TB of data at once. 

Link to comment
Share on other sites

Good to see you finally understand my feature request, do not need to duplicate.

No, I still don't understand what you want and frankly I don't care anymore.

 

Now you understand too why it's so important to keep file versions all along. 

File versions have nothing to do with this request.

Link to comment
Share on other sites

@AcostaJA - personal attacks against other contributors to these forums are not tolerated.

 

You feel that @ChrisH's suggestion is identical to your own, he feels differently. You are both entitled and welcome to contribute to these forums.

 

If moderators/administrators detect threads in this Feature Request forum that are identical or near identical to an existing thread, they may be merged accordingly.

 

@ChrisH specifically "forked" your original thread as he felt that his suggestion was distinctly different from your own.

 

Both of your contributions are welcome, and if we feel it necessary to merge them at some stage we will.

 

However, if they continue to descend into personal attacks against each other/each other's posts, they risk both being deleted altogether.

Link to comment
Share on other sites

I don't want to start a battle here; it's good to see that the moderators are not absent.
Link to comment
Share on other sites

  • 6 months later...

Getting back to the feature request, is there any chance that BitTorrent Sync could propagate bit rot to other clients?

 

I have a ZFS NAS where I store most things, so I feel safe that it is protected against bit rot. But I want to sync some of its files to an iMac and a MacBook, which run JHFS+ file systems, which are susceptible to bit rot.

 

If my iMac or MacBook does suffer from bit rot, is there any way that BitTorrent Sync will detect that as a legitimate change and sync the corrupted file back to the ZFS NAS and all other devices, thereby negating the bit rot protection provided by ZFS?

 

Link to comment
Share on other sites

Yes, and this is one reason why no synchronization solution should be seen as a backup solution. In your case, one approach would be to back up the most important content from your NAS to another bit-rot-resistant destination.

 

Note that Sync will not spontaneously propagate bit rot, it will only happen if you modify a file and trigger Sync to transfer it.

Link to comment
Share on other sites

I don't use Sync as backup. I have CCC to an external bootable HDD and Arq (to the ZFS NAS) for that. But if Sync will propagate bitrot to the ZFS NAS, because ultimately all the machines I actually use daily don't have any protection against bitrot, then it's still a manual process of discovering the rot to begin with, and then searching incremental backups for the last good copy.

 

Doesn't Sync periodically index all its shares to determine if it is missing any files from the pool and sync them? I seem to recall with BitTorrent in general, if you modified some local data the torrent app would detect the block was not right and re-download it?

 

If during a reindex, Sync detected a mismatch, how would it know whether or not to propagate the change to other clients, or re-download the file from other clients?

 

For example, say I shut down Sync on one device, modified a file but left its file size and mtime the same. When I started Sync, I assume it would re-index all the shares. Would it "fix" my altered file by replacing it with data from other devices, or distribute the modified file to other devices?

Link to comment
Share on other sites

Glad to hear you don't consider Sync to be a backup solution, my comment about that was meant to be general, not accusatory.

 

As with most synchronization and backup solutions, when Sync periodically indexes a share, it only looks for metadata changes (times, sizes). Anything else would be too resource intensive. Sync only looks at the content when it's actually preparing to propagate changes. As long as bit rot doesn't affect the metadata, it won't propagate spontaneously.
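To make the point concrete, here is a small sketch of a metadata-only change check and a simulated bit flip. It is an assumed model of an index scan that compares only size and modification time, not Sync's actual code, and `metadata_snapshot`, `looks_changed` and `flip_bit` are hypothetical helpers:

```python
import os


def metadata_snapshot(path):
    """What a metadata-only index scan remembers: size and modification time."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def looks_changed(path, snapshot):
    """True if the scan would consider the file changed."""
    return metadata_snapshot(path) != snapshot


def flip_bit(path, offset, bit=0):
    """Simulate bit rot: flip one bit in place and restore the old timestamps."""
    st = os.stat(path)
    with open(path, "r+b") as f:
        f.seek(offset)
        byte = f.read(1)[0]
        f.seek(offset)
        f.write(bytes([byte ^ (1 << bit)]))
    # Real bit rot happens below the filesystem and never touches mtime,
    # so put the original timestamps back after the write.
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))
```

After `flip_bit`, `looks_changed` still returns `False` because size and mtime are untouched. Under this model the damaged content is neither propagated nor noticed until something else changes the file's metadata.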

 

There are other solutions to the bit-rot problem involving the periodic generation of hashes on filesystem content - which is what ZFS has built-in.

Link to comment
Share on other sites

  • 3 weeks later...
  • 3 months later...

This is quite an important feature. I would not feel comfortable synchronising all that data across devices if a bit rot error were to propagate.

 

I would even like to get into the source and make a fork just for this, but as far as I know there is no source code available, only a dev API.

Link to comment
Share on other sites

  • 1 year later...

i know this thread has been dead for a while but i'd like to revive it. it would be a game changer if resilio sync incorporated a checksumming feature. right now the only way to ensure bit rot doesn't propagate is to have every connected device run a file system with built-in redundancy (ZFS, BTRFS, ReFS) or some third-party checksumming tool. while this is tolerable on personal machines that you oversee, it's much more difficult to control on family, friends', and colleagues' devices. even if we do all the heavy lifting of setting up a resilient file system, it can all be undone by a corrupted file on a peer.

 

this could be solved by an optional checksum setting ("file integrity check") in resilio sync, so that if a file is corrupted it can be fixed by a peer with the correct copy. a checksum file would be very small in comparison to the actual data.
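A hedged sketch of what such a checksum file could contain: a manifest mapping each relative path to a SHA-256 digest, plus a verification pass. The manifest layout and the names `build_manifest` and `verify` are assumptions for illustration; nothing here reflects an actual Resilio Sync feature:

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash one file in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root, '/'-separated) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            manifest[rel] = sha256_of(full)
    return manifest


def verify(root, manifest):
    """Return the sorted relative paths that are missing or no longer match."""
    bad = []
    for rel, expected in sorted(manifest.items()):
        full = os.path.join(root, *rel.split("/"))
        if not os.path.isfile(full) or sha256_of(full) != expected:
            bad.append(rel)
    return bad
```

Each entry costs 64 hex characters plus the path, so the manifest stays small next to the data it describes. Any path returned by `verify` is a candidate for re-fetching from a peer whose copy still matches.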

Link to comment
Share on other sites

  • 2 years later...

I don't get it... is nobody interested in this feature? I think it is THE most important feature. Without checksum control, the whole Resilio Sync system is useless: one file gets corrupted and it is synced across all connected devices.

OK, you could theoretically use version control for this. But then you have to disable the delete time, and you end up with many versions that you later have to delete manually. Before deleting, you have to check whether a file was corrupted, because if you delete an older version and only find out some time later that the current file is broken, you no longer have a good copy. Do that for a thousand files. Forget it. I want a sync system that I can trust 100% and that does not produce useless files.

If I remember correctly, the official answer was something like "the filesystem must implement checksum control". The problem is that there is no filesystem for Windows with checksum control implemented. The one MS has developed so far (I forgot the name) cannot be used in Windows 10. It is the same for Android: there is no filesystem with checksum control. ZFS and BTRFS are the only filesystems I know of that have this feature, and they are just for Linux.

Could one of the official Resilio Sync devs explain to me why this feature hasn't been implemented so far? Thanks a lot. I hope someone answers.

Link to comment
Share on other sites

  • 4 months later...

I guess it could be useful to have it built in.

When I have synced data that is archived, I create a PAR set using MultiPAR.

That way I can CRC-check the data and fix any inconsistencies that are found. It's no good for files that change dynamically, but it gives me some comfort that the data is still good, wherever it is synced to.

 

 

Link to comment
Share on other sites

  • 3 months later...
On 8/8/2014 at 3:01 PM, piotrnik said:

really cool feature, especially since many of the items I've synced are exactly that - photos, movies, music, disk images, etc - that rarely change

Same for me with finished projects, photos, movies, music, software packages, etc.

Link to comment
Share on other sites

  • 2 weeks later...

Another thought: what can Resilio (and even ZFS) do on top of what is already done by modern hard drives and ECC? All uncorrectable hard drive errors are already reported by the OS, and have been since something like the 1960s, no?

Are you sure you are experiencing bit rot, and not some software bug affecting your data?

Link to comment
Share on other sites
