Hi all – I’m looking for a replacement for an antiquated (and slow) NFS configuration that we use for our servers (about 40 total, but with several different volumes) and am testing some different software, including lsyncd, csync2, syncthing, and btsync. I haven’t seen a lot of LAN-only or enterprise threads here, but maybe that discussion has mostly been limited to clients and support. I’m hoping you guys could offer some insight. First, let me list the things I’m looking for:
100% internal – no access to an external tracker, and I want no discovery/broadcast traffic of any sort.
Each synced folder would be shared among 4-12 servers across several datacenters (private WAN) of varying latencies.
Each synced folder could contain a few dozen or a few hundred files; we haven’t yet determined whether we’d use this for deployment or just for keeping potentially dynamic data in sync.
Each server is considered commodity, i.e. everyone is equal. No centralized tracker, no master server of any sort. Everything is an equally disposable asset.
I need to be able to add and remove servers at any time.
Some of my concerns:
Race conditions – something I ran into when using the lsyncd + csync2 combo is that when a sync was triggered with more than two nodes, there was a cascade effect. Copying files from node 1 -> nodes 2,3,4 would cause node 2 to sync to nodes 1,3,4, then node 3 to sync to nodes 1,2,4, and so on. In theory this distributes data more quickly, but it’s bad because you have more than one server trying to copy a single file to the same server. Load increases on all servers, and network congestion becomes an issue with large files.
Adding new servers – if I join a new server that has no files in a synced folder to the cluster, what happens? Is there ANY chance that the existing servers will experience data loss? I cannot risk data loss in any way, shape, or form.
Max files and/or servers before performance degradation?
Would adding additional nodes of higher latency affect the replication speed between the nodes with lower latency?
How are split-brain scenarios resolved? For example, say you have two datacenters with four nodes each and the WAN connection is broken for five minutes, but changes are made on both sides of the WAN in the meantime. When the connection is restored, how does btsync deal with that?
How does btsync handle files that are currently being written to? For example a file that is being uploaded or a log file that is continuously being written to for 30 seconds.
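To make the cascade concern above a bit more concrete, here is a rough Python sketch of what I think I was seeing. It is only a toy model under my own assumptions: every node that receives a file for the first time blindly pushes it to every other node, and nothing is skipped via checksums. The function name count_push_attempts is just made up for illustration and has nothing to do with how lsyncd, csync2, or btsync actually work internally.

```python
from collections import deque


def count_push_attempts(node_count):
    """Return (attempted, needed) pushes for one changed file.

    Toy model (an assumption, not real tool behaviour): node 1 changes a
    file, and every node that gets the file for the first time pushes it
    to all of its peers without checking whether they already have it.
    """
    nodes = range(1, node_count + 1)
    has_file = {1}          # node 1 is where the change originates
    pending = deque([1])    # nodes that still have to run their push
    attempted = 0

    while pending:
        sender = pending.popleft()
        for receiver in nodes:
            if receiver == sender:
                continue
            attempted += 1  # every push attempt costs load and bandwidth
            if receiver not in has_file:
                has_file.add(receiver)
                pending.append(receiver)

    needed = node_count - 1  # ideal case: each other node receives it once
    return attempted, needed


if __name__ == "__main__":
    for n in (4, 12):
        attempted, needed = count_push_attempts(n)
        print(f"{n} nodes: {attempted} push attempts, {needed} needed")
```

In this model, 4 nodes produce 12 push attempts where only 3 are needed, and 12 nodes produce 132 where only 11 are needed, which is roughly the kind of growth I’d like to avoid.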
I’m sure some of these things have been covered elsewhere in the forum but I figured it might be good to get them all in an “enterprise” thread. Any input at all is appreciated, thanks in advance!