borahshadow

Do I Need RAID With BitTorrent Sync?


I've been very interested in and planning on using BitTorrent Sync to keep my family files synced between our different households.

 

Here is my previous setup.

Central Linux file server with 2 HDDs in RAID 1. I've been using that setup for a number of years now, starting back when I thought that RAID was a backup solution. I am now aware that it is not a good backup solution (at least by itself).

 

Planned setup:

  • Primary server at Location #1 with RAID 1 mirroring (Linux)
  • Secondary 'Server' at Location #2 with a single WD Red drive. Basically a 24/7 workstation (Win 7)
  • Tertiary 'Server' at Location #3. Essentially a low powered NAS device at that location. Investigating a RaspberryPi. This location is planned for but not guaranteed at this point.

 

Those three nodes will sync all the data. There will be several other nodes (laptops, other workstations, smart phones) that have separate sync profiles to sync only a subset of the files (just pictures or music, or a folder specific to an individual family member).

 

Main question: If I'm syncing this between three (+) nodes how important is it to have RAID on the primary server?

 

Other question: Is there any problem with my sync strategy? I.e., all data synced between three nodes, and then smaller sync profiles to sync specific subdirectories with "second level" nodes like laptops or phones? Is this a possible case for sync conflict? Hypothetical: Location #1, which houses the primary server, also has User #1, who has a laptop and smart phone. User #1 wants his Music files to sync to his laptop, phone, and the server network, but doesn't want all the pictures on his laptop or phone due to obvious storage space limitations. There is already a profile on the primary server to sync the root directory of the data drive, but he sets up a second profile for the Music directory (a subfolder of the root profile) to share it with his laptop and phone. Will any issue arise?

 

PS. I'm aware that just as RAID isn't a backup solution on its own, neither is BitTorrent Sync. A corrupted or deleted file (be it by a virus or a user, it doesn't matter) will propagate across all nodes as well, just at a slower speed than with RAID. So unless an error is detected quickly enough to take a node offline, it will still sync. I will be using a second service (such as CrashPlan or similar) to make a regular incremental backup from one of the 3 main nodes to a second HDD in one of the other main nodes, so that there is a true offsite backup of the data.

 



 

 

The only way I can think of to respond to your main question is: if you implemented RAID 1 when you thought it was a backup solution, and you now recognize that it isn't, then after implementing a proper backup solution, maybe you no longer need RAID 1. I'm not sure this has anything to do with whether or not you're using Sync.

 

As for your other question, I seem to remember reading somewhere in these forums that this is an unsupported configuration and not recommended, but I'm not sure.

 

There is a feature request topic about Selective Sync. Maybe that's what you're looking for.

Edited by trevellyan


Well, the core idea and algorithms behind and inside btsync are those of RAID-1.

In storage technology, you have a logical volume above two or three physical disks:

  • a write is performed by writing the same information to every disk and waiting until all disks say "Written!"
  • a read is performed by reading from the disk that replies fastest
  • call the number of disks (that is, the number of distinct copies of your data that exist in the array) the quorum or goal
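
The three points above can be sketched as a toy model. This is only an illustration: the names `Mirror`, `write`, `read` and the `latencies` argument are made up for this sketch, and it is not how a real RAID controller (or btsync) is implemented.

```python
class Mirror:
    """Toy RAID-1 style mirror: every write goes to all disks."""

    def __init__(self, disk_count):
        # Each "disk" is just a dict mapping block number -> data.
        self.disks = [dict() for _ in range(disk_count)]

    @property
    def quorum(self):
        # Number of distinct copies of the data in the array.
        return len(self.disks)

    def write(self, block, data):
        # Synchronous write: return only after every disk holds the data.
        for disk in self.disks:
            disk[block] = data
        return self.quorum

    def read(self, block, latencies):
        # Read from whichever disk answers fastest. `latencies` is a
        # hypothetical list of response times, one entry per disk.
        fastest = min(range(self.quorum), key=lambda i: latencies[i])
        return fastest, self.disks[fastest][block]
```

With two disks, `Mirror(2).write(0, b"x")` stores the block on both, and a later `read` is served by whichever disk has the lower latency value.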

As soon as you share a (new) folder with btsync, you have what is called a degraded RAID-1 array. Think of the folder as the single disk above. As soon as another instance of btsync connects to that folder (that is, you run btsync on another computer and add a folder for the key you shared on the first one), the pair of folders becomes a logical RAID-1 array. You can add more btsync instances that connect to the same share, that is, you can increase the quorum of that share as much as you like (actually there is a hardwired limit of 50 directly connected btsync instances).

So, what are (some of) the differences between a RAID controller and btsync?

  • a RAID controller manages data on locally connected disks; btsync manages a portion of a local disk
  • a RAID controller usually writes data synchronously; btsync is completely asynchronous
  • an array created by a RAID controller has a fixed goal/quorum for the data you write, which you set at creation time; btsync has a dynamic quorum that can go up (when more computers join a share) and down (when computers disconnect from a share)
  • a RAID controller performs a full rebuild of the array to get back to the fixed quorum/goal as soon as possible; btsync simply tries to update what changed, asking any other node with a newer copy for the missing/outdated data.

 

In short, as soon as you have enough computers/nodes running btsync and joining the same share to match the quorum that fits your needs and synchronization is effective, you have a geographically distributed RAID array.
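
To make the "dynamic quorum" and "only update what changed" points concrete, here is a rough sketch. Everything in it is an assumption for illustration: the `Peer` and `Share` classes, the `join`/`leave`/`sync` methods, and the per-file version counter are invented here and say nothing about btsync's actual protocol.

```python
class Peer:
    """One computer that has joined a share."""

    def __init__(self, name):
        self.name = name
        self.files = {}  # path -> (version, content)


class Share:
    """Toy share whose quorum is simply the number of connected peers."""

    def __init__(self):
        self.peers = []

    def join(self, peer):
        self.peers.append(peer)   # quorum goes up

    def leave(self, peer):
        self.peers.remove(peer)   # quorum goes down

    @property
    def quorum(self):
        return len(self.peers)

    def sync(self):
        """Copy only missing/outdated files; return how many copies were made."""
        # Find the newest known version of every file across all peers.
        newest = {}
        for peer in self.peers:
            for path, (version, content) in peer.files.items():
                if path not in newest or version > newest[path][0]:
                    newest[path] = (version, content)
        # Bring each peer up to date, touching only what differs.
        copied = 0
        for peer in self.peers:
            for path, entry in newest.items():
                if peer.files.get(path, (-1, None))[0] < entry[0]:
                    peer.files[path] = entry
                    copied += 1
        return copied
```

Unlike a RAID rebuild, nothing here recopies the whole data set when a peer rejoins; `sync()` only transfers the files whose version is behind.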


I guess what I'm asking is somewhat related to a different thread I started, but I felt that they were distinct enough questions to warrant a thread of their own.

 

I recognize that RAID is not a backup solution but an availability and redundancy provider. It appears that Sync falls into this same category. If I'm syncing two locations, I feel that is a decent level of redundancy... but I guess it comes down to how reliable the redundancy is.

 

Eazq, thank you for that reply. That's what I was getting at. If BtSync behaves as a geographically distributed redundant array (of independent machines), then is having a second redundant array really necessary, unless I truly need to achieve 100% uptime through local redundancy?


For the backup... if you mean revisions of files (file history, the whole life of a file, etc.), you should probably put a btsync node on storage that can do near-CDP, such as ZFS snapshots. FreeNAS, for instance, has official support for btsync. A single FreeNAS node with btsync and a properly configured snapshot task will keep the history of a file/folder for you. If THAT node fails, you won't lose the current contents of the folders, because the quorum is still 2 (the other nodes), but you will lose the whole file history. Unless you set up another FreeNAS box to receive the snapshots created on the first one: this second ZFS box won't need to run btsync (that is, it will just be a slave node that keeps the history and won't be part of the replication set defined by btsync).
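
As a rough illustration of why snapshots give you history while sync alone does not, here is a toy model. It is not ZFS and uses no ZFS commands; `SnapshotStore`, `snapshot` and `restore` are names invented for this sketch.

```python
import copy


class SnapshotStore:
    """Toy stand-in for a snapshotting filesystem (not ZFS itself)."""

    def __init__(self):
        self.live = {}        # current contents: path -> data
        self.snapshots = []   # list of (label, frozen copy of live)

    def snapshot(self, label):
        # Freeze the current state; later changes do not affect it.
        self.snapshots.append((label, copy.deepcopy(self.live)))

    def restore(self, path, label):
        # Fetch a file as it was when the named snapshot was taken.
        for name, frozen in self.snapshots:
            if name == label:
                return frozen[path]
        raise KeyError(label)
```

If a file is deleted from `live` (for example because a deletion was synced in from another node), it can still be read back from an earlier snapshot, which is the part plain syncing cannot do.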


@borahshadow

One little note. It looks like in your setup you are going to sync the whole array of data, while smaller subsets of data will be synced from the storage device to local endpoint devices (desktops, laptops, etc.).

 

Note that if you are going to use Sync to deliver subsets of data to local endpoint devices, you are very likely going to use "nested folders", i.e. sync folders inside other sync folders. Their usage is limited to read-write keys only, i.e. nested folders do not allow an RO key for either the parent or the child.
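
The rule above can be checked on paper before setting anything up. The sketch below is only a planning aid, not part of Sync: the function name `nested_key_problems` and the `"rw"`/`"ro"` labels are assumptions made for this example.

```python
from pathlib import PurePosixPath


def nested_key_problems(folders):
    """Find nested sync folders that break the read-write-only rule.

    folders: dict mapping a sync folder path to "rw" or "ro".
    Returns sorted (parent, child) pairs where either side uses an RO key.
    """
    paths = {PurePosixPath(p): key for p, key in folders.items()}
    problems = []
    for child, child_key in paths.items():
        for parent, parent_key in paths.items():
            # `parent in child.parents` is true only for real ancestors.
            if parent in child.parents and "ro" in (parent_key, child_key):
                problems.append((str(parent), str(child)))
    return sorted(problems)
```

For example, `{"/data": "rw", "/data/Music": "ro"}` is flagged, while the same pair with both keys set to `"rw"` is not.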



 

I am considering doing the same thing: building a high-quality machine (FreeNAS with ECC RAM) for primary use, and then low-cost FreeBSD machines to be placed at the houses of family members. I will enable DLNA and Samba so that we can all drop content into the swarm and view each other's stuff.

 

But what if my nephew deletes the folders? So I will have a fat drive (USB or network, but no btsync), do a manual copy every so often, and take that drive off site to a friend's house or something.
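
A manual copy like that could be scripted along these lines. This is a minimal sketch, assuming the synced folder and the offline drive are both reachable as ordinary paths; the function name `copy_to_offline_drive` and the `backup-YYYY-MM-DD` folder naming are made up here.

```python
import shutil
from datetime import date
from pathlib import Path


def copy_to_offline_drive(source, drive_root, today=None):
    """Copy the whole synced folder into a dated folder on the offline drive."""
    stamp = (today or date.today()).isoformat()
    target = Path(drive_root) / f"backup-{stamp}"
    # copytree raises FileExistsError if this date's copy already exists,
    # so an earlier copy is never silently overwritten.
    shutil.copytree(source, target)
    return target
```

Because each run lands in its own dated folder, a deletion that happens later in the swarm does not touch the copies already on the drive.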



 

@borahshadow

To share my experience with mobile (Android only): my phone has only 16 GB of storage. It is easy to exceed that limit after taking photos and videos with the phone. They all sync to my Ubuntu server, but I then need to move the photos and videos (already synced to my server) to another folder, so that the phone's storage is freed once btsync syncs again.

 

I am now trying to write code on my Ubuntu server to automatically move the data from my mobile sync folder to another folder once it detects that the phone is 80% full, and then automatically clean the phone through the next btsync pass. That way the phone will always have room for my photos and videos.
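
A minimal sketch of that idea might look like the following. It rests on several assumptions: the server cannot see the phone's free space directly, so the total size of the synced folder is compared against a `capacity_bytes` value you supply; the function name `offload_if_full` is made up; and hidden files and folders (names starting with a dot) are skipped on the assumption that Sync keeps its own bookkeeping files there.

```python
import shutil
from pathlib import Path


def offload_if_full(sync_dir, archive_dir, capacity_bytes, threshold=0.8):
    """Move files out of the phone's sync folder once it is too full.

    Returns the list of destination paths that were moved (empty if
    usage is still below the threshold).
    """
    sync_dir, archive_dir = Path(sync_dir), Path(archive_dir)
    files = [
        p for p in sync_dir.rglob("*")
        if p.is_file()
        and not any(part.startswith(".") for part in p.relative_to(sync_dir).parts)
    ]
    used = sum(p.stat().st_size for p in files)
    if used < threshold * capacity_bytes:
        return []
    moved = []
    for src in files:
        # Keep the same relative layout inside the archive folder.
        dest = archive_dir / src.relative_to(sync_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

The archive folder should live outside any sync folder; the files then disappear from the synced folder on the server, and the next sync pass is what would clear them from the phone.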

 

@RomanZ

At the beginning, I understood that btsync did not support "Nested Sync Folders". From my understanding of your wording above, it is supported now, but RW only. That is very helpful for a sub-folder that syncs to a mobile (which has limited storage, such as 16 GB) while a larger folder (including this sub-folder) syncs to my notebook. So a single tree of sync folders can hold both the notebook data and the mobile data. Before, I needed two separate trees of sync folders, which split up my data tree in a way I did not like.


@chungyan5

There is limited support for nested folders. They are allowed only if both the nested and the parent folder are RW, and technically Sync treats them as completely different folders.


If nested folders are treated as separate folders then why couldn't I just set up different sync profiles for the same folder? If I had main documents folder synced between my laptop, server, and other servers and then wanted the photos and music sub directories of that folder to sync to my phone why couldn't I just set up a sync profile on my laptop of the music folder and share it to the mobile?


By "Sync Profile" I just mean a folder that is set up to sync.

 

I just tried a simple experiment. On my computer I have a Music folder set up to sync. I clicked Add Folder and selected a sub-folder of Music, and it let me add it to the list of folders, including adding it as read-only.


RAID 1 will improve your read performance on that one server and protect you from the failure of one hard drive; that is about it.

If you had a secondary desktop/computer at Location 1, I would remove the RAID array, take the extra hard drive from that array, add it to the secondary computer, and add that computer to your BitTorrent Sync backup. This will be a casual server: it doesn't have to be on 24/7, but when it is, it can receive all of your backups from your main server over the LAN, which is much faster than your internet connection.

 

This will give you these advantages:
If anything else happens to your main server (power supply, motherboard, RAM, or hard drive dies), you still have your secondary computer.

The secondary desktop has the entire sync on its local hard drive for easy, fast access.

Your local devices may be faster at copying with two servers, but I don't think you will notice.

Oh yes, on Location 3: I have heard that Raspberry Pis do not have enough processing power to handle BitTorrent Sync, especially with an entire backup.

 

You might have better luck with something like this
http://www.newegg.com/Product/Product.aspx?Item=N82E16856501007

 

You may still want something with more power, but it sounds like you're looking for cheap and low power.


@borahshadow

Just tried to do the same in the lab - the UI does not allow it. Do I understand correctly that you've got a sync folder with an RW key, and inside it there is another folder added with an RO key?

 

You can copy the RO key from a nested folder, but the nested folder can't actually have an RO key. Can you describe in detail, step by step, what you were doing (or capture a short video)?

