is it possible: sync only to and from central server?



hi,

i really like btsync as a means to sync files on my machines.

so far i tried

- dropbox: meh, not encrypted, not my "cloud"

- owncloud: client sucks, period

- aerofs: could not get it running, felt sluggish

- btsync: you call it alpha... feels stable, performs :D

so here's my scenario:

I'd like to sync not from any client to any client, but only to and from a central server. This makes the most sense for me, as the server has a superior connection and is always up and running, whereas clients might not be. Also, I want to avoid sending contents from a node to every other node, because downloading will be fastest from the central server.

I tried to disable tracker and only use known hosts where, on each client, i entered the central server.

but, using the central server, the clients (all other nodes) find each other and start to sync each other.

Is there any way to prevent this? Is there a way to only have traffic go to and from a central server?


What you're trying to achieve isn't currently possible - remember, BitTorrent Sync "automatically syncs files between computers via secure, distributed technology."

Therefore, if you have a number of devices all sharing the same "secret", these devices will all communicate with each other - you can't "pick and choose" which devices within a sync connect to which other devices - this would defeat the point of it being "distributed".

Now, you could have a "central"/"master" server with a "read only" secret which would allow content on that device to disseminate to all other devices, and those devices won't transfer anything back to the "central server" - but even then all the other devices would still communicate with each other and transfer content between them!

...so in essence, no, it's not currently possible to achieve what you're after with BitTorrent Sync


do you think it's a possible feature?

the basis could be to have an option which says: only communicate with known hosts. That's all it would require.

I understand the initial use case for btsync, but it's this >< close to being a drop-in replacement for many cloud storage and smaller setups in general. it would be the perfect tool :D


I'm puzzled by your statement that downloading from the central server will be fastest. Even if your server is 10 times faster than any node, it will still be quicker to download from the server *and* some nodes than from the server alone. The only time that isn't true is if the server's bandwidth is greater than your download bandwidth. But in that case it doesn't matter where the data is coming from, you're still getting it as fast as you can.


Actually, greenone83 isn't completely wrong.

BTSync is based on a BitTorrent engine. If it works perfectly, a newly created file on one peer will be sent one piece at a time to another peer. This peer will send each piece on to the next peer as it's received, and so forth. This means the total time to send the file is the time it takes the seeder to upload the file, plus the time it takes every other peer to upload the last piece to its next peer.

This is a lot faster than uploading the file to a traditional server then waiting for every peer to download it using the server's bandwidth.
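To put rough numbers on that comparison, here is a minimal sketch. It assumes an idealized chain in which every relaying peer uploads at least as fast as the seeder, and a "traditional" server that only starts serving once the upload has finished and splits its upstream evenly between the downloaders. The function names and the example figures are made up for illustration; they are not anything BTSync exposes.

```python
def pipelined_chain_seconds(file_mbit, piece_mbit, seeder_up, relay_ups):
    """Idealized pipelined transfer along a chain of peers.

    Sizes are in megabits, rates in Mbit/s. The seeder uploads the whole
    file once; each relaying peer only adds the time to pass the *last*
    piece on to the next peer. Assumes no relay is slower than the seeder.
    """
    seeder_time = file_mbit / seeder_up
    last_piece_hops = sum(piece_mbit / up for up in relay_ups)
    return seeder_time + last_piece_hops


def store_and_forward_seconds(file_mbit, seeder_up, server_up, n_peers):
    """Upload to a server first, then all peers download at the same time.

    Assumes the server's upstream is shared evenly between the n_peers
    downloaders, so they all finish together.
    """
    upload = file_mbit / seeder_up
    download = n_peers * file_mbit / server_up
    return upload + download


if __name__ == "__main__":
    # Hypothetical numbers: a 100 MB file (800 Mbit), 4 MB pieces (32 Mbit),
    # a seeder plus four receiving peers, everything at 1 Mbit/s upstream.
    print(pipelined_chain_seconds(800, 32, 1.0, [1.0, 1.0, 1.0]))
    print(store_and_forward_seconds(800, 1.0, 1.0, 4))
```

With equal upstream speeds everywhere the pipelined chain wins easily; raising `server_up` shrinks the second figure, which is why a server with a much faster uplink changes the picture.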

BUT, the initial seeder isn't perfect. It may upload a piece more than once.

To prevent this, a mode where one of the peers is the designated seeder would improve speed. I.e., if the designated seeder doesn't have a full copy of a file and you do, you should only send pieces to that peer. If you're a leecher, you work as normal, passing pieces to anyone who wants them. If you're the designated seeder, you again work in the normal BitTorrent fashion (except perhaps you're a bit more trusting and never blacklist a peer).

The difference would often be small, but if some of the peers are unstable it could make a huge improvement.

PS: There are, of course, 101 other reasons that one particular known peer must have a copy ... it's the only one that's always turned on ... it's the one with the tape drive ... it's the office ... etc. etc.


only communicate with known hosts. That's all it would require.

Well, you can already set BitTorrent Sync to "only communicate with known hosts" by using the "Pre-defined hosts" settings, and unticking Tracker, Relay, and Search options....but even if you specify "pre-defined" hosts, those hosts will still all communicate with each other!

...but it's this >< close to being a drop-in replacement for many cloud storage and smaller setups in general. it would be the perfect tool :D

BitTorrent Sync isn't a "cloud storage" solution - it's primarily a "file synchronization" tool. At this time, there have been no indications from the developers that this core nature of BitTorrent Sync is likely to change.


Also, I want to avoid sending contents from a node to every other node, because downloading will be fastest from the central server.

I think if you use btsync it's even faster when clients also download from other clients! You could use read-only keys so you can be sure the files on your server aren't changed.


I'd like to sync not from any client to any client, but only to and from a central server. This makes the most sense for me, as the server has a superior connection and is always up and running, whereas clients might not be. Also, I want to avoid sending contents from a node to every other node, because downloading will be fastest from the central server.

The assumptions you're making here are flawed, because BT Sync is multipoint-to-multipoint capable. This means that EVERY system holding a part (4 MB) of a file can also spread those parts around (just like regular BitTorrent transfers).

There is actually zero benefit to only downloading files from the central server over from all the member systems because you are also given the member systems' upload bandwidth to your downloads. You're assuming that all downloads of a given file come exclusively from a single source when they start. They don't.


No, Harold, you don't understand the maths here.

The performance of bittorrent is not because any node can send to any other node.

The same performance can be gotten in a centrally controlled scheme where Node A sends data to Node B who sends it to Node C who sends it ... and so forth. The important point is the 'chunking' or pipelining of the file transfer where Node B starts sending data to Node C as soon as it has any data. It does not wait for Node A to finish sending all the data before it starts sending to the next node.

The multi-point processing is still very important, because it's the way that bittorrent is able to "simulate" this strict "bucket brigade" of data transfer with a random and sometimes uncooperative collection of heterogeneous peers.

This also explains why "super-seeding" is beneficial when starting up a swarm; for most swarms, when there is only one seeder, the upload speed of that seeder is the limiting factor on how fast the data can get into the rest of the peers. If the initial seeder makes sure they only send each block into the swarm once, they are acting much more like "Node A" in the perfect scenario above.

The problem with "super-seeding" is that some peers are unreliable and will lose blocks that are sent to them. They not only lose the blocks for themselves but also for the rest of the swarm. OTOH if the initial peer can "super-seed" to a high performance "server" that server will be able to send out more than one copy of every block it receives. This improves reliability and may improve speed if some of the other peers have a poor upload performance.

But, your last paragraph is right on the money. There is no benefit to downloading from a central server, unless, of course, that's where the next block is.

PS: I've actually done that pipelined scheme before by running a pair of "netcat" commands and a "tee" command on each peer. You need a good switch, but when it works it's very very quick. UDPcast is just as quick though, and will work with a hub.


i disabled everything but search in lan... i presume lan does not include a remote (known) host to further search for other nodes connected to it?

and as far as math goes for _my_ particular scenario:

node_a: adsl 1mbit upload / 50 mbit download; sometimes

node_b: adsl 2mbit upload / 100 mbit download; sometimes

node_c: 100mbit dual; server 24/7

so a usual scenario is this:

node_a provides a new file and now uploads it to node_b and node_c; it uses its bandwidth to provide both nodes with chunks of the file. node_b is a real person, so it is interested in the file; node_c is just a server.

so if node_b was offline, node_a would only send to the server node_c, and later node_b could fetch the file from there, fast.

but whenever >= 2 nodes are online, any of the slow clients unnecessarily starts to use its precious bandwidth again to also upload to the node that requires a file.

it would be totally possible to build a cluster of known hosts which are servers and help distribute.

but the smaller nodes (clients) should only be concerned with uploading _once_ to the one node, and should also preferably download only from the fast servers.

so basically i want a star network: no small client-ish node should ever talk directly to any other small client-ish node; instead a fast central server (or bunch of servers) should receive uploads and distribute downloads.

I hope I could make it clear :-)
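The numbers in this scenario can be checked with a few lines of arithmetic. The sketch below uses the link speeds quoted above; the 100 MB file size and the helper name `transfer_seconds` are made up for illustration, and it only models a single sender feeding a single receiver, not the piece swapping a BitTorrent-style swarm actually does.

```python
# Link speeds in Mbit/s, taken from the post above.
NODES = {
    "node_a": {"up": 1, "down": 50},     # ADSL client, sometimes online
    "node_b": {"up": 2, "down": 100},    # ADSL client, sometimes online
    "node_c": {"up": 100, "down": 100},  # the 24/7 server
}


def transfer_seconds(size_mbit, sender, receiver, nodes=NODES):
    """Time for one sender to push a file to one receiver.

    The slower of the sender's upstream and the receiver's downstream
    limits the transfer; protocol overhead is ignored.
    """
    rate = min(nodes[sender]["up"], nodes[receiver]["down"])
    return size_mbit / rate


FILE_MBIT = 800  # hypothetical 100 MB file

a_to_server = transfer_seconds(FILE_MBIT, "node_a", "node_c")
server_to_b = transfer_seconds(FILE_MBIT, "node_c", "node_b")
a_to_b = transfer_seconds(FILE_MBIT, "node_a", "node_b")

if __name__ == "__main__":
    print(f"node_a -> node_c: {a_to_server:.0f} s")
    print(f"node_c -> node_b: {server_to_b:.0f} s")
    print(f"node_a -> node_b: {a_to_b:.0f} s")
```

Under these assumptions node_a's 1 Mbit/s upstream is the bottleneck either way; what the star layout buys is that node_a spends that upstream only once, and node_b's later fetch from the server is short by comparison.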


feeling like going nuts with trying to reverse engineer the client's behavior :D

is there some explanation of which options go together, what makes sense, and what effect each of the (mostly network-related) options has?

for instance:

what's the relation between use_upnp, lan_use_tcp, search lan and so forth?

if i don't use tracker and dht but _only_ predefined hosts, do i need to open additional ports?

on my linux machine with iptables i opened the port i referenced in the config on both tcp and udp... clients don't sync unless i use the tracker option. interesting though: if i watch both logs, they both ping each other ... but nothing is synced

any detailed info on those options and iptables/linux/btsync in general?


If you want a star layout, BTSync is NOT for you.

BTSync is pure mesh.

And a star is just a mesh where some of the links are unused.

If the file happens to be created on the server with very fast uplinks, none of the other nodes will send anything to each other; BTSync WILL act like a star network.

This is the advantage of Bram's BitTorrent protocol: it adapts to what it knows about the peers.

What I (and I suspect now greenone83) am now proposing is to help the protocol by giving it the very important piece of information that it would be a really, really good idea for the big fast reliable node (with a cheap upstream) at IP:Port to have a copy of the file.

This doesn't change how the rest of the network works. This doesn't impact the performance UNLESS that node has a worse upstream than you. It just changes the choice of which node to send the next piece to.

If the node is busy a piece would be sent elsewhere, but considering the asymmetrical nature of ADSL lines this is quite unlikely.


feeling like going nuts

It's not too difficult

  • use_upnp: this is for firewall maintenance. When turned on, it attempts to tell the firewall that BTSync's random port will want incoming connections. Turn it on to improve connection reliability.
  • search_lan: this tells BTSync that it's allowed to send out UDP multicast packets from port 3838 to port 3838. These packets will only ever traverse nearby networks, probably just the local LAN. They do not need to get past any firewall. Only one side needs this turned on for it to work.
  • use_dht: send BTSync's random port, IP and info hash to the DHT network. Any (public) peer may search for the info hash and receive your IP and random port.
  • use_tracker: send BTSync's random port, IP and info hash to the hard-coded tracker. Any (public) peer may search for the info hash and receive your IP and random port.
  • use_relay: the peer is allowed to make indirect connections via central relay machines. Only needed if use_upnp and "NAT punching" fail.
  • check_for_updates: allows BTSync to phone home to ask what versions are available.
  • lan_use_tcp: make BTSync use TCP for the main data transfer to peers that are designated as LAN. The classification appears to include peers seen using "search_lan" and peers specified in the known peers list. This is a performance-enhancing option; it does not remove the need for UDP to successfully connect. The TCP connection uses the same server port number as the UDP random port; the source port is random.
  • lan_encrypt_data: encryption is an expensive operation. This option is used to turn off encryption for peers that are classified as LAN. I am uncertain whether this is the same classification as for "lan_use_tcp", but assuming it is, data will NOT be encrypted to known hosts if this is turned off.
  • use_known_hosts, known_hosts: a list of known hosts. BTSync will attempt to connect to these even without any other peer discovery turned on. A UDP connection must be established to allow data transfer. This may require BOTH ends to be configured with the other's address to provide NAT punching.

The "info hash" is an identifier for the share, it is related to the secret but cannot be used to discover the secret.

Peer exchange: BTSync does not share information about known peers. If a peer cannot be discovered by any of the above methods, a connection will not be initiated to it.

So to create a pure star network, turn off all the discovery options except for known peers and hard code the server IP:port in the clients. The server must be able to accept unsolicited UDP packets from anywhere. The use_upnp option may help with that; if not, the random UDP port will have to be forwarded on the server's firewall.

Turning all these off turns off all network access except for occasional DNS lookups for t.usyncapp.com. and r.usyncapp.com..
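As a rough illustration of that recipe, the sketch below builds the client-side settings as a plain Python dict and prints them as JSON. The key names are simply the option names used in the list above; whether the real config file spells and nests them this way is an assumption, and `server.example.com:12345` is a placeholder for the server's address and forwarded port.

```python
import json


def star_client_settings(server_host, server_port):
    """Client settings for the star layout described above.

    Every discovery method is switched off except the known-hosts list,
    which contains only the central server. Key names follow the option
    names used in this thread and may not match the actual config file.
    """
    return {
        "search_lan": False,
        "use_dht": False,
        "use_tracker": False,
        "use_relay": False,
        "use_known_hosts": True,
        "known_hosts": [f"{server_host}:{server_port}"],
    }


if __name__ == "__main__":
    # Placeholder address; substitute the server's real IP:port.
    settings = star_client_settings("server.example.com", 12345)
    print(json.dumps(settings, indent=2))
```

Every client gets the same settings, so the only address any of them ever learns is the server's.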

PS: I haven't checked but basic security rules would suggest that the relay will have to be "NAT punched", ie both ends will need to tell the relay that the connection is acceptable before any packets will flow. This means that it'll only work well with the tracker or perhaps the DHT.


  • 4 weeks later...

The only potential part that's missing from the "star recipe" is adding the smaller nodes (the ADSL clients) to the known hosts on the big node (the server). Besides that, I had all options set as explained and it did not work.

I think that it's LAN/firewall specific:

From my understanding, it's enough if both clients establish a connection with the server through a dedicated, fixed UPnP port. The server would then have established connections to talk to both clients.

So far I allowed UPnP and TCP on the server for the defined port. But that did not seem to be enough.

So does the server have to know the IP/port of the clients too and do they also have to be open to the outside the same way the server is?


I use NetBalancer on Windows, which gives me the option to upload to the server and spread from there. Use the program and make 2 rules: first, upload to the server IP with high priority; second, deny upload. Works perfectly and reduces upload traffic by about 300 %, because super-seeding is not implemented yet.


greenone83,

The central server doesn't have to know anybody's IP:port. It doesn't need any way of finding peers turned on. It will respond and construct a two-way link with anyone who talks to it (and has the secret key).

TCP/IP doesn't help in initiating connections; if you've only allowed TCP/IP on the server's firewall, the server will have to hole punch its firewall. Hole punching requires that both ends initiate the connection at approximately the same time, which means that to hole punch, both ends have to know the IP of the other.

UDP/IP must be open inbound at one end (the "server" end) to remove the need for hole punching.

BTSync does NOT need tcp open on a firewall and will not use it if it is open because the remote host's IP address will not be one of those designated as 'LAN'.
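One way to check whether inbound UDP really reaches the server is a throwaway echo test. This is only a sketch with made-up helper names: it exercises the firewall or port forward, not BTSync itself, and it assumes BTSync is stopped on the server while the listener borrows its port.

```python
import socket
import threading


def echo_once(sock, timeout=30.0):
    """Wait for one datagram on an already-bound socket and echo it back."""
    sock.settimeout(timeout)
    try:
        data, addr = sock.recvfrom(1024)
    except OSError:  # nothing arrived before the timeout
        return None
    sock.sendto(data, addr)
    return addr


def udp_probe(host, port, payload=b"probe", timeout=3.0):
    """Send one datagram and report whether the same bytes come back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(payload, (host, port))
            data, _ = sock.recvfrom(1024)
        except OSError:  # timeout, or the OS reported the port unreachable
            return False
        return data == payload


def loopback_self_test():
    """Run listener and probe on 127.0.0.1 to show the two halves fit together."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    port = listener.getsockname()[1]
    worker = threading.Thread(target=echo_once, args=(listener, 5.0))
    worker.start()
    ok = udp_probe("127.0.0.1", port)
    worker.join()
    listener.close()
    return ok


if __name__ == "__main__":
    print(loopback_self_test())
```

For a real check, bind a UDP socket to `("0.0.0.0", <configured port>)` on the server, pass it to `echo_once`, and run `udp_probe` from a client against the server's public address. A `False` result suggests the packets are being dropped somewhere in between.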

PS: fogbav, your posts have an edit button.


  • 6 months later...

*bump*

 

since I started this thread there have been a few new versions of btsync, and I'd like to know if it's possible now to have a "star" configuration where only a central node (server) communicates with a number of clients.

 

simply put:  if the central node is offline no client can sync, this is what i want to achieve.

 

currently i only use btsync in my lan and switch it off when i'm not in my lan, to make sure only my clients sync files here. However, I would also like to use it to back up files to my server and sync over it, but only in a star configuration.

 

regards 

