Small File Synching still slow


ChristianO

Hello there, I only found one really old thread on this (from 2014).

It seems that in 2021 Resilio still syncs slowly when there are a lot of small files. In my case, I am trying to sync a photo library. Adding a few hundred files works without trouble...

But adding the complete library to a new device means transferring 250 GiB / 50K files at an average speed of 500kb/s. That means I will have to leave the laptop and PC on during the daytime for multiple weeks!
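For reference, a quick back-of-the-envelope check of that estimate. This is only a sketch: "500kb/s" is ambiguous, so both readings (kilobytes and kilobits per second) are computed, and the 8 hours of uptime per day is an assumption, not a figure from the post.

```python
GIB = 1024 ** 3          # bytes in one GiB
SECONDS_PER_DAY = 86400
HOURS_ON_PER_DAY = 8     # assumption: machines are only on during the daytime


def transfer_seconds(total_bytes: int, bytes_per_second: float) -> float:
    """Time needed to move total_bytes at a constant rate."""
    return total_bytes / bytes_per_second


total = 250 * GIB

# Reading 1: 500 kilobytes per second. Reading 2: 500 kilobits per second.
as_kilobytes = transfer_seconds(total, 500 * 1000)
as_kilobits = transfer_seconds(total, 500 * 1000 / 8)

for label, seconds in (("500 kB/s", as_kilobytes), ("500 kbit/s", as_kilobits)):
    continuous_days = seconds / SECONDS_PER_DAY
    daytime_days = seconds / 3600 / HOURS_ON_PER_DAY
    print(f"{label}: {continuous_days:.1f} days continuous, "
          f"{daytime_days:.1f} days at {HOURS_ON_PER_DAY} h/day")
```

Under the kilobytes reading this comes to about 6.2 days of continuous transfer, or roughly 18.6 days at 8 hours a day, which matches the "multiple weeks" estimate.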

 

It appears that Resilio is sending files sequentially. No matter how fast the sequential sending of small files is, any protocol will incur an overhead of 2-3 RTTs per file. So throughput will always be limited by the protocol's round trips over the network and by disk access, no matter how fast my AC WLAN or disk is. The only way to speed this up would be to allow multiple small files to be transferred in parallel (simple, and works kind of okayish), or to have the software zip them together and transfer them as one big batch file (more performance, but harder to implement).
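A rough model of that per-file overhead, as a sketch only. The 5 ms RTT and the 8 parallel streams are assumed values for illustration, and the function name is made up; disk access and hashing are not modeled, and nothing here reflects how Resilio actually schedules transfers.

```python
def per_file_overhead_seconds(num_files: int, rtt_seconds: float,
                              rtts_per_file: int = 3, parallel: int = 1) -> float:
    """Total time spent waiting on per-file round trips.

    Each file costs rtts_per_file round trips. With `parallel` streams,
    the waits overlap, so the total is divided by the stream count.
    """
    return num_files * rtts_per_file * rtt_seconds / parallel


NUM_FILES = 50_000   # from the post
ASSUMED_RTT = 0.005  # assumption: 5 ms round trip on the local WLAN

sequential = per_file_overhead_seconds(NUM_FILES, ASSUMED_RTT)
parallel_8 = per_file_overhead_seconds(NUM_FILES, ASSUMED_RTT, parallel=8)

print(f"sequential: {sequential:.0f} s of round-trip waiting")
print(f"8 streams:  {parallel_8:.0f} s of round-trip waiting")
```

The model only shows how the waiting time scales with the file count, the RTT, and the number of streams; the absolute numbers depend entirely on the assumed RTT.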

It would be nice if Resilio could improve this! Or, if this is already present and I just failed to find the setting for it, please point it out to me!

  • 3 weeks later...

Resilio should already be sending multiple files at once. However, I don't think the 2.7.2 behaviour is the same as it was before, or as intended by the BitTorrent protocol. I recall seeing them publish a white paper showing how pushing out a file to something like 250 business users happened so much quicker. I have not observed what they claim at all: the clients do not intelligently split up the files and share different parts with different clients, so there is no observable speed increase.

My home connection is 600Mbps down, 16Mbps up. When I only synced with one server, the file would only be sent to me at about 40Mbps. I wanted more senders to max out my 600Mbps download at home, so I set up multiple servers in the cloud that all have gigabit (or 10 gigabit) up/down connections. I was expecting that 5 fast servers (geographically closer to each other) would sync the file faster amongst themselves and then maybe max out my 600Mbps connection by having 5 senders instead of 1. Instead, I found my home computer constantly maxing out its 16Mbps uplink to share the files with the cloud servers, which are closer to each other and all have higher-bandwidth connections. It ended up being slower, creating bottlenecks and running into bug after bug.

I think it's one of those things where Resilio employees do not use their own product, or else they'd know that this scenario runs into several bugs that ruin any BitTorrent efficiency. To be fair, this would take quite a bit of QA resources to set up and automate, but really they should have a test bed for automated QA, and I highly doubt they have that.

If you're only syncing between two peers, you should just use rsync for 10X the speed (especially with the latest rsync, which made HUGE improvements), at least to transfer a large number of files initially. Then let Resilio do the automatic, smaller transfers once everything is in sync.
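A minimal sketch of what that initial rsync pass could look like when scripted. The paths and the host name are placeholders, and the helper names are made up for illustration. The flags used (`--archive`, `--partial`, `--info=progress2`, `--dry-run`) are standard rsync options, though `--info=progress2` needs rsync 3.1 or newer.

```python
import shlex
import subprocess


def build_rsync_command(source_dir: str, destination: str, dry_run: bool = False) -> list[str]:
    """Build the rsync argument list for a one-off bulk copy."""
    # A trailing slash on the source copies the directory's contents
    # rather than creating a nested directory on the destination.
    source = source_dir.rstrip("/") + "/"
    command = ["rsync", "--archive", "--partial", "--info=progress2"]
    if dry_run:
        command.append("--dry-run")  # only list what would be transferred
    return command + [source, destination]


def run_initial_transfer(source_dir: str, destination: str, dry_run: bool = True) -> None:
    """Run rsync and raise if it exits with an error (requires rsync to be installed)."""
    subprocess.run(build_rsync_command(source_dir, destination, dry_run), check=True)


# Placeholder paths and host; replace them with your own before running anything.
example = build_rsync_command("/photos/library", "user@laptop.example:/photos/library/")
print(shlex.join(example))
```

Calling `run_initial_transfer(...)` with `dry_run=True` first shows what would be copied without touching the destination.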

Large numbers of small files put stress on the CPU and the I/O of the PC. Your point about RTT is kind of irrelevant (the latency itself is more critical than the RTT, due to the bandwidth-delay product). All packets are going to be sent at maximum packet size (i.e., 1500-byte MTU). Syncing a large number of small files over wireless is already a poor setup because of the added latency, shared medium, packet loss and retries associated with Wi-Fi, which will only exacerbate the problem. But you are right that they do need to be sending files in parallel in order to make full use of the connection. For example, run iperf3 using one TCP session and then run it with 4+ parallel sessions, and you'll see the latter result in higher throughput, almost maxing out the connection where the single TCP session does not.
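The iperf3 comparison can be scripted roughly like this. It is a sketch that assumes `iperf3 -s` is already running on the other machine; the helper names are made up, and the server address is something you supply. It relies on iperf3's `--parallel` and `--json` options and reads the receiver-side total from the JSON report of a TCP test.

```python
import json
import subprocess


def iperf3_command(server: str, streams: int, seconds: int = 10) -> list[str]:
    """Argument list for a TCP test with the given number of parallel streams."""
    return ["iperf3", "--client", server, "--parallel", str(streams),
            "--time", str(seconds), "--json"]


def received_mbps(report: dict) -> float:
    """Extract the receiver-side throughput in Mbit/s from an iperf3 JSON report."""
    return report["end"]["sum_received"]["bits_per_second"] / 1e6


def measure(server: str, streams: int) -> float:
    """Run one iperf3 test and return the measured throughput in Mbit/s."""
    result = subprocess.run(iperf3_command(server, streams),
                            capture_output=True, text=True, check=True)
    return received_mbps(json.loads(result.stdout))


def compare(server: str, stream_counts=(1, 4)) -> dict[int, float]:
    """Measure throughput for each stream count, e.g. 1 session versus 4."""
    return {count: measure(server, count) for count in stream_counts}
```

Running `compare("<server address>")` returns a mapping from stream count to throughput, so the single-session and 4-session results can be put side by side.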

The Resilio team seems young and likely could be doing more multithreading to improve performance. I've got tons of horsepower between Threadrippers, NVMe storage and 10GbE networks, and Resilio is clearly more for convenience than performance. There are some settings in the advanced power user settings that may help performance (see the *_workers_* related settings), but I have yet to actually notice any improvement the few times I tried them. Ideally, those settings would adjust dynamically according to CPU cores, available RAM, etc. (seeing how this runs on everything from lowly single-core 512MB RAM NAS systems to 32-core 64+GB RAM systems), but again, it takes lots of QA and testing to know that kind of stuff.

Maybe if Resilio gets some investment and development continues, there might be some improvement on the encryption side by taking some inspiration from WireGuard or something. I don't know; I'm just guessing that encryption is one of the bottlenecks that currently limit performance.

tl;dr: use rsync to do the initial transfer of large numbers of small files.


You can try restarting Resilio periodically. I just ran into a bug where a transfer starts out at 45MB/s and then, after about a minute, drops down to 5-10MB/s. Restart the receiving Resilio service and you get 45MB/s again, then it drops back down to 9MB/s... and the pattern repeats.

The sawtooth throughput bug has been reported many, many times before. I might have to start regression testing to find the last version of Resilio that wasn't this buggy.

The difference between 45MB/s and 5-10MB/s was like 23 minutes vs 2-8 hours. And these are mostly large files; I can only see this being much worse with small files.
