Linux Processes Stop Running, Then Overwrite New With Old When Restarted


JimmyTheSaint


btsync stopped running on both of my Linux boxes within 20 minutes of starting them and syncing. This has happened a couple of times. Before I manually restart them and restore overwritten files, what diagnostic info do you want, exactly? There's nothing relevant to the btsync process in /var/log/messages. And bin/.sync/sync.log doesn't show anything unseemly. It just ends after a bunch of "incoming connection" entries, which is 6 minutes after the last file's piece was completed.


Jimmy,

We need the BTSync debug logs (sync.log and sync.log.old) from two peers, preferably with debug mode enabled before the issue is reproduced. Also, btsync closing on its own indicates that it is crashing, so we need core dumps.

Core dumps must also be enabled before reproducing the issue, by running "ulimit -c unlimited" in a terminal. btsync must then be started in the same terminal window.
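The core dump setup described above can be sketched as a small shell helper. This is only an illustration: the function name run_with_core_dumps and the ./btsync path are assumptions, not something from the thread. Where the kernel actually writes the core file depends on the system's core_pattern setting, so it may not land in the directory btsync was started from.

```shell
# Sketch only: start a command with core dumps enabled in this shell session.
# run_with_core_dumps is a hypothetical helper name.
run_with_core_dumps() {
    # The limit applies to this shell and to every process it starts,
    # which is why btsync has to be launched from the same terminal.
    ulimit -c unlimited || echo "warning: could not raise the core size limit" >&2
    # Show the limit that child processes will inherit.
    echo "core size limit: $(ulimit -c)" >&2
    # Run the requested command with its arguments.
    "$@"
}

# Hypothetical usage, assuming the binary is in the current directory:
# run_with_core_dumps ./btsync

# On Linux, this shows where the kernel writes core files:
# cat /proc/sys/kernel/core_pattern
```

If the reported limit is not "unlimited", the hard limit for the account is probably lower and has to be raised first.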


OK, I've started btsync on both Linux boxes with all debug information enabled. When one dies, I'll post here. I guess the best way to provide those large files will be via a sync folder.

I should mention that when I restarted the Linux clients, they began overwriting new files with old ones, as expected. What's more, if I copy the newer versions on top of the older versions, one of the Linux clients now just keeps overwriting them again. When I simply delete the clobbered files (planning to add them back later from saved copies), that same Linux box just keeps adding them back.

About 15 minutes later: One of my Windows clients just crashed. In all the time I've been using btsync, I've never seen it crash on a Windows client. I selected the option to send crash info to the developers. After restarting, it syncs without overwriting any new files with old ones, but I also haven't been changing any synced files because of this problem.

About 30 minutes later: btsync stopped running on one Linux client, but there is no core file in the bin directory. I PM'ed you the log location info.


Eleven hours later, and the fix is still working. But I just want to add that the same files the Linux devices kept adding back are now being added back by my Note 2. It's not overwriting anything because the originals have moved. So when the Note 2 adds them back, I delete them. On each round it adds back fewer files, and now it's down to one. This is the same behavior I saw on my Linux boxes before the fix. Meanwhile, my S4 hasn't shown any problems.


Hello,

I have a similar problem with the 1.3.80 release. It disconnects after about 10 minutes, with the following message in the logs:

[20140404 15:03:38.474] Failed file save: /srv/btsync/sync.dat.new

or:

[20140404 15:12:37.127] SyncFilesController: Got 22 files from remote (192.168.5.20:54108)
[20140404 15:12:37.127] SyncFilesController: mutex file check failed
[20140404 15:12:37.127] State sync finished for folder /srv/backups/test2
[20140404 15:12:47.229] Incoming connection from 192.168.5.20:54108
[20140404 15:12:47.315] Got id message from peer TiFred (Xxxxxxxxxxxxxx) 1.3.80
[20140404 15:12:47.320] Got state sync request from peer Xxxxxxxxxxxxxx
[20140404 15:12:47.329] Merge: processing get_root message, my hash: Xxxxxxxxxxxxxx
[20140404 15:12:47.390] Merge: processing get_nodes message for /......
[20140404 15:12:48.036] Merge: processing get_files message with 22 paths
[20140404 15:12:48.037] Merge: will send files for /partage copie 2......
[20140404 15:12:48.201] SyncFilesController: Got 22 files from remote (192.168.5.20:54108)
[20140404 15:12:48.201] SyncFilesController: mutex file check failed
[20140404 15:12:48.202] State sync finished for folder /srv/backups/test2

Thx


[20140404 15:17:32.871] Changing IP address from 192.168.8.250 to 0.0.0.0
[20140404 15:21:23.056] TorrentFile: unloading torrent by timeout
[20140404 15:25:41.187] Failed file save: /c/btsync/sync.dat.new
[20140404 15:35:42.243] Failed file save: /c/btsync/sync.dat.new

and CPU usage goes to 100%.


thelinuxfr,

Yes. First of all, try starting btsync with the --nodaemon switch and see what it prints to the console. Also, make sure that ulimit -c unlimited is set in your terminal, and make a core dump of the process when it starts consuming 100% CPU time and stops responding.
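One way to take a core dump from a process that is hung rather than crashed is gcore, which ships with gdb. The sketch below is an assumption about tooling, not something the support reply prescribes. The helper name dump_core_of and the ./btsync path are hypothetical, and it assumes gdb and pgrep are installed.

```shell
# Sketch only: dump core from a running process without waiting for a crash.
# dump_core_of is a hypothetical helper name; it needs pgrep and gcore (gdb).
dump_core_of() {
    name=$1
    # -x matches the process name exactly; -o picks the oldest matching process.
    pid=$(pgrep -o -x "$name") || {
        echo "no running process named $name" >&2
        return 1
    }
    # gcore writes the dump to core.<name>.<pid> in the current directory.
    gcore -o "core.$name" "$pid"
}

# Run btsync in the foreground first to see its console output
# (the flag is from the reply above; ./btsync is an assumed path):
# ./btsync --nodaemon
# Then, from a second terminal, once CPU usage is stuck at 100%:
# dump_core_of btsync
```

Dumping another user's process usually needs the same user account or root.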


thelinuxfr,

Thanks, I've got the debug logs. I'll let you know the results of the analysis.


thelinuxfr,

There are some strange records in your log.

- "mutex file check failed" indicates that the SyncID file was not found in the directory "/srv/backups/test2", or that BTSync could not access it. Please make sure that you did not delete this file and that the user running btsync has permission to read and write to the folder.

- "Changing IP address from 192.168.8.250 to 0.0.0.0" indicates that you no longer have a network connection. Obviously, btsync won't be able to sync when the network is down.
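The folder check from the first point can be sketched as a shell function. The file name .SyncID is an assumption (the reply only says "SyncID file"), and check_sync_folder is a hypothetical helper name. The check is only meaningful when run as the same user that runs btsync.

```shell
# Sketch only: verify that a sync folder is accessible and has its ID file.
# check_sync_folder is a hypothetical helper; .SyncID is an assumed file name.
check_sync_folder() {
    dir=$1
    id_file="$dir/.SyncID"
    # The folder itself must exist.
    [ -d "$dir" ] || { echo "missing folder: $dir"; return 1; }
    # The current user needs read, write and search permission on it.
    [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ] || {
        echo "no read/write access: $dir"
        return 1
    }
    # The ID file must still be present in the folder root.
    [ -f "$id_file" ] || { echo "missing ID file: $id_file"; return 1; }
    echo "ok: $dir"
}

# Hypothetical usage with the folder from the log, run as the btsync user:
# check_sync_folder /srv/backups/test2
```

For the second point, a command such as "ip -4 addr show" lists the IPv4 addresses the machine currently holds.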

