Hey all,

Updating to the latest sync has removed my pro license and all of my shares.

I'm running the latest rslsync (2.4.1 (672)), which I updated from 2.4.0 (666) today.
I downloaded 2.4.1 from: https://download-cdn.resilio.com/stable/linux-x64/resilio-sync_x64.tar.gz
I am running Linux (Ubuntu 14.04.5 LTS (GNU/Linux 3.16.0-70-generic x86_64))

I paused all of my shares, waited for all syncing to stop, did a pkill on the process, and then updated.

After updating, I relaunched via the command line with my previous config (see below; xxx represents removed config data):

{
  "device_name": "xxxxxxx",
  "listening_port" : 0,                       // 0 - randomize port

  "storage_path" : "/home/xxxxxx/.sync",


  "check_for_updates" : true,
  "use_upnp" : true,                              // use UPnP for port mapping

  "download_limit" : 0,
  "upload_limit" : 0,
  "folder_rescan_interval" : 86400,

  "webui" :
  {
    "listen" : "0.0.0.0:8888",
    "login" : "xxxxxx",
    "password" : "xxxxxx"
  }

}
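
For anyone who wants to rule out the config file itself, here is a minimal Python sketch that parses a config in this style and reports the storage path it points at. It assumes the file is plain JSON plus `//` line comments, as in the excerpt above; the function names and the sample text are illustrative only and are not part of Sync.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that sit outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            newline = text.find("\n", i)
            if newline == -1:
                break
            i = newline
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def load_sync_config(text: str) -> dict:
    """Parse a config written as JSON with // line comments."""
    return json.loads(strip_line_comments(text))


# Shortened sample in the same style as the config quoted in the post.
SAMPLE = """
{
  "device_name": "xxxxxxx",
  "listening_port" : 0,            // 0 - randomize port
  "storage_path" : "/home/xxxxxx/.sync",
  "webui" : { "listen" : "0.0.0.0:8888" }
}
"""

if __name__ == "__main__":
    config = load_sync_config(SAMPLE)
    print("storage_path:", config["storage_path"])
```

If this parses and prints the expected path, a malformed config is unlikely to be the cause.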

When I re-opened the web interface, it invited me to try pro. I have re-added my license, and this is fine now.
However, what I cannot do is see any of my syncs. Obviously the data is there, but the web interface just says "You don't have any folders. Add one and start syncing".

Before the upgrade I had 9 different folders.

Is this a known issue with switching from 2.4.0 to 2.4.1? Is there anything I can do to recover other than to re-add all of my syncs and let them re-index? I can ssh/port forward to get to the web interface on my remote boxes and grab the keys from there, but it would be nice not to have to re-index a couple dozen TB of data.

I have tried reverting back to 2.4.0 but the problem still persists. I tried on a new server (Which I had just finished syncing all my data to) and had the same problem.

Sorry if this has already been posted. I had a cursory glance at the troubleshooting post titles and couldn't see a similar topic.

 

There is no issue updating from 2.4.0 -> 2.4.1. You just download the new binary and launch it the way you used to, either with a config or without.

In your case, your config was not being used, or this is a different config.
When you were using 2.4.0 (666), did you also download it from the site and launch it, or did you install it through a package manager (apt-get)?

Hi Helen, 

Thanks for your response.

31 minutes ago, Helen said:

There is no issue updating from 2.4.0 -> 2.4.1. You just download the new binary and launch it the way you used to, either with a config or without.

This is exactly what I did, and I had an issue.

When I was using 2.4.0, I had manually downloaded it. The same is true for 2.4.1. I simply replaced the 2.4.0 rslsync binary with the one from the 2.4.1 download.

The config has not changed between versions. I always launch with the same launch script, which looks like this:

$HOME/Programs/rslsync/rslsync --config $HOME/Programs/rslsync/rslsync.conf

I know the config is still being picked up because I am still being prompted for my username and password for the web interface.

There has been no change to my config, no change to how I launch the app, and no change to how I upgrade the binary. I have tried this on 3 machines now, and each time it removes my license, removes my shares, and prompts me to "create an identity".

If you're sure that this very config is being used (you can double-check by looking at the running process in the console), then the storage may not be readable/writable for the new binary. Check permissions first of all.
And since you use an environment variable in your script, check that it points to the same user's home directory.
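
These checks can be sketched in Python roughly as follows. This is a Linux-only sketch that reads /proc directly. The helper names are made up for illustration, and `--config` is the only flag it looks for because it is the one used in the launch script in this thread.

```python
import os
from pathlib import Path


def find_processes(name, proc_root="/proc"):
    """Return (pid, argv) for every process whose executable is called `name`."""
    matches = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # not a process directory
        try:
            raw = (entry / "cmdline").read_bytes()
        except OSError:
            continue  # process exited, or we may not read it
        # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
        argv = [a.decode(errors="replace") for a in raw.rstrip(b"\0").split(b"\0")]
        if argv and os.path.basename(argv[0]) == name:
            matches.append((int(entry.name), argv))
    return matches


def config_path(argv):
    """Return the argument that follows --config, or None if it is absent."""
    for i, arg in enumerate(argv[:-1]):
        if arg == "--config":
            return argv[i + 1]
    return None


def storage_access(path):
    """Report whether the current user can read and write the storage directory."""
    return {
        "exists": os.path.isdir(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    print("HOME =", os.environ.get("HOME"))
    for pid, argv in find_processes("rslsync"):
        print(pid, "config:", config_path(argv))
    # ~/.sync is the storage path used in this thread; adjust as needed.
    print(storage_access(os.path.expanduser("~/.sync")))
```

Run it as the same user that launches Sync, otherwise the permission results describe a different account.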

Hey Helen,

Thanks again for your response :)

The first thing I did was check the permissions of the new binary because literally the only thing that changed was the binary. I was using the same user as always, in the same folder, with the same permissions. The permissions on the 2.4.0 and 2.4.1 binaries are both still -rwxr-xr-x (when I update, I keep a copy of the old binary just in case) as they always have been.

The RAID array that my data is synced to has 1.1TB of free storage, with no change to permissions. The env variable also hasn't changed - it simply points to my user's home directory, which belongs to the only non-root user on the machine. The config permissions have not changed. My home directory partitions across all my servers have between 75 and 130GB of free space. My ~/.sync dir is still there with a load of data, and my "~/Resilio Sync" dir is there and still empty.

Literally the ONLY single difference is the binary that is being launched. And as I've said, I have now tried this across three machines (one of them was Ubuntu 16.04.1 LTS, I think) and have had the exact same issue on each. It would seem to me that there is some fault when starting up for the first time where it cannot find any of the old settings or sync config.

I am now in the process of recovering and re-indexing all of my data. I'm not on a witch hunt or anything. I just want to make sure that this isn't a problem that your other users are going to run into.

Before replying to you, I checked this on my Ubuntu (well, 16, not 14) and it worked correctly. Actually, you can point Sync to any storage and it will use the settings from there (provided they are valid, of course) and open the folders that are kept in that storage.

So, if you haven't gone far in your recovery, can you please do a few things:
1) First of all, send us the debug logs from your storage: sync.log and all its zip archives from /home/xxxxxx/.sync.
2) After that, try to launch Sync manually, pointing it to your previous storage: rslsync --storage /home/xxxxxx/.sync
 

Hey Helen,

I have almost finished re-indexing everything, so I do not want to retry using the --storage argument. I'm almost 100% sure that the sync is still using the /home/xxxxxx/.sync directory, because it is still logging to there.

I have found the logs which I've cut to only include Oct 20 22:12 (5 minutes before I restarted and replaced 2.4.0 with 2.4.1, when everything was working fine) to 09:36 the next day, which is the point where I started to re-add my shares after a second instance where it removed my license.

[20161021 09:36:22.860] [OnNotifyFileChange] "/home/xxxxx/.sync/.SyncUser1477038982/devices/CAIURHPGFDXOB7TH7XSMCVLDGIEHBB77/info.dat"
[20161021 09:36:22.860] MD[1B13]: Update default mode
[20161021 09:36:22.860] SyncFolderNotify: SyncFolderNotify: "info.dat", event = "IN_MODIFY"
[20161021 09:36:22.860] MD[1B13]: Loading licenses
[20161021 09:36:22.860] MD[1B13]: [load licenses] Aborted: no identity
[20161021 09:36:22.860] [OnNotifyFileChange] "/home/xxxxx/.sync/.SyncUser1477038982/devices/CAIURHPGFDXOB7TH7XSMCVLDGIEHBB77/info.dat"
[20161021 09:36:22.860] MD[1B13]: Failed to load licenses
[20161021 09:36:22.860] SyncFolderNotify: SyncFolderNotify: "info.dat", event = "IN_CLOSE_WRITE"
[20161021 09:36:22.860] MD[1B13]: Loading folder
[20161021 09:36:22.860] MD[1B13]: [load folders] Aborted: no identity
[20161021 09:36:22.860] MD[1B13]: Failed to load folders
[20161021 09:36:22.860] MD[1B13]: Loading devices
[20161021 09:36:22.860] [OnNotifyFileChange] "/home/xxxxx/.sync/.SyncUser1477038982/devices/CAIURHPGFDXOB7TH7XSMCVLDGIEHBB77/info.dat"
[20161021 09:36:22.860] MD[1B13]: [load devices] no identity
[20161021 09:36:22.860] MD[1B13]: Apply folders changes
[20161021 09:36:22.860] MD[1B13]: [apply folders changes] Aborted: no identity
[20161021 09:36:22.860] SyncFolderNotify: SyncFolderNotify: "folders", event = "IN_CREATE IN_ISDIR"
[20161021 09:36:22.860] MD[1B13]: Failed to apply folder changes
[20161021 09:36:22.860] MD[1B13]: Loading identity

In the logs I can see it transitioning between versions, and a second failure to find the license key.
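
To pull the relevant lines out of a large sync.log, a small filter like the following may help. It is only a sketch: the two marker strings are copied from log excerpts in this thread, the timestamp pattern is inferred from those same excerpts, and the function name is hypothetical.

```python
import re

# Messages seen in this thread when Sync starts without its settings.
FAILURE_MARKERS = ("no identity", "Unsupported or empty sync.dat file")

# Matches "[20170213 21:59:00.102] message" and "[21:59:00.083] message".
LINE_RE = re.compile(r"^\[(?:(\d{8}) )?(\d{2}:\d{2}:\d{2}\.\d{3})\] (.*)$")


def find_failures(lines):
    """Return (date, time, message) for each log line containing a marker."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # not a timestamped log line
        date, time_of_day, message = match.groups()
        if any(marker in message for marker in FAILURE_MARKERS):
            hits.append((date, time_of_day, message))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_failures.py /path/to/sync.log
    with open(sys.argv[1], errors="replace") as log:
        for date, time_of_day, message in find_failures(log):
            print(date or "--------", time_of_day, message)
```

The first hit after a restart gives the time window to compare against the shutdown lines just before it.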

Where should I send the complete logs?

 

So I have around 30 to 35 thousand files under management across 20 synced folders for different data categories. All of these files are binary (video content from GoPro and other raw HD/4K sources), with file sizes averaging between 500MB and 5GB. My working assumption is that each time I sync more than 5 or 6 TB of data from one server to another, the sync.dat file corrupts on BOTH servers when shutting down the sync. The logs show that the shutdown completed successfully, but on restarting I see the message "Unsupported or empty sync.dat file" in the logs (see the attached logs, which also reference your Jenkins build; is this a clue?) and I'm prompted in the web UI to "Create identity," with none of my synced folders or pro licence available.

The config is the same as in the linked post, and has never changed. I can pkill and restart without issue dozens of times if I've not synced TBs of data. The linked post contains the same issue, but upgrading from one version to another was a red herring. It happened after I consolidated two servers down to one (approx 6/7TB from each), shut down the sync, and upgraded. It was the shutting down which resulted in the corrupt sync.dat file and resulted in the loss of my license and list of synced folders.

For non-Resilio staff reading this: this does not result in data loss. Don't worry. The folders can be re-added and re-indexed without issue; the process just takes a couple of days for me with the volume of data under management.

I can reliably recreate this across versions 2.4.1 and above (currently on 2.4.4).

I suspect it may also happen if there is a large file count. A photographer friend of mine has 100,000+ files under management and regularly had the issue. After bundling his older pictures into tar archives, the issue stopped occurring. He sees the same sync.dat log entry.

I hope this gives you enough information to recreate, and ultimately fix, the issue. If there is any more information I can supply to help, please let me know.

In the meantime, if I am copying large volumes of data, restarting the sync every couple of hours appears to guard against this (but I cannot say for sure).
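
One way to soften the impact while this is investigated might be to keep a few copies of sync.dat, taken while Sync is stopped and the file is known to be non-empty. The Python sketch below is untested against Sync itself and makes two assumptions: that sync.dat lives directly inside the storage directory, and that Sync would accept a restored copy. Neither is confirmed anywhere in this thread. The function name and the backup location are hypothetical.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def snapshot_sync_dat(storage_dir, backup_dir, keep: int = 5) -> Optional[Path]:
    """Copy a non-empty sync.dat into backup_dir and keep the newest `keep` copies.

    Returns the path of the new copy, or None if sync.dat is missing or empty.
    """
    source = Path(storage_dir) / "sync.dat"  # assumed location, see note above
    if not source.is_file() or source.stat().st_size == 0:
        return None  # nothing worth keeping; do not overwrite good copies

    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    destination = backup_dir / f"sync.dat.{time.time_ns()}"
    shutil.copy2(source, destination)

    # Prune the oldest copies, ordered by the numeric suffix in the file name.
    snapshots = sorted(
        (p for p in backup_dir.glob("sync.dat.*") if p.suffix[1:].isdigit()),
        key=lambda p: int(p.suffix[1:]),
    )
    for old in snapshots[: max(len(snapshots) - keep, 0)]:
        old.unlink()
    return destination


if __name__ == "__main__":
    storage = Path.home() / ".sync"             # storage path used in this thread
    backups = Path.home() / "sync-dat-backups"  # hypothetical backup location
    print(snapshot_sync_dat(storage, backups))
```

A `None` result right after a shutdown would itself be a useful data point, since it means the file was already empty or missing before the restart.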

If you don't think this is a bug (it absolutely is) please let me know what I need to do to convince you. I've hit this several times now, and finally realised the connection between data transfer and this issue. I can record a video of this happening the next time I add large amounts of data to the synced folders, and send you full logs of the process.

[20170213 21:58:41.281] SyncFolderScanner: shut down
[20170213 21:58:41.282] assert failed /home/jenkins/slave-root/workspace/Build-Sync-Manually/SyncWebInterface.cpp:568
[20170213 21:58:41.283] unregister diskio completion queue with id 2
[20170213 21:58:41.283] message thread stop
[20170213 21:58:41.285] unregister diskio completion queue with id 1
[20170213 21:58:41.285] Torrent session shutdown: done waiting
[20170213 21:58:41.315] Closing uTP tracker connection to 173.244.217.42:4000. Error 103
[20170213 21:58:41.319] Closing uTP tracker connection to 209.95.56.60:4000. Error 103
[20170213 21:58:42.256] Network stop status: 1
[20170213 21:58:42.285] diskio thread stop, drive_id = 18446744073709551614
[20170213 21:58:42.285] diskio destroy, drive_id = 18446744073709551614
[20170213 21:58:42.285] diskio_controller destroyed
[20170213 21:58:42.310] Shutdown. Saving config sync.dat
[21:59:00.083] Debug log mask has been set to FFFFFFFF
[21:59:00.083] Features mask has been set to 0
[20170213 21:59:00.086] ZIP: Can't locate [version] in zip, error -100.
[20170213 21:59:00.086] I! Configuration from file "/home/user/Programs/rslsync/rslsync.conf" has been applied
[20170213 21:59:00.087] test sha1: AE5BD8EFEA5322C4D9986D06680A781392F9A642
[20170213 21:59:00.087] test sha2: 630DCD2966C4336691125448BBB25B4FF412A49C732DB2C8ABC1B8581BD710DD
[20170213 21:59:00.088] test aes: 0A940BB5416EF045F1C39458C653EA5A07FEEF74E1D5036E900EEE118E949293
[20170213 21:59:00.088] diskio_controller created
[20170213 21:59:00.089] register diskio completion queue with id 1
[20170213 21:59:00.089] PLC[0x00000000031c6210] binding on 0.0.0.0:38699
[20170213 21:59:00.089] D! Socket::listen[0x00000000031c6210][7] bound listening socket 7 to IP 0.0.0.0:38699
[20170213 21:59:00.089] UDP: bound listening socket 8 to IP 0.0.0.0:38699
[20170213 21:59:00.089] D! Socket::listen[0x00000000031c7370][9] bound listening socket 9 to IP 0.0.0.0:8888
[20170213 21:59:00.091] register diskio completion queue with id 2
[20170213 21:59:00.102] Unsupported or empty sync.dat file
[20170213 21:59:00.102] My PeerID: 10A4D8DB1827C4E0E0A4D929A7C7EEA43DF74972
[20170213 21:59:00.102] LC: LoadLicenses: there is no pro license
[20170213 21:59:00.104] loaded history: 1000 events
[20170213 21:59:00.105] setup socket 15 for local peer discovery for 127.0.0.1: success
[20170213 21:59:00.105] setup socket 18 for local peer discovery for 192.168.1.153: success
[20170213 21:59:00.105] Debug log mask has been set to FFFFFFFF
[20170213 21:59:00.105] Features mask has been set to 0
[20170213 21:59:00.106] Scheduler: Apply global rule, download limit: -1, upload limit: -1
[20170213 21:59:00.106] CoreState: Total memory: 2056216576, used by Sync: 6443008, percentage used: 0.31
[20170213 21:59:00.106] message thread start
[20170213 21:59:03.372] ZIP: Can't locate [css/style.css] in zip, error -100.
[20170213 21:59:03.493] ****** WebUI Added as listener: 0x00007f10a8003290 (count=1)
[20170213 21:59:12.585] NAT-PMP: Unable to map port with NAT-PMP.
[20170213 21:59:13.586] Received shutdown request via signal 15
[20170213 21:59:13.588] saved history: 1000 events

This is related to this post: 

 

Hey Helen, thank you for your response.

The storage path is set in the config like this:

"storage_path" : "/home/user/.sync",

I understand this is also the default location. The full config can be found in the linked post.

I kill the process with "pkill rslsync", which by default sends SIGTERM (signal 15, matching the shutdown request in the log above), and it shuts down gracefully. I wait for the web UI to time out, and for the hard drive activity indicator light on my array to stop blinking. After those two events I usually give it 10 to 30 seconds before starting the sync again, just in case there is any residual disk activity. In the logs above there are about 18 seconds between the final shutdown log entry and the restart (which was probably about 12 seconds after all disk activity had stopped).
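
The "signal, then wait" part of this routine can be made explicit instead of relying on a fixed pause. The Python sketch below sends SIGTERM and polls until the process is gone. It is an illustration only, with hypothetical function names, and it would not by itself prevent the problem reported here, since the log shows the final shutdown line was reached before the restart.

```python
import os
import signal
import time


def is_running(pid: int) -> bool:
    """Return True if a process with this pid still exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def stop_and_wait(pid: int, timeout: float = 60.0, poll: float = 0.5) -> bool:
    """Send SIGTERM and wait until the process exits.

    Returns True once the process is gone, False if it is still
    present when the timeout expires.
    """
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return True  # already gone
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not is_running(pid):
            return True
        time.sleep(poll)
    return not is_running(pid)


if __name__ == "__main__":
    import sys

    # Usage: python stop_and_wait.py <pid of rslsync>
    stopped = stop_and_wait(int(sys.argv[1]))
    print("stopped" if stopped else "still running after timeout")
```

Note that a child process that has exited but has not yet been reaped by its parent still counts as running in this check.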

There seems to be no difference in the shutdown sequence or timings from a clean shutdown and one that invalidates or corrupts the sync.dat file.

I start the process exactly the same as I always have done, as shown in the linked post. I have a bash script which executes the following:

$HOME/Programs/rslsync/rslsync --config $HOME/Programs/rslsync/rslsync.conf

$HOME/Programs/rslsync/rslsync is the full path to the binary, and $HOME/Programs/rslsync/rslsync.conf is the full path to the config. There is no problem with the launch script or applying the config, as each time the config is applied with no issue reported.

[20170213 21:59:00.086] I! Configuration from file "/home/user/Programs/rslsync/rslsync.conf" has been applied

Is there any more information I can give?

Edited by ThePoet
copy paste error :)

Your script has --config $HOME/Programs/rslsync/rslsync.conf, but Sync actually starts with the ~/Programs/btsync/btsync.conf config. Am I missing something obvious about how these two configs can be the same?

List the running processes and see exactly which process is running and exactly which config is being used.

I may have overlooked something in the linked forum post, and I couldn't find the support ticket. Did you send the logs? What's the ticket ID?

Ahh yes, sorry. I was copying from a different machine. The config in the script is correct. One of my syncs (which was originally on btsync) has a path of ~/Programs/btsync/btsync.conf, while my new box uses ~/Programs/rslsync/rslsync.conf. I have edited my post to avoid confusion - a simple copy/paste error :)

It's important not to get hung up on the config because I have been running with this for ages and it is most definitely not the problem.

I didn't submit a support ticket last time, I just sucked it up and re-indexed, as with the other times it has happened. It is not until now that I have noticed the pattern of it happening after transferring large volumes of data.

It shall not be related to transferring a lot of files. It's related to saving and using the same settings, i.e. using the same storage.
And now I'm starting to think that you copied part of the log from another machine while trying to troubleshoot Sync startup on this Ubuntu box? Please do contact support and submit the debug logs from the machine where this problem happens, together with the right config file. Thank you.

No no no, you are not listening. The above was just a copy and paste error from the wrong box while, ironically, showing you the config was not the problem. I am sorry that I copied my example from my 3rd box and confused you. The configs are 100% correct. If they were not, then the server would never work, and they have been working for years (apart from when this bug happens).

When you say "It shall not be related to transferring a lot of files" can you please tell me what you have done to verify that this is the case? Have you actually run tests with several TB of data? Because transferring a lot of data is what reliably causes it for me. Please don't jump onto the config thing and assume that this is user error just because I pasted the config from the wrong box into this thread :) I 100% use the same config every time because it is launched using the same script.

Box 1: The path to config and the binary is /home/user/Programs/btsync/btsync and btsync.conf

Box 2: The path to config and the binary is /home/user/Programs/btsync/btsync and btsync.conf

Box 3: The path to config and the binary is /home/user/Programs/rslsync/rslsync and rslsync.conf

I just happened to paste the wrong config in my notes above. Each machine has only ONE config. It only writes ONE sync.dat file. They are all writing to the same location. There are no sync.dat files, or any other sync files, outside of the locations where they are supposed to be.

Again, to recreate this issue:

Step 1: Build a new server
Step 2: Install and start Resilio
Step 3: Use the Link Device functionality to apply the license and shares. Select to add shares as Disconnected
Step 4: Connect to some shares which will transfer a few TB of data (in my case, Box 1 transferred 9TB to Box 3, and Box 2 transferred 11TB to Box 3 - it took about 5 days)
Step 5: Once transfer is complete, kill the syncs using pkill and wait for hard drive activity to cease.
Step 6: Restart the syncs the EXACT SAME WAY with the EXACT SAME CONFIG so that everything is EXACTLY THE SAME
Step 7: Observe the "Create Identity" dialogue on the web UI, and warning about unsupported sync.dat in the logs.

It absolutely happens after large data transfers, and I am trying to help you guys improve your product. It is hardly coincidental that it happened to all three of my boxes after the large data transfer at the same time.

"contact support and submit the debug logs from the machine where this problem happens" it has happened to ALL of my machines because my two smaller machines synced data to my new larger machine. Every single one of them failed with the exact same sync.dat error after I killed the sync (pkill, like always) and then restarted it (after waiting for the UI to stop responding, the log to print out the final shutdown statement, and for the hard drive activity to stop, like always).

Where can I submit an official bug report?

I too woke up to this issue. I also have 36TB of data I was syncing to a new zpool. Neither machine restarted, and neither machine did anything different except be ON all night long syncing. This morning I was presented with a Create Identity screen on my Windows machine. The Ubuntu machine seems fine. The person above isn't crazy. You DO have a serious problem with your product. This is actually the second time this has happened, and re-indexing all this stuff is not only time-consuming but hard on my drives and my zpool. Please do some research to find out why this happens, because people WILL start moving to something else if this continues. I spent DAYS with sync on pause just so it could index all this, and now I have to re-do it. I don't have to RESYNC most of it though, as the data still exists in the zpool. But the indexing is the most annoying thing in the world. Please PLEASE fix this.

46 minutes ago, Rikumo1978 said:

I too woke up to this issue. I also have 36TB of data I was syncing to a new zpool. Neither machine restarted, neither machine did anything different except be ON all night long syncing. This morning I was presented with a Create Identity screen on my Windows machine. Ubuntu machine seems fine.

I only seem to get this issue when I kill and re-start the sync after syncing huge amounts of data. I'd be interested to know if it happened to your Windows box upon restarting. Also, have you restarted the sync on your Ubuntu machine since syncing large amounts of data?

6 hours ago, ThePoet said:

I only seem to get this issue when I kill and re-start the sync after syncing huge amounts of data. I'd be interested to know if it happened to your Windows box upon restarting. Also, have you restarted the sync on your Ubuntu machine since syncing large amounts of data?

I did restart sync on my Windows machine to begin with. That's when I noticed the issue. But I am afraid of restarting it on my Ubuntu machine lol. I JUST successfully converted over to using a headless Linux box for my Plex needs and I have synced over 20TB so far to my zpool out of 36TB. Like I said, thankfully all the data is still there and in my particular situation it isn't too terribly annoying, but the INDEXING UGGGGGHHH lol. With what most people have, which I feel is 4 to 8 TB of data, it isn't too bad, but with huge collections like this it takes an entire day for some folders to re-index. So, since the bug happened I have re-indexed a couple of the smaller folders and I am moving on to the larger ones now. Anything I can do to help I will gladly do.

 

So to clarify, both Ubuntu and Windows 10 ran Resilio for 3 solid days with no restarts, and successfully transferred 20TB of data. Then I restarted it on Windows and bam, create Identity..... Ubuntu GUI still shows all folders, just with 0/1 peer online. I erased my old Anime share and recreated it and am re-indexing as we speak. Haven't done anything else.

On 2/19/2017 at 10:46 PM, Rikumo1978 said:

Then I restarted it on Windows and bam, create Identity.....

I suppose you didn't quit Sync before that?
We've been investigating this issue, and on Windows it happens if Sync is forced to quit and does not save its settings. This will be addressed.

@ThePoet,
Have you already submitted logs to support? I still cannot see your ticket. What's the ticket ID?
