simondarren

Newer Files Overwritten By Older Files - Btsync 1.3.94

I have had this happen to me several times in the last week or two.  I constantly sync between two computers for programming.  I have had to manually recover from .SyncArchive on several occasions.  I've started the debug log on both systems and will upload it when it happens again.


I just got hit again on this. Again, Excel files. The !SYNC file remains on the end that saved it. The actual Excel file is GONE from that side. The B side has a badly damaged version of the Excel file. I can open it, but Excel spits out all kinds of errors, about white space expected and other errors.

 

I just checked both ends, both on the most recent version of BTS. 

 

The only thing that is unique is that this client has decided to put "." in the file names, so these Excel files are named like 8.1.14.xls and such. For the most part we're OK, but just randomly now, it's imploded and the file gets trashed. On that end, it completely disappears.

 

I feel like this has something to do with EXCEL files and perhaps the "." in the file name. 

 

The latter, because they open lots of other Excel files every day, yet both of the files this happened to today have a "." in the file name.


@simondarren

 

I hope you've got your debug logs turned on so I can take a look? Also - which version of Office do you have and which OS do you use? I'll try to reproduce your issue in the lab.


I have the logs turned on at this end, the B side. I'll have to check the A side; we've not had much of an issue there before. This is not the same A side as in the original post. I disconnected that one when I replaced that machine and have not re-synced yet.

 

Office 2010, Windows7 Pro.

 

Thanks for checking into it. My plate is really full ATM, but I'll do my best to get those logs ASAP.


My issue is completely unrelated to Excel files.  Also it isn't related to files with "." in the middle of them.

 

One theory is that the original syncing machine (the one that shared the folder) is left turned off for a day or two while changes are made on another machine.  When you turn the "original" machine back on, it tries to overwrite any files changed on the remote machine, even if those files are newer.  Just a theory though, I haven't confirmed it.

Edited by justinr1234


@justinr1234

 

There is a known case of overwriting. Say, you've got Machine A and B. Here is the scenario:

 

1. A shuts down Sync.

2. A changes some files.

3. B has Sync ON.

4. B changes same files as A in #2. This change happens later than #2.

5. Now, when Sync turns ON on Machine A - changes on B are going to be discarded.

 

Please check if this scenario applies to your case.

 

@simondarren

We are trying to reproduce your case in the lab meanwhile.


As you know, in my case only one side ever changes any files. I'm really just using this for an offsite backup.

 

AND. Something just hit us again overnight.

 

This is the original setup that I created this post about. I had stopped syncing those two. I replaced the computer on the A side. I recreated the sync pair, and this time, as I said I would, I set up a READ ONLY KEY on the B side.

 

I fired that off 2 nights back. Well, last night, BAM: an Excel file they use every day is suddenly... Empty. Blank.

 

I go looking around...here's the weird part. A perfectly good copy of the file is in Sync Archive on the "A" side. I thought that is for things deleted by another sync end? Why are there any files in there, the only other sync partner is RO? 

 

And like I said, the file was in its correct folder... just empty. I had a feeling something was up again, since they had been operating for weeks with no problem (no BTSync), and the day after I fired up BT again, an Excel file is trashed. I thought RO would protect me from problems. But given that I found the file in Sync Archive, it sure seems to be BTSync hating on EXCEL files.

 

Btw, I asked the staff and it's possible they leave this file open a lot; not sure if that helps.

 

Using the READ ONLY key did not stop these EXCEL file issues. 


@simondarren

 

Thanks for the detailed description, it helped to identify the root cause. 

 

Technically, an RO peer is unable to distribute file modification / move / removal events that happened on the RO peer itself: it has no private key and cannot sign changes / events. However, an RO peer is able to distribute events and changes that other peers shared with it.

 

In your case the RW peer shared a number of events with the RO peer (several, because of the peculiar way MS Office saves files) and they were processed in the wrong order. So the "delete" event came back to the RW peer and removed the file.

 

A fix for the issue will be available soon in the upcoming 1.4 release.
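
To illustrate why the order of events matters here, below is a toy sketch in Python. It is not BitTorrent Sync's actual protocol or event format; the `("add" / "delete", name)` tuples, the file name `report.xls`, and the assumed save pattern (old file removed, then new file written) are all made up for illustration.

```python
# Illustrative only: a toy event log, not BitTorrent Sync's real protocol.

def apply_events(events):
    """Apply (operation, file name) events in the given order.

    Returns the set of files that exist after all events were applied.
    """
    files = set()
    for op, name in events:
        if op == "add":
            files.add(name)
        elif op == "delete":
            files.discard(name)
    return files


# Assumed save pattern: the old file is removed, then the new one is written.
save_sequence = [
    ("add", "report.xls"),     # original file exists
    ("delete", "report.xls"),  # old copy removed during the save
    ("add", "report.xls"),     # new copy written
]

# Events applied in the order they happened.
in_order = apply_events(save_sequence)

# Same events, but the delete arrives last (for example, echoed back by a peer).
reordered = apply_events([save_sequence[0], save_sequence[2], save_sequence[1]])

print(sorted(in_order))   # ['report.xls']
print(sorted(reordered))  # []
```

The same three events leave the file in place when applied in order and remove it when the delete is applied last, which matches the symptom of the file disappearing on the RW side.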


@RomanZ

 

That particular case doesn't apply to me.  I am only making changes on the "remote machine" (meaning the machine that did NOT originally share the sync folder).  I haven't set up any sort of read-only situation either.  This "seemed" (again not confirmed) to start around when I was turning off the original machine that shared the folder before bed every night.  Now that I've been leaving it on all day everyday (like I used to) I haven't seen the issue.  I might try an experiment with less delicate files and see if I can reproduce specifically.


I've installed the 1.4 version. The issue is still present, the scenario is always the same and it's very simple to reproduce:

 

1. synchronize at least 3 peers

2. turn off peer 3

3. modify and synchronize files between peers 1 and 2

4. turn on peer 3 (a day later in my case): this peer overwrites the files modified in step 3 with its old versions

 

I've noticed this behaviour with .xls files and .dat files (txt files used by our system).

As soon as I can, I will try to collect some logs.
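
For anyone trying to confirm this kind of overwrite, one possible approach is to snapshot modification times before the offline peer comes back and compare afterwards. The Python sketch below is only a rough helper under assumptions: it assumes the overwritten file also ends up with the older modification time (not confirmed in this thread), and the names `snapshot` and `find_regressions` are hypothetical, not part of BTSync.

```python
import os
import tempfile


def snapshot(root):
    """Map each file's path (relative to root) to its modification time."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            result[os.path.relpath(full, root)] = os.stat(full).st_mtime
    return result


def find_regressions(before, after):
    """Return paths whose modification time moved backwards between snapshots."""
    return sorted(
        path for path, mtime in after.items()
        if path in before and mtime < before[path]
    )


# Demo on a throwaway folder; the file name is made up for illustration.
with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "8.1.14.xls")
    with open(path, "w") as fh:
        fh.write("new version")
    os.utime(path, (1_500_000_000, 1_500_000_000))  # the newer timestamp
    before = snapshot(root)
    os.utime(path, (1_000_000_000, 1_000_000_000))  # simulate an older copy taking over
    after = snapshot(root)

regressions = find_regressions(before, after)
print(regressions)  # ['8.1.14.xls']
```

In practice one would call `snapshot` on the synced folder before step 4, again after the third peer has finished syncing, and attach the list of regressions to the debug logs.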


We'll try to reproduce it in our lab.

 

To reproduce such a scenario, this works for me *all the time* without exception:

(I refer to btsync 1.3 here, but the problem still exists in 1.4!)

 

R/W share with more than 300.000 files, spread over about 15.000 folders, on a Win7 NTFS hard drive (or SSD), about 100GB in size.

R/W Share on a linux NAS for that folder.

(my exact data: 102.957.787.521 Bytes in 309.654 files, spread on 15.384 folders)

 

Both shares (Win/Linux) are synced and identical.

 

(Important: you need to have other large R/W or R/O shares configured in btsync too, from Win7 to Linux, so that btsync has a lot of work to do while stopping or starting!! I have about 20 shares with an overall size of 2.4TB (my exact data: 2.536.255.076.315 Bytes in 2.551.737 files, spread on 153.502 folders))

 

  • Delete or change (i.e. rotate a graphic file) some files from some folders of the win7 Computer.

    Keep an eye on the transfers and make sure, some of the changed files (not all) are synced to the Linux Machine. (btw. you can't keep an eye on the transfers in 1.4, which really really isn't helpful!)

  • While you restart the Linux btsync (or reboot the machine), change more files on the Win7 Computer.
  • While btsync on the Linux Machine is starting (here it takes about 10-30 minutes for the webgui to show all the shares, because of the huge indexing procedure, I guess), just stop or restart btsync on the Win7 Computer a few times, as happens when several reboots in a row are needed for Windows updates.

 

When you repeat those steps several times (restarting btsync while indexing and/or file transfer is interrupted), at some point the files deleted on the Win7 Computer keep coming back from the Linux Machine, or the changed files on the Win7 Computer are overwritten by older ones from the Linux Machine.

 

Now consider what a mess btsync creates when you delete or change (i.e. rotate image files) some files on the Linux Machine as well as on the Win7 Computer...

 

 

btw: when you repeat those steps, you will sometimes notice that btsync is apparently working, but no files have been transferred for a long time (e.g. 2 days). When you then restart btsync, transfers go on as usual. That happens here a lot (on the Win7 Computer *and* the Linux Machine).

 

 

 

In the end, it's just a "normal" situation: people need to restart their computers from time to time while btsync is indexing or transferring, and then btsync can't keep up with the changes, I guess. I've been monitoring this for a long time and the causes always seem to be the same, because my behaviour stays the same :-)

 

Anyway, that's not amusing behaviour for btsync and it should be addressed/fixed soon.


I also have a setup with 3 clients (Win - ARM - Linux AMD 64) and can confirm that this happens to me running the latest version of BTSync on all machines. I'd like to add one hopefully valuable piece of info: although the time is set by ntp and verifiably correct, and the debug logs of the two Linux machines are fine, the times shown in the WebUI and in the logs which can be downloaded are WRONG. Just to clarify again: I started the client with logging to a file. This file has the same info as the downloadable one EXCEPT for the timestamps. One funny thing is that the sync date in the WebUI is set to "in 23 years" :-)


@RomanZ:

 

"All the files are signed with digital signature before they are sent. Private key is owned by RW peers, while RO own only public key. So, RO peers may only receive files - and none of RW peers will accept modified files from RO as they are not signed."

I checked the keys, and the master, on which the Lightroom catalog was edited, has both keys (naturally), and the slave (Synology Diskstation) has only the RO key.

".!sync files are ignored and never synced as they are service BTSync files."

Well, after the disaster I found only the .!sync files, and nothing else.

"So I suspect that you have yet another RW peer, that decided to sync your Lightroom database. Are you confident that you have only 2 peers?"

Yes, 100% sure, there are only these two computers which have keys to this folder: one Mac (master) and one Synology (RO)

"Also, did you check that deleted database was moved to .SyncArchive folder?"

Sure, but it wasn't there.

 

BTW: Lightroom saves the database several times while it's open, similar to Excel.

 

Here is the current history from the Mac, running the current BT Sync 1.4:

2 hours ago: Finished syncing with Schroeder
2 hours ago: Finished syncing with Schroeder
2 hours ago: Finished syncing with Schroeder
2 hours ago: Finished syncing with Schroeder
2 hours ago: Remote peer removed file Lightroom5.lrcat-journal
2 hours ago: Remote peer added file Lightroom5.lrcat
2 hours ago: Finished syncing with SALLY
2 hours ago: Added file Lightroom5.lrcat-journal

 

Schroeder is the Synology backup, which has only the RO key.

SALLY is a third PC, but it syncs only other folders, not the Lightroom folder.

 

What do the lines beginning with "Remote peer " mean exactly? The remote peer is Schroeder, having just the RO key, as I understand it?


@mariano

 

I tried your scenario - it did not work for me. Newer files propagate, while older are deleted. Are you sure that files on 3rd peer (one that is shut down) are not touched or modified?

 

@simondarren

What about you? Was your issue resolved with 1.4?

 

@BeTheSync

Well, thanks for the detailed description! I strongly suspect that your Sync does not succeed with saving all the relevant data to the DB before OS terminates it (as you've got a huge number of files). It is very simple to check it:

- disable Sync auto run

- start your scenario with following changes:

  a. Start Sync manually when system boots up

  b. Before starting Sync, please open storage folder (%appdata%\BitTorrent Sync) and look if files with extension .db-wal and .db-shm are present.

These files should not exist when Sync is off. If they DO exist, it means that Sync did not finish its session with the database gracefully - and, likely, lost some data.
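
The check above can also be scripted. The Python sketch below only looks for files ending in `.db-wal` or `.db-shm` directly inside a folder; the function name `leftover_sqlite_temp_files` and the demo file names are hypothetical, and the demo runs against a temporary folder rather than the real storage folder.

```python
import os
import tempfile


def leftover_sqlite_temp_files(storage_dir):
    """List files ending in .db-wal or .db-shm directly inside storage_dir."""
    suffixes = (".db-wal", ".db-shm")
    return sorted(
        name for name in os.listdir(storage_dir)
        if name.endswith(suffixes)
        and os.path.isfile(os.path.join(storage_dir, name))
    )


# Demo on a throwaway folder; these file names are made up for illustration.
with tempfile.TemporaryDirectory() as storage:
    for name in ("sync.db", "sync.db-wal", "sync.db-shm", "settings.dat"):
        open(os.path.join(storage, name), "w").close()
    found = leftover_sqlite_temp_files(storage)

print(found)  # ['sync.db-shm', 'sync.db-wal']
```

On Windows, the storage folder mentioned above could be passed as `os.path.expandvars(r"%appdata%\BitTorrent Sync")`. Run it only while Sync is stopped; an empty list suggests the last shutdown was clean.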

 

@LosLukos

I strongly suspect that you (or some other software?) are "touch"-ing or changing files when Sync is off. In this case touched / changed files will be considered the most recent once Sync starts up. Is this your case?
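
As a small illustration of what "touching" means here, the Python sketch below changes only a file's modification time and shows that its contents stay identical. It says nothing about how Sync itself compares files; the helper name `fingerprint` and the file name `example.dat` are made up for the example.

```python
import hashlib
import os
import tempfile


def fingerprint(path):
    """Return (SHA-256 of the contents, modification time) for a file."""
    with open(path, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    return digest, os.stat(path).st_mtime


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.dat")
    with open(path, "wb") as fh:
        fh.write(b"unchanged contents")

    os.utime(path, (1_000_000_000, 1_000_000_000))  # original timestamp
    hash_before, mtime_before = fingerprint(path)

    os.utime(path, (1_500_000_000, 1_500_000_000))  # "touch": newer timestamp only
    hash_after, mtime_after = fingerprint(path)

# True when the file looks newer although its contents did not change.
touched_only = hash_before == hash_after and mtime_after > mtime_before
print(touched_only)  # True
```

A backup tool, virus scanner, or indexer that resets timestamps in this way could make an old file look like the most recent version.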

 

@WeeGee

"Remote peer" means Sync does not know who did it. When Sync knows - it replaces "Remote peer" with peer's name. There was a bug when RO peer "reflected" the removal command back to RW peer which caused file deletion (see my post above)


@mariano

 

I tried your scenario - it did not work for me. Newer files propagate, while older are deleted. Are you sure that files on 3rd peer (one that is shut down) are not touched or modified?

 

 

Yes, I am sure, because the wrong update happens right at the startup of the 3rd peer. I have to say that I forgot to update one peer (a 4th one) to version 1.4, so maybe this peer propagates the error. If I notice something else I'll send you the logs. I will try my scenario again next week with the new version.

@BeTheSync

Well, thanks for the detailed description! I strongly suspect that your Sync does not succeed with saving all the relevant data to the DB before OS terminates it (as you've got a huge number of files). It is very simple to check it:

- disable Sync auto run

- start your scenario with following changes:

  a. Start Sync manually when system boots up

  b. Before starting Sync, please open storage folder (%appdata%\BitTorrent Sync) and look if files with extension .db-wal and .db-shm are present.

These files should not exist when Sync is off. If they DO exist, it means that Sync did not finish its session with the database gracefully - and, likely, lost some data.

The wal & shm files are present when btsync has crashed, of course.

They are removed (or renewed) every time btsync starts.

When btsync runs for some days and I quit it normally (i.e. w/o rebooting), the files you mentioned are not present, but I found some *.db / *.db-wal / *.db-shm files dated early 2013. Can I safely delete them?

 

Anyway, it seems obvious that btsync not being able to write its files properly to the SSD is a cause of the problem, but knowing about the reason does not solve the problem :-)

Edited by BeTheSync


@WeeGee

"Remote peer" means Sync does not know who did it. When Sync knows - it replaces "Remote peer" with peer's name. There was a bug when RO peer "reflected" the removal command back to RW peer which caused file deletion (see my post above)

 

I just looked at the history entries:

Some lines say Remote peer, some say the correct computer name, all referring to the same computer... it seems the name resolution is not very stable.

 

 

I also found a "Renamed file" message where the old and new names are reversed. Are there any other wrong messages in the current versions?

 

Sharing a link or key via clipboard also doesn't work, nothing is copied to clipboard. Sharing via eMail gives me a working link.


@RomanZ:

Rethinking what you initially said about the *.db-wal / *.db-shm files, and testing the scenario again last night, the problem still occurs.

(This time, I gave btsync on Win7 and Linux enough time to write their db files to disk, and checked for "ungraceful" files: none, besides the ones from 2013 that I mentioned in my post above.)

It seems to be not a matter of interrupting btsync while shutting down the computers, but rather a problem with the number of files/folders to catch up on when btsync restarts.

(Well, actually, it was obvious to me before, but the *.db-wal / *.db-shm zombie files were a good reason to re-check the situation here, just to rule out some more causes...)

As I write this, I see btsync overwriting some newer, touched files on the Win7 Computer with older ones from the Linux Machine, both of which I rebooted several times as described in my post above (while checking that the processes shut down cleanly and left no files like *.db-wal / *.db-shm).

So the question here is: what is the maximum number of files/folders btsync should be able to handle?

(I guess it's not easy to answer, because it depends on the speed of the computer, the disks, the filesystem, etc...)


@mariano

I'll need logs. If you've got a lot of files - set the log_size preference to, say, 500 MB. The logs will be huge - but we'll have a better chance of catching the issue there.

 

@BeTheSync

Thanks for clarifications. Just to make sure I got your message: even when you give enough time to Sync to shut down, and it does not leave DB temp files behind - newer files are still overwritten. Is it correct?

 

It is not recommended to delete temp files - SQLite should delete them on its own.

 

As for the limit - indeed, it is only limited by your OS resources. Usually, Sync allocates around 1k of memory per file or folder. Also, a deep file tree will prevent Windows from sending file change notifications to Sync (so a change will be detected only during rescan). Finally, a large number of files means a longer rescan time, so it's advised to increase the folder rescan interval.
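
As a rough worked example of the "around 1k per file or folder" rule of thumb, the Python sketch below applies it to the overall share size reported earlier in this thread (2.551.737 files in 153.502 folders). Treating "1k" as exactly 1024 bytes is an assumption, and the function name `estimated_memory_bytes` is made up, so the result is only a ballpark figure.

```python
BYTES_PER_ENTRY = 1024  # "around 1k" per file or folder, assumed to mean 1024 bytes


def estimated_memory_bytes(files, folders, bytes_per_entry=BYTES_PER_ENTRY):
    """Rough memory estimate: one fixed-size entry per file and per folder."""
    return (files + folders) * bytes_per_entry


# Counts taken from the reproduction report in this thread.
total = estimated_memory_bytes(files=2_551_737, folders=153_502)
print(f"{total / 1024**3:.2f} GiB")
```

That comes out at roughly 2.6 GiB for the index alone, which is a large share of the memory on a small NAS.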

 

@WeeGee

Some lines say Remote peer, some say the correct computer name, all referring to the same computer... it seems the name resolution is not very stable.

 

It is stable, just some events come from a peer we don't have name for, only the ID. It is not very productive to put 16 bytes hexadecimal peer ID into history.

 

I also found a "Renamed file" message where the old and new names are reversed. Are there any other wrong messages in the current versions?

Just checked in the lab - it is confused. Thanks for report, we'll fix it!

 

Sharing a link or key via clipboard also doesn't work, nothing is copied to clipboard. Sharing via eMail gives me a working link.

 

Under which OS do you do this operation?


@mariano

I'll need logs. If you've got a lot of files - set the log_size preference to, say, 500 MB. The logs will be huge - but we'll have a better chance of catching the issue there.

I've sent a message to you.

I've collected the logs from PC1 in debug mode. The other peers PC4 and HP530 are the cause of the abnormal update. The files updated were never touched or updated by those peers.

I only turned those peers off for the entire day and then turned them on again.

All the peers run the 1.4 version.

I hope they will be useful.


@WeeGee

It is stable, just some events come from a peer we don't have name for, only the ID. It is not very productive to put 16 bytes hexadecimal peer ID into history.

 

But I see both (correct name and Remote peer), with only one other peer connected, and the messages clearly refer to the same peer, the messages have only minutes between them... 

BTW: an ID would at least make it possible to differentiate between several "unnamed" peers (when there are more than one)

 

 

Under which OS do you do this operation?

 

OS X 10.9.4.

@BeTheSync

Thanks for clarifications. Just to make sure I got your message: even when you give enough time to Sync to shut down, and it does not leave DB temp files behind - newer files are still overwritten. Is it correct?

 

It is not recommended to delete temp files - SQLite should delete them on its own.

 

As for the limit - indeed, it is only limited by your OS resources. Usually, Sync allocates around 1k of memory per file or folder. Also, a deep file tree will prevent Windows from sending file change notifications to Sync (so a change will be detected only during rescan). Finally, a large number of files means a longer rescan time, so it's advised to increase the folder rescan interval.

Thanks RomanZ,

You got the message: yes, newer files are still (reproducibly) overwritten.

 

(My Operating Systems are:

Win7: Windows7 64Bit newest Hotfixes (16GB RAM)

Mac: MacBookPro w/ Mavericks OSX 10.9.4. (8GB RAM)

Linux: Synology NAS DS1813+ (4GB RAM) )


@BeTheSync

 

Thanks for info. We'll try to repro issue in the lab. Also, I've opened a ticket for you in our support - we'll make a special build intended to catch your issue. I appreciate your cooperation!


This issue is still present even in version 1.4.91 Beta

I've collected new logs, they are in the same folder.

 

I hope it can be solved. If you have no idea, or you do not have the time to solve this issue, would you please tell us?

