Data Corruption Syncing Large Package File (OS X)


macula


I have been trying to transfer my DevonThink database from one machine to another using BitTorrent Sync. To be clear, I am not trying to "sync" the database between the two machines. Rather, with the app itself closed on both machines, I want to transfer the fresher copy to the machine that has an obsolete version. Once the transfer is complete, the plan goes, I can open the database on the destination machine while keeping it closed on the source machine.

 

The trouble is that the database is rarely if ever synced correctly. This is admittedly a large "file" (>2GB), and it is a "package file", which in OS X is basically a folder structure with a bunch of metadata attached.

 

The problem is that BitTorrent Sync transfers this large file-folder incorrectly. There is almost always a Markdown file or some other item missing. Perhaps BitTorrent Sync rushes to trigger the sync, beginning to upload revisions before all the changes inside this humungous folder-file have been committed?

 

Any thoughts? This reminds me of another older post on this forum (at http://forum.bittorrent.com/topic/26271-iwork-13-numbers-pages-and-keynote-on-mavericks/), where the user had troubles syncing iWork files (which are also package files) via BitTorrent.


@macula

Actually, Sync does not treat Mac bundles much differently from regular folders. It simply delivers all the files inside, and that's it. The only peculiarity I can think of is that the OS reports changes inside bundles differently. As a result, Sync may indeed fail to deliver some locked files right away, although it should deliver them within 10-20 minutes at the latest, once the folder rescan finds them.

 

Could you please do the following test:

1) Reproduce the issue, i.e. make changes to the file and confirm that the Markdown file is not synced.

2) Give Sync 10-20 minutes, then check again whether the Markdown file has been synced.
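
One way to make the check in steps 1 and 2 less error-prone is to hash every file inside the package on each machine and compare the results. The following is only a minimal sketch in Python, not anything built into Sync; the function names `build_manifest` and `compare_manifests` are made up for illustration.

```python
import hashlib
import os


def build_manifest(package_root):
    """Map each regular file's path (relative to package_root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(package_root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if not os.path.isfile(full_path):
                continue  # skip broken symlinks and other non-regular entries
            digest = hashlib.sha256()
            with open(full_path, "rb") as handle:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full_path, package_root)] = digest.hexdigest()
    return manifest


def compare_manifests(source, destination):
    """Return (missing_on_destination, extra_on_destination, differing) as sorted lists."""
    missing = sorted(set(source) - set(destination))
    extra = sorted(set(destination) - set(source))
    differing = sorted(
        path for path in set(source) & set(destination)
        if source[path] != destination[path]
    )
    return missing, extra, differing
```

To compare two machines, each manifest could be saved with `json.dump` on its own machine and the two files compared afterwards. Running the comparison right after the issue appears and again 10-20 minutes later would show whether the rescan eventually delivers the missing file.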

 

If that still does not help, could you please send me a debug log from both of your computers? Just hit "Help -> Contact support" inside the app and mention that the log is for Roman.

 

Thanks!


  • 2 weeks later...

I've noticed the same thing. This occurs even when the large package file is changed on only one machine; it becomes corrupted on that machine. I think it also has something to do with having the package file open for a while. Somehow BitTorrent Sync does not realize the file is in use, syncs it, and then (I believe) replaces an internal file within the package, one that has been updated in the meantime, with the older version that it synced.


  • 1 month later...

I am bumping this thread after a long hiatus, for which I apologize. I have been far too busy to debug this as @RomanZ suggested, and gave up on BitTorrent Sync for this particular package file, which is crucial to my work.

 

But a couple of days ago, I had a moment of epiphany.

 

First, I added the following lines at the top of StreamsList:

 

*
com.apple.*
org.openmetainfo.*
com.dropbox.*
kMDI*
 
The first of these lines alone should suffice, but better safe than sorry. The point of this is to enable syncing of all extended attributes.
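
To illustrate why the first line alone should suffice, here is a small Python sketch that checks which of the lines above match a given attribute name. It assumes that StreamsList entries behave like shell-style wildcards; that is an assumption made for illustration, not something confirmed in this thread, and `matching_patterns` is a made-up helper. The attribute name `user.example.note` is likewise hypothetical.

```python
from fnmatch import fnmatchcase

# The lines added to StreamsList in the post above.
STREAMS_LIST_PATTERNS = [
    "*",
    "com.apple.*",
    "org.openmetainfo.*",
    "com.dropbox.*",
    "kMDI*",
]


def matching_patterns(xattr_name, patterns=STREAMS_LIST_PATTERNS):
    """Return the patterns that match xattr_name, assuming shell-style wildcards."""
    return [pattern for pattern in patterns if fnmatchcase(xattr_name, pattern)]


if __name__ == "__main__":
    for name in ("com.apple.FinderInfo", "user.example.note"):
        print(name, "->", matching_patterns(name))
```

Under that assumption, every attribute name is matched by the bare `*`, so the remaining lines only make the intent explicit.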
 
Second, I revisited my IgnoreList. And indeed it seems that one of the problematic files was being excluded.
 
Following these two steps, there was only one lingering problem left, a particular Markdown file within the package with a very long filename containing special characters. I renamed this file to something simpler, and following that everything has been working well.
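
For anyone who wants to hunt for similar filenames, a rough Python sketch follows. The length threshold and the set of "special" characters are arbitrary guesses of mine, not documented Sync restrictions, and `find_suspect_names` is a made-up name.

```python
import os

# Characters that are often troublesome across platforms (an assumption,
# not a documented BitTorrent Sync restriction).
SUSPECT_CHARACTERS = set('\\:*?"<>|')


def find_suspect_names(package_root, max_length=100):
    """Return sorted (relative_path, reasons) pairs for names worth a closer look."""
    suspects = []
    for dirpath, dirnames, filenames in os.walk(package_root):
        for name in dirnames + filenames:
            reasons = []
            if len(name) > max_length:
                reasons.append("long name")
            if not name.isascii():
                reasons.append("non-ASCII characters")
            if any(ch in SUSPECT_CHARACTERS for ch in name):
                reasons.append("special characters")
            if reasons:
                rel_path = os.path.relpath(os.path.join(dirpath, name), package_root)
                suspects.append((rel_path, reasons))
    return sorted(suspects)
```

Calling `find_suspect_names` on the package folder lists candidates to rename. It does not prove that any of them actually breaks syncing.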
 
I'll give this a few more days and will report back if I encounter problems. 

@RomanZ, may I ask, are there any side effects of syncing all metadata? And are there any filename restrictions in BitTorrent Sync?
 
Thanks.
 
 
 

@macula

Possible side effects:

1) Computer-specific data is synced to another computer. Some of the data stored in xattrs is computer-specific, i.e. syncing it to another computer can make it invalid. This depends heavily on the actual data and on the application consuming it, though.

2) Extra traffic.

3) Extra space occupied on other peers.

 

So I would advise taking a whitelist approach to syncing xattrs and delivering only those that you actually need.
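
To decide which xattrs belong on such a whitelist, it can help to see which attribute names actually occur inside the package. The Python sketch below is only an illustration: it assumes a Mac with the stock `xattr` command-line tool, which prints one attribute name per line when given a file, and both function names are made up.

```python
import subprocess
from collections import Counter


def list_xattr_names(path):
    """List the xattr names on one file by calling the macOS `xattr` tool (assumed present)."""
    result = subprocess.run(
        ["xattr", path], capture_output=True, text=True, check=True
    )
    return [line for line in result.stdout.splitlines() if line]


def tally_xattr_names(names_per_file):
    """Count on how many files each attribute name appears.

    names_per_file maps a file path to the list of xattr names found on it.
    """
    counts = Counter()
    for names in names_per_file.values():
        counts.update(set(names))
    return counts
```

Feeding the output of `list_xattr_names` for each file into `tally_xattr_names` gives a short list of names, from which only the ones the application really needs would go into StreamsList.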


  • 2 years later...

This is quite an old thread, but still relevant. @jbb & @macula, I am looking to do the same.

 

1. Are you still having problems with Resilio and syncing DevonThink databases?
(Assuming the database is closed on both computers when syncing, or at least during the final sync.)

2. How does Resilio handle versioning, considering that DTO (DevonThink Office) databases are just scrambled pointers?
I'd be curious whether you know how to find those previous versions of files, either within Resilio or DTO.

The other alternative here is to use WebDAV to host a remote database for DTO.
I am having little luck with my router and settings (I lack the technical know-how)...


@RoninSix It's been a long time and my memory may well fail me. The DevonThink devs are as adamant as ever that this is bound to cause problems. The question has likely become moot anyhow, because the new sync mechanism built into the app is fast, reliable, and flexible. Versioning is not supported, however, and neither is Resilio.

I would strongly advise you not to store your DevonThink databases in a Resilio-synced folder unless you are prepared to experiment and the data involved is expendable.
