Node Update Logic Is Profoundly Broken And Needs Redesign



Basically, from what I see, the node update logic as it stands right now is fundamentally broken. Flawed design and invalid assumptions about node updates inherently create unresolvable logical problems.

It inherently creates inconsistency across the nodes. The very concept of a database that saves the update state of the share is not reliable when other nodes delete, rename or modify files.

If, for whatever reason, the database gets damaged, for example on the r/w node, and one has to delete and re-add the share, all memory of the last state of the share is lost and the nodes become totally inconsistent with respect to what they should have as the latest state.

Therefore, unless you implement a database journaling system, you cannot possibly guarantee consistency or database recovery. If the database is damaged, which happens no matter what, how do you know exactly which files are subject to update and propagation?
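To make the journaling idea concrete, here is a minimal sketch of an append-only journal from which the file state can be rebuilt by replay. It says nothing about how BTSync actually stores its state; the class name StateJournal, the operation names and the one-JSON-object-per-line format are all assumptions made for illustration.

```python
import json
import os
import tempfile


class StateJournal:
    """Hypothetical append-only journal; state is rebuilt by replaying it."""

    def __init__(self, path):
        self.path = path

    def record(self, op, name, new_name=None):
        # One JSON object per line, flushed and fsynced before returning.
        entry = {"op": op, "name": name, "new_name": new_name}
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self):
        # Rebuild the set of known files from the journal alone.
        files = set()
        if not os.path.exists(self.path):
            return files
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn final write: ignore the incomplete tail
                if entry["op"] == "add":
                    files.add(entry["name"])
                elif entry["op"] == "delete":
                    files.discard(entry["name"])
                elif entry["op"] == "rename":
                    files.discard(entry["name"])
                    files.add(entry["new_name"])
        return files


journal = StateJournal(os.path.join(tempfile.mkdtemp(), "share.journal"))
journal.record("add", "index.html")
journal.record("add", "old.html")
journal.record("rename", "old.html", "new.html")
journal.record("delete", "index.html")
print(journal.replay())
```

The point of the design is that the journal, not a mutable database file, is the source of truth: if the derived database is lost, the deletions and renames can still be recovered in order.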

Secondly, as it stands right now, there is no such concept as a "reference node". Therefore, it is impossible to determine the correct state of all the files that might have come and gone as a result of user modifications, deletions and file renaming.

Furthermore, the r/w node cannot be considered a "reference node" or a "master node" with the present logic, as it cannot fully control what happens on the r/o nodes once they have made file modifications, deletions or renames.

Once the r/o node has done some of these operations, it is no longer controlled by the r/w node in respect to updates of the modified files, and, therefore, it is no longer consistent with the r/w node.

From what I see, with the current update logic, about the only way to make it work reliably is to never stop updating files just because an r/o node modified them. Those modifications should be undone and the files restored to exactly the same state as on the r/w nodes, or as in their exact or "best fit" copies.

If a user renames some files, the renamed copies should remain, but the files with the original names should be re-downloaded.

If a user deletes some files because he no longer wants to download them, he should add them to .SyncIgnore. Otherwise, they will be re-downloaded.

If a user wants to modify some file under its original name, his modifications will be lost, as the original version will be re-downloaded. Therefore, if he wants to modify a file, he should rename it and then edit it the way he wants. But the original will be re-downloaded in exactly the same version as on the r/w nodes.

Therefore, any files that are not in .SyncIgnore will be restored to their original state as they exist on the r/w nodes.
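The rules above can be sketched as a small planning function. This is only an illustration of the proposed logic, not of what BTSync does: the function name plan_restore, the name-to-hash dictionaries and the use of Python's fnmatch to stand in for .SyncIgnore pattern matching are all assumptions.

```python
from fnmatch import fnmatch


def plan_restore(reference, local, ignore_patterns):
    """Return the file names an r/o node should re-download.

    reference: dict of name -> content hash on the r/w node
    local:     dict of name -> content hash on the r/o node
    """
    to_download = []
    for name, ref_hash in sorted(reference.items()):
        if any(fnmatch(name, pat) for pat in ignore_patterns):
            continue  # user opted out of this file via the ignore list
        if local.get(name) != ref_hash:
            to_download.append(name)  # missing, deleted, or modified locally
    # Files that exist only locally (e.g. renamed copies) are left alone.
    return to_download


reference = {"a.html": "h1", "b.html": "h2", "c.jpg": "h3", "d.txt": "h4"}
local = {"a.html": "h1", "b.html": "edited", "my_b.html": "edited"}
print(plan_restore(reference, local, ["*.jpg"]))  # ['b.html', 'd.txt']
```

Here b.html was edited in place, so it is restored; my_b.html is a renamed copy and is untouched; c.jpg is ignored; d.txt was deleted without being ignored, so it comes back.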

That is the only way to make it work reliably in the current design.

Else, I see no way, even theoretically, to eliminate what is called a "database inconsistency problem" as known in relational database theory.

It is simply logically impossible with the current update logic and design. The very idea of a BTSync database maintaining the latest state of the files is inherently unreliable unless a journaling system is implemented. First of all, databases eventually get damaged for all sorts of reasons, and only a journaling system can limit the effects of the damage and restore the database to its correct state.

Furthermore, there are two copies of the state: one in the physical file system, which is the eventual reality of what is going on, and another in the database, which may differ as a result of user file modifications.
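A rough sketch of how the two copies of the state can drift apart, and how the difference could be detected by comparing a recorded snapshot with a fresh scan of the folder. The function names scan and drift and the choice of SHA-1 over whole file contents are assumptions for illustration only.

```python
import hashlib
import os
import tempfile


def scan(root):
    """Map each file's path relative to root -> SHA-1 of its contents."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            full = os.path.join(dirpath, fname)
            with open(full, "rb") as f:
                digest = hashlib.sha1(f.read()).hexdigest()
            state[os.path.relpath(full, root)] = digest
    return state


def drift(recorded, actual):
    """Compare the database's view with the file system's view."""
    return {
        "missing": sorted(set(recorded) - set(actual)),
        "untracked": sorted(set(actual) - set(recorded)),
        "modified": sorted(
            n for n in set(recorded) & set(actual) if recorded[n] != actual[n]
        ),
    }


root = tempfile.mkdtemp()
for name, text in [("a.html", "one"), ("b.html", "two")]:
    with open(os.path.join(root, name), "w") as f:
        f.write(text)
recorded = scan(root)                      # what the database remembers

with open(os.path.join(root, "a.html"), "w") as f:
    f.write("edited by user")              # local modification
os.remove(os.path.join(root, "b.html"))    # local deletion
with open(os.path.join(root, "c.html"), "w") as f:
    f.write("new")                         # file the database never saw

print(drift(recorded, scan(root)))
```

If the recorded snapshot itself is lost or damaged, there is nothing left to compare against, which is exactly the situation described above.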

On top of that, several nodes may hold totally different states of various files as a result of local file deletions, renaming and modifications.

As a result, you have total inconsistency across all nodes. In general public situations, where you cannot even contact the other nodes and tell them what to do so that their nodes fully update, what you have is a logically unresolvable problem, and you cannot possibly assure consistency.

One more time: in GENERAL public applications you simply MUST have a facility to present users with a message from the node. When new users come in, they have no clue about BTSync or even what it means to sync. If any of them do something to the files in the share, those files, as it currently stands, become non-updatable and non-propagatable. But they have no idea how it really works and might expect some "miracles" from BTSync, at least in the present version. So, if something goes wrong, how do you tell them what to do when you have no means of communicating with them except via the Device name?

From what I see, in the current design, about the only way to resolve the consistency issue is to make use of the .SyncIgnore file a MUST whenever users do not wish to receive updates on some files, for whatever reason.

All other user modifications that are inconsistent with the r/w nodes should be undone by re-downloading the user-modified files, unless those files are listed in .SyncIgnore. That is the only way to assure consistency across all nodes as far as I can see, at least with a non-journaling database system.

The "bottom line" is this:

The current design and update logic will never work, in principle, at least in general public situations where you don't have access and control of all the nodes and their data.

At present, and for months now, I consistently see between 30 and 50% of r/o nodes in an incomplete update state, at least as shown on the r/w nodes. Most likely this happened because they deleted, renamed or modified some files locally. As a result, the r/w node shows that there are still files to upload to those nodes, even though they might think this is exactly how they want it, at least in cases where they do not wish to download some files or have modified some of them locally.

Why should the r/w node EVER see incomplete r/o nodes when the nodes themselves might think that is exactly the way they want it? Overall, seeing it in this state just creates an unpleasant feeling.

From what I see, you should NEVER, under any circumstances, see incomplete nodes, even if they themselves think that is the way they want it. Otherwise, how do you know the real story with those nodes?

I have seen some nodes that cannot update a share which has only 2 HTML files, and those are exactly the files they wanted from this share. But they no longer get new versions of those files, probably because they modified them without realizing what the result would be, and the result is that they will never get updates of the main files of the share.

Does it make sense?


"BitTorrent will never work!" said countless people.

 

Yes, what you are saying makes sense.

 

The thrust of your "Bottom Line" seems to be btsync should have a "hide incomplete nodes" option. Good suggestion.
 

"The current design and update logic will never work, in principle, at least in general public situations where you don't have access and control of all the nodes and their data."

 

I'm not privy to the actual "current design and update logic." So, I really can't say. But, you raise some important points. Let me propose a use case:

 

I share a "Summer Adventures 2012" folder with a few hundred photos. I send the r/o secret and a link to btsync download to a dozen friends.

 

They install btsync and set up an r/o folder. Each looks through the folder and deletes some number of photos. Then, the deleted photos are fetched again.

 

"add them to the .SyncIgnore" is not a viable option for most "average" users.

 

IMHO, the users of r/o nodes should "fully control what happens in [their] r/o nodes." Even if that means they are divergent from r/w nodes. btsync should present some indication to the r/o users of how they are divergent and offer them options for convergence.

 

r/w nodes should all eventually be consistent.

 

If a btsync database file becomes corrupted on an r/o node, sync'ing should stop and users should be presented with a choice: sync the folder to a full copy, OR keep sync'ing of the folder stopped, OR delete the folder from btsync (not from the filesystem).

 

The btsync API should support "node status" and "folder status" messages. It should also accept the options suggested above for failure recovery and other divergence recovery options. And someone should write an SNMP interface to that.
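As a sketch of what such a "folder status" message could look like, including the three recovery choices suggested above for a corrupted database. None of this is an existing btsync API; the FolderStatus fields, the Recovery option names and the JSON layout are hypothetical.

```python
import json
from dataclasses import dataclass, asdict, field
from enum import Enum


class Recovery(str, Enum):
    FULL_RESYNC = "full_resync"      # sync folder to a full copy
    STAY_STOPPED = "stay_stopped"    # keep sync'ing stopped
    REMOVE_FOLDER = "remove_folder"  # drop folder from btsync, keep files


@dataclass
class FolderStatus:
    folder: str
    node: str
    database_ok: bool
    divergent_files: list = field(default_factory=list)

    def recovery_options(self):
        # Only a damaged database needs a user decision.
        return [] if self.database_ok else [r.value for r in Recovery]

    def to_json(self):
        payload = asdict(self)
        payload["recovery_options"] = self.recovery_options()
        return json.dumps(payload, sort_keys=True)


status = FolderStatus(
    folder="Summer Adventures 2012",
    node="friend-laptop",
    database_ok=False,
    divergent_files=["img_001.jpg"],
)
print(status.to_json())
```

A message like this would give both the r/o user and the r/w node the same picture: which files diverge, whether the database is healthy, and which choices are on offer.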


"add them to the .SyncIgnore" is not a viable option for most "average" users.

Then what is it for?

IMHO, the users of r/o nodes should "fully control what happens in [their] r/o nodes." Even if that means they are divergent from r/w nodes. btsync should present some indication to the r/o users of how they are divergent and offer them options for convergence.

I totally agree with the idea that the nodes should have the final say on what is or is not on their node.

Except there is a "but" here.

You see, using .SyncIgnore in the current design ASSURES consistency across all the nodes if the logic I proposed is implemented.

Furthermore, if modifications are reflected in .SyncIgnore, then BTSync reports the state of completion correctly and you won't see incomplete nodes, because the ignored files are accounted for in the calculation of the total size remaining to upload. And, if I am not mistaken, you WILL see the message "Finished syncing with nodename" in the history tab. That will tell the r/w nodes and all other nodes that the syncing process did in fact take place and that it DID complete successfully. Otherwise, you won't get any message and you will never know for sure whether the sync process succeeded.
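To illustrate the completion argument, here is a minimal sketch of a "remaining to upload" calculation that skips ignored files. It assumes the behaviour described above rather than BTSync's real calculation; the function name bytes_remaining and the fnmatch-style patterns are illustrative assumptions.

```python
from fnmatch import fnmatch


def bytes_remaining(reference_sizes, synced, ignore_patterns):
    """Bytes still owed to a node, skipping files it chose to ignore."""
    remaining = 0
    for name, size in reference_sizes.items():
        ignored = any(fnmatch(name, pat) for pat in ignore_patterns)
        if not ignored and name not in synced:
            remaining += size
    return remaining


sizes = {"index.html": 2000, "news.html": 3000, "video.avi": 700000000}
synced = {"index.html", "news.html"}

print(bytes_remaining(sizes, synced, []))         # 700000000: node looks incomplete
print(bytes_remaining(sizes, synced, ["*.avi"]))  # 0: node counts as finished
```

The same node, holding the same files, is reported as incomplete or complete depending only on whether its intent was recorded in the ignore list.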

Once you allow users to arbitrarily modify their files and those modifications are not included in .SyncIgnore, you are GUARANTEED to arrive at a state of inconsistency, "where the left hand does not know what the right one is doing".

It is simply a logical inevitability.

Simply because the r/w nodes do not "know" what is in the minds of other nodes and why they modified files. How could they possibly know that, especially if you consider that databases get damaged all the time and you have no way of tracking the transactions (file deletions, modifications and renaming in this case)?

And even if the database is not damaged, the way it stands right now I see inconsistency between the r/w node's view and the r/o nodes' view, and that is precisely why I see at least 30% of nodes incomplete. The sorriest thing about it is that the users do not even realize what they have done by modifying files, and that from then on they will never receive updates on the most important files, the very ones for which they joined the share to begin with.

Can you imagine what goes through the minds of "clueless" users when they see about 30% of other nodes staying in the incomplete state forever and never getting updated properly? In my opinion it is a nightmare for any product to exhibit this kind of unreliability and unpredictability of behavior.

It is like doing an FTP transfer that does not complete, or looking at a web page that never finishes loading while you watch the loading wheel spin without stopping. How do you feel as a user at that point?

Therefore, you either accept .SyncIgnore as about the only way to communicate meaningful and reliable information about the state of the node, or you simply flatly reject the very idea of .SyncIgnore.

Simple as that.

Again, I guarantee you that BTSync will never be considered a reliable and professional product if the issue of consistency is not resolved in a RELIABLE way.

That is all there is to it.

Yes, it will work, but forever "almost" or "in most cases".

