stanha

Members
  • Posts: 141
  • Days Won: 3
Everything posted by stanha

  1. Well, there are a number of reasons for sync not starting. Enable logging. That may, and probably will, give you some clues as to what is going on.
  2. I LOVE to hear things like these! Well, what I mean is this: I have given an r/w key to some guy who either went "rogue" or started doing something crazy without realizing the consequences. So, basically, he can screw up the whole share pretty badly, especially if he has some not-so-noble intentions. In this case, what are my options to prevent him from doing that? It seems to me that about the only thing I have control over is to somehow change the key. But then I have an issue: how do I inform other nodes about the key change? So, I thought I could change the key and then go back to the old key by simply replacing the SyncID file, so that it corresponds to the old key, and add or modify some file to carry the message informing them of the new key. Basically, ALL I want is to find a solution for r/w keys going rogue and starting to screw up the share. Not sure it makes technical sense to you, but I do not know how to ask the question in a way that makes sense technically to you. Does it make sense NOW? (Otherwise, I am going to cry...)
  3. Strange behavior with time difference error message and syncing
There is no point in reasoning about an issue that is considered a user error.
  4. As to the number of messages on this thread, sorry, I am not quite concerned with the density of reply messages vs. total messages during a given span of time. Secondly, the sheer fact that you do not see many replies on this thread is not a meaningful parameter to me. If I see a thread with the same exact problem I have, it does not mean I have to POST on that thread just to say the same thing they are saying. Some people may not post just because they feel shy or afraid to look "stupid", and this is a fact to me, not a theory. But the very fact that 15 people a day read this thread means that some of them, if not many, keep reading it again and again. Why, if the problem has no significance, as you seem to imply here? Fortunately or unfortunately, I am a troubleshooter by profession, and about all I do quite often is identify the problem and then solve it, so that the system is stable, predictable, and clean. And that is what they pay me for. So, there is no such concept as "small problems" in my mind. It is all a matter of tradeoffs and priorities. I have learned with time to never ever ignore ANY kind of problem or bug just for the sake of adding more and more features while old features do not quite work "under some unusual conditions". To me, about the most important criterion for any s/w product is robustness, along with the most detailed and precise documentation. The features and "bells and whistles" can wait. The program should behave like a tank under any and all conditions, usual, "unusual", conceivable or not. That is priority number one in my book. Burying your head in the sand does not quite cut it in the s/w business. That is probably the toughest business to be in, especially in Silicon Valley. And that is the context I am talking about. In today's world of high standards of quality and robustness, any kind of problem just does not fly.
It is much easier to turn people off from your product than to gain their initial interest, and once that happens, you have lost "the customer" forever. As to "Where are you arriving at a figure", I get it from elementary statistics: the number of people that I am seeing on this share and other shares, compared to the number of nodes that show the error message. As I said, I am seeing this message every day. Why? - I have no idea. It depends on which specific facts you base your judgement on. How do you arrive at the "MASSIVELY overestimated" qualification of the issue? Do you have statistics to prove something? Have you run public polls? Can you base your particular judgement of the relative importance of such an issue on other cases or applications? I am sure you are! And I also appreciate your appreciation and compassion, and I hope you appreciate my appreciation as well. And I hope you appreciate the fact that, in our particular case, seeing a number of people who eventually had to abandon the share because they could not even start syncing is a critical issue, if you do not mind, even from the standpoint of the image it creates in people's minds. Because we are dealing with issues of utmost importance and urgency on a scale I am not sure you can sufficiently appreciate, if you are not aware of the issues. But that is a matter of subjective judgement. And I hope there is a reason for you to post or even bother about such an insignificant issue, as you seem to present it. Why bother? Do you have a SOLUTION that I do not know about, that can make those error messages go away in a general public situation? I cannot check the correctness of the time or time zone set on other nodes, because it is a general public situation. All I can see is my BTSync logs reporting the time difference. Thanks for the link. I did exactly what you proposed, and thank you for your idea.
I am kind of curious at this point to see their response, if any, especially in the context of their claim that "Having right problem description and logs will mean that your problem will be fixed in a matter of hours." (the original bold font is preserved). I'd LOVE to see that fixed "in a matter of hours". That might make me LOVE BTSync and never part with it even if some other guys develop something even better, unless of course it is an Open Source solution, in which case I'd probably say: "see ya later, alligator". But the idea of BTSync is great, I have to admit. And it arrived just in time in the context of global events. In my personal opinion, there is probably no s/w product at this moment that is as significant as sync on a p2p basis, especially with reliable and trustworthy security arrangements. Because at this point, global, uninterrupted, and reliable information exchange is probably the most critical issue of all. I hope you don't mind this qualification.
  5. Setting 777 mode means the file becomes writable by the WORLD. Not a good idea. Use 775 instead or, better yet, 2775 for the top folder. That sets the SGID bit, and if you change the group of all the files to the group that runs the btsync process, you should be in good shape.
  6. Reliance on clock time
First of all, the number of views of this thread is 3,787. This, by itself, indicates that this issue is one of the most important issues of all. In my personal opinion, it has to be placed as the priority 1 item on the priority list, and, as soon as it is resolved so that it no longer affects a significant number of people, a new release needs to be made, regardless of all other issues that are resolved or in a semi-resolved state. It would be great to see a new version within the next few days. That is how critical this issue seems to me personally. But, hey, everyone has the freedom to do what they please in this world... Here is the most detailed reasoning I could come up with on this subject: Well, I am seeing somewhere around 30% of people who cannot start syncing because of this "time is off by more than 600 seconds" message. And that is no longer simply a matter of a joke or stupidity on the part of users. It seems that the current approach is fundamentally incorrect, at least if it causes such severe problems. Maybe it would be better to keep the same logic but to sample the current time on both nodes and offset the accounting for the potential time difference by the value of their clock difference. The clock difference is a matter of pragmatic fact, not of a theory of a "perfect world" where everything is synced to an atomic clock and time zones are set correctly. In my case, currently there are 2 nodes that cannot sync, and for DAYS, sitting on this share. In one case, their clock seems to be exactly +1 hr. and 19 seconds off, and on the second node -1:05:09 (HH:MM:SS). So, it seems their time zones are off by one hour, and in the second case the time is off by an additional 5 mins 9 secs. As I mentioned before, I have displayed a message in my device name about setting the time correctly, and, strangely enough, there seems to be no change.
I am not sure they are so dumb that they cannot see the message with the URL pointing them to the document describing this issue. But the fundamental question is: why can't they sync, at least initially? Secondly, how much significance and impact does the incorrectly set time have on the overall state of the data, compared to a total refusal to sync? Even if the time is off by an hour, or even several hours, what is the statistical chance that there will be a significant negative impact on the state of the data set? If one file happens to have been updated, and it happens that the update was done within the window of the clock difference, the chance of which is pretty much close to 0, and it happens that you download or upload the older version of it to the node that has its time set incorrectly, what seems to be the problem, at least in comparison to a total refusal to sync the entire data set? Yes, theoretically, it is a problem, because the older file version may get downloaded to the node that has its time set incorrectly and step on a newer version that exists on other r/o nodes. But, even in that case, they still have the archives. - Well, statistically speaking, one or a few more files might potentially be affected, but ONLY if there is no r/w node on-line and the other r/o nodes think that your version is older than theirs and so they do not want to download your version of the file. But what impact does it have on the entire data set, compared to a total refusal to sync ANYTHING, just because in SOME cases your clock difference may affect a single file? Does it make ANY sense, logically speaking? Ok, fine, you might have some argument. But then, even if your argument is really valid, you could refuse to download or propagate ONLY the files that fall within the range of the time difference between the nodes - that is, if you find ANY such files. Most likely, their number will be 0.
But all other files that have their time stamp outside the range of the clock mismatch must not be affected. There needs to be an estimation of the comparative tradeoffs you have to pay for such a severe measure as not syncing the entire data set. What are the ADVANTAGES of refusing to sync, even if some file is off by an hour or so and it happens to be the file that has been updated on one of the nodes? The statistical chance of such a coincidence is probably on the order of 0.0001%, in the BEST case. But the probability that you will affect the entire data set, or a significant part of it, rapidly approaches 100%. So, what is your payback for this "feature"? Does it make sense? Maybe the very idea of "file stamp is in the future" because of time zone differences needs to be reexamined, or the very mechanism of relying on file time stamp differences has to be looked at. At least one of the nodes I have described above should sync, because its time is not in the future but in the past from what the r/w node thinks it should be. Otherwise, the net effect would be equivalent to not being able to do an FTP transfer to a site which has a "newer" version of the file, and it might look like you are making some mistake. And even if you do, at least you should be prompted with a message "do you really want to do this" or something of that kind. As I understand this issue, the time stamp seems to be utterly irrelevant, because the hash of the file CONTENTS is used. In this case, no matter whether some file has its time stamp in the future or in the past, that seems to be not as relevant as the file contents. One of the main rules of logic, as I understand it at the moment, is that ALL the nodes have to have EXACTLY the same contents as the master (r/w) nodes, regardless of anything. Otherwise there is a database consistency issue.
Yes, if there are no r/w nodes currently on line, and files on both ends have different contents and different time stamps, then the assumption is made that the fresher file "wins". But then, if there is a clock difference between the nodes, it needs to be accounted for in order to find out which version is newer. It does not seem to make much sense to prevent the syncing of the entire collection because of such things, at least during the INITIAL sync, which should be allowed unconditionally, no matter what, at least if there is an r/w node present on-line, it seems. I just hope this issue gets resolved, and pretty soon, because I see these errors every day, morning to night, and on several different shares. And I am not sure how many people had to eventually give up and leave the share, which is considered something very undesirable in our situation. This does not make any sense to me. There seems to be something fundamentally wrong somewhere, unless we are simply dealing with some plain bug.
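The offset-compensation idea in the post above can be sketched as a toy model (all names are hypothetical; this is not BTSync's actual protocol, and a real clock-offset estimate, as in NTP, would also account for network latency):

```python
MAX_TIME_DIFF = 600  # seconds: the "time is off by more than 600 seconds" threshold


def measure_offset(peer_now: float, local_now: float) -> float:
    """Estimate the peer's clock offset from 'current time' samples
    exchanged at handshake time (network latency ignored for simplicity)."""
    return peer_now - local_now


def newer_version_wins(local_mtime: float, peer_mtime: float, offset: float) -> str:
    """Compare file timestamps after subtracting the measured offset,
    instead of refusing to sync the whole share when the raw difference
    exceeds MAX_TIME_DIFF."""
    adjusted_peer = peer_mtime - offset
    return "peer" if adjusted_peer > local_mtime else "local"
```

With this rule, a peer whose clock is an hour ahead no longer appears to have every file "in the future"; only files modified inside the residual uncertainty window would remain ambiguous.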
  7. I am not even beginning to contemplate using the API in the current context. Actually, if I decided to develop something, I'd consider the open source projects, like Hive2Hive, for example, and, considering that it is written in Java, it is portable across a number of platforms. http://hive2hive.org/ Actually, in the context of the FSF community making a BTSync clone a priority item for Open Source development, which means that BT has no more than 3 to 6 months "to stay ahead of the pack" until others catch up with them, it seems a little strange that there are no public statements forthcoming from BT. There are already projects and/or products that can do essentially the same thing as BTSync, regardless of who "looks better" and whose version is more stable or closer to the door. But the clock is ticking...
  8. If I am not mistaken, you set sync_trash_ttl to 0 or to some huge number of days in the worst case.
  9. I see a logical conflict with the idea of a .SyncInclude file. The current approach uses "positive" logic. The default is "include all". From that set you can exclude whatever you want, which is a logically consistent operation. If you are proposing .SyncInclude, that is "negative" logic (depending on how you look at it). In that case, what would happen if some file was in both the .SyncIgnore and the .SyncInclude file? How would you resolve that logic? And mistakes like these happen all the time. So, it seems you either completely do away with .SyncIgnore if you propose .SyncInclude, and simply deal with the opposite logic, or the very semantic meaning of .SyncIgnore becomes unclear. Why do you need to ignore anything if your default logic is to ignore EVERYTHING?
  10. I see. Thanks. You are pretty helpful... I'd say not typical of many support forums for other s/w products. Is it also used in the case when the key (hash for the share) is changed? And, if so, can I have 2 copies of it: one for the old key, so I can continue and/or update some things on the old key, and the 2nd one for the new key? So, even though the key has changed, I can switch back and forth between the versions, assuming I also change the key in the GUI?
  11. This is certainly something I'd like to know for sure. On which node did you set max_time_diff to 0? Only on the node that would refuse to sync? Or on the master (r/w) node as well? - No point in barking up the wrong tree.
  12. BTSync File Transfer Behavior Table
This table was updated in the original HTML version.
  13. I like this idea, just like it is currently done in µTorrent, for example.
  14. I assume you are talking about both the r/o and r/w nodes. Correct me if I am wrong. In that case, can I assume that those nodes (r/o and r/w) are still going to propagate it to other nodes unless it is added to their .SyncIgnore? And, once we are at it, do you have any objections to the idea of using a .SyncIgnore.Master file instead of automatically propagating the .SyncIgnore, which will step on the local copies of it, at least on r/o nodes? It seems that this at least addresses your concern about O/S dependencies and does not automatically override the individual preferences on the r/o nodes. Secondly, if it is automatically updated on the r/w nodes as well, then the original r/w node that created this share, which I call the "reference" r/w node, might come under a mild attack by other rogue r/w nodes if those, for example, exclude ALL the subfolders and files from propagation by simply modifying their .SyncIgnore file, in which case ALL the nodes would stop being updated without even knowing it. Or, the other r/w nodes might simply make a mistake or get the wrong idea in their heads about the original intent for this collection by the original reference r/w node, and he, being the author, might not like this because it may damage the very intent behind the collection, which they may or may not know. I am more interested in general public applications of BTSync, and not in well-controlled, predefined private networks or well-cooperating nodes in some virtual network. And in those situations, all kinds of things may happen, and, even if you change the key to this collection in order to exclude some other r/w node that has gone "rogue", it seems to me that the collection with the original key is still accessible, or isn't it?
Secondly, in general public applications, how would you inform all the nodes that the key to this collection has changed, especially considering that some of them may not come back on line for days, or maybe even weeks or months, and when they do come back, how do they know that the key was changed? It seems to me that when you change the key, the only thing that changes is the access to the collection, but not the collection itself, and the databases on all nodes remain exactly the same as they were before the key change. I'd like to understand more fully what happens if you change the key. In particular, is it true that the collection is still accessible via the old key, except it is not going to be updated, and the only nodes that are going to get the updates are the nodes that have the NEW key?
Adding an additional message mechanism, or extending the size of the Device name field, in order to provide a more informative message to the nodes
In this context, in some other posts a while back I proposed adding some mechanism for the nodes to display an additional message, where they, for example, may post some problems they are having, or post their requests for collection extension by the master or reference nodes, or where the r/w nodes could inform all other nodes of some changes to the collection, or give recommendations to those nodes they notice having some problem syncing. For example, I have a few nodes that keep displaying the error message about their clock being off by more than 10 minutes. And they are sitting on a share for days, but their nodes are not updated for some unknown reason. Also, if I recall correctly, I saw at least one node that simply would not start syncing, and I have no idea why, and, basically, I have no way of asking him or recommending something to be done. Right now, the Device name field is too short to display a meaningful message.
So, I am kind of concerned with this issue, because I would not like to see some people simply giving up on this share just because they could not even start syncing, and leaving without me even knowing what the cause of their problem was. I hope this issue is going to be looked at, and to me it seems like a priority item and not some "luxury" feature. Plus, the fact that this mechanism could be implemented within hours, if not minutes, at least in its most basic form, leaves me with hope that you guys will indeed look at this issue - the sooner, the better.
  15. Well, but, just as I said, I was not GOING to... And I am certainly not going to ask this one: But why do you NEED it in the first place?
  16. Well, I can verify that there seems to be something funky going on with the "device time difference too great" message. But it is too funky to describe in one sentence or less.
  17. Well, that is what I was afraid of. My original reply, which you are following here, did have a more extensive analysis of the logic, and the conclusion was that it is not going to be easy to make the logic work. Let me just mention a few things, even though I have not seen your design spec (if you have one to begin with). I can see that everything should work just fine when nodes are on-line: the "reference" r/w node, as I call it, and the second, "evil" r/w node, which only comes on line once in a while and whose database does not contain all the modifications, deletions, or file exclusions that happened while it was off-line. When nodes are on line, you are pretty much guaranteed that the events will be received. But if the events are tied directly to the O/S level and are generated as a result of O/S events on the reference node, that means to me that the events are "one time only". So, if the "evil" r/w node comes back a week later, it won't get the event, and, therefore, its database would be inconsistent with the reference node. So, if the reference node did a folder rename while the evil node was off line, then, when it comes back on line, it is pretty simple for it to get the renamed folder with the NEW name. But it has no clue about what happened to the old folder and cannot process it properly, probably because the old folder was marked as "dead" on the reference node; except I do not quite see yet the whole logic behind the r/w node updates. So, that is why I was thinking that, in order for the state to be appropriately updated across the sessions, the event list has to be stored in the database. In that case, the 2nd node would be guaranteed to receive the event. But then we have an even more complicated problem: are the events stored on a per-client basis? That is, how do we know which events were received and processed by every client that might have come and gone?
In a general public application, you might have thousands of clients coming and going, never to come back again. So, even if you decided to store the event lists in the database to survive across the sessions for each client individually, which looks like a royal mess, then you have yet another problem. If the reference node gets damaged for whatever reason and decides to delete and re-add the folder, then it is going to be reindexed with what you have on the disk, and all the event memory would be gone. In that case, with multiple r/w nodes, you have a total database inconsistency. But if you do NOT store the events in the database, the very fact that you use a concept of events would imply that events might be lost and mod actions do not survive across client disconnects. But what blew my mind is that, after I wiped out the contents of the folder and reloaded it with the source files, while that "evil" node was on line at the time, I could not believe my eyes when I got back from it all the files that were deleted on the reference node, even though the time stamps on the files were certainly fresher than the evil node's. From what I see, since the reference node was reindexed afresh, it no longer had any idea of the previous version of the database, and its database did not know anything about the files that were renamed or deleted. But on the "evil" node, those files were still there, and, even though its file time stamps were ancient, it still fed the reference node with files that should no longer exist on the share, as they were deleted from the reference node. The only solution was to add those files and folders that were renamed or deleted on the reference node to its .SyncIgnore. Well, what can I say but to wish you guys good luck in trying to resolve this logic. But, no matter how you cut it, you guys did a great job so far on BTSync, and I applaud you for what you have been able to get accomplished by now.
Yes, some things are extremely complex logic-wise, and that is, partially, a reason for me to keep going through all the myriads of logical conditions arising out of various user actions, connect/disconnect conditions, and so on. I just hope that it might stimulate you to create precise operation tables that cover every single condition logically conceivable, and to see if you can provide universal logic that would be consistent and reconcile with all other user actions and so on. Well, if I recall correctly, it states somewhere that BTSync is most likely to detect the O/S-level file modification events. But I had the impression that this only helps to resync the mods with all other nodes as soon as possible, and that it was not part of the logic. And now what I am hearing is that the consistency across the nodes does indeed rely upon events, and those events may or may not arrive for a variety of reasons. That means to me that the events should be just performance-level tricks that do not fundamentally affect the logic of database-consistency-level issues. That is why I proposed a while back, in one of my posts, to add more bits to the database about the states of a file, such as "does not exist on disk during the last sync attempt", "updatable", and "propagatable", which, in combination, give you much more precise control and deliver a more precise state of the system regardless of operation. From what I recall, currently you have only one or two bits, and "update" (downloadable) and "propagate" are essentially merged into one - "dead file" or "alive file" - and "dead" files, those that were either deleted or renamed, become permanently dead and their states can no longer be restored, at least as far as r/o nodes are concerned. The "bottom line" is that you cannot rely upon events in terms of database consistency across nodes and disconnects. There needs to be a different, more non-volatile mechanism of assuring consistency. I am really happy to hear that.
The problem is too complex, and its effects are much more severe than they might look at first glance. Because once the user loses confidence that ALL his files are in tip-top condition, then we really have quite a problem to resolve, and confidence is much harder to restore than to create initially. Well, that is a typical developer's problem. They are too concerned with releasing more and more code and "features", but the understanding of all the possible cases is put on the back-burner. But, in the case of BTSync, where the logic is so complex that it is not quite easy to even enumerate all possible or "impossible" cases, creating documentation may force and help the developers to clarify the issues in their minds instead of constantly adding new and new bells and whistles while the basic system is not quite "up to snuff", as it turns out in those cases that they might not have even considered when they were creating them. Well, any way you cut it, I wish you good luck, and I am doing all I can to make sure every single bit of logic is covered in all the possible permutations, and that is precisely the purpose of my detailed manual. Because the very idea behind BTSync is a dynamite idea, and it is urgently needed for all sorts of uses today. In my mind, there is probably nothing more important in the modern information world than p2p dynamic information distribution. Torrents are static, and that is their main disadvantage: the information cannot be extended or even modified, so it gets "stale". But syncing files on a dynamic basis gives you real-time updates, which is exactly what is urgently needed today. So, considering all the complexities, there is simply a dire need to clarify every single permutation of every possible condition. There should be no "magic" in ANY situation, from what I see. Otherwise, the forums get busy with all sorts of questions and issues that should not even arise if everything is described in detail in the operations manual.
Because it is just a royal waste of everyone's energy that could be used on productive things. That is about ALL I can do, unless I see your design doc with extensive and detailed coverage, so I don't have to sit here guessing and saying "wow, look at that one! What planet did it come from?".
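The extra per-file state bits proposed in the post above could be modeled roughly like this (a sketch of the proposal only, not BTSync's actual database schema; all names are invented):

```python
from enum import Flag, auto


class FileState(Flag):
    """Per-file state bits as proposed in the post, replacing a single
    dead/alive flag so that 'missing from disk', 'may be downloaded', and
    'may be uploaded' can be tracked independently."""
    NONE = 0
    ON_DISK = auto()       # seen on disk during the last sync pass
    UPDATABLE = auto()     # this node may download newer versions
    PROPAGATABLE = auto()  # this node may upload its version to peers


def can_receive(state: FileState) -> bool:
    return bool(state & FileState.UPDATABLE)


def can_serve(state: FileState) -> bool:
    # A file must actually exist locally before it can be propagated,
    # which would have stopped the "evil node resurrects deleted files" case.
    return bool(state & FileState.ON_DISK) and bool(state & FileState.PROPAGATABLE)
```

With independent bits, a deleted file can stay "known but not propagatable" instead of becoming permanently "dead" in the database.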
  18. Thanks for the info. I bet you are right. I vaguely recall that I deleted every single file from the main folder when I saw 100 megs of files being downloaded back to the reference master from one of the r/w nodes that had an older picture of the state of affairs. So, when I copied those files back in Windows Explorer from the original source, I probably did not copy the .SyncID file, not even realizing that it is a critical file to have around. I am not going to ask what the .SyncID file is for...