Breeding zombies


rdebath


Whenever you do replication you need to store zombie records so that unpredictable network delays can't resurrect files that have been physically deleted. (These are sometimes called ghosts or tombstones ... talk about theme naming!)

BTSync replicates these around the swarm by actively pushing them to new peers, so they're almost impossible to kill. (hmmmm...)

This is a reasonable way of working except for the fact that these zombies are processed very inefficiently.

Suppose you have several peers, and peer one is storing one file per second into a directory. BTSync is used to transfer these files to the other peers. Once a file has arrived, each of the other peers processes it and creates an "I've seen it" flag file. This flag file gets replicated back to the first peer; when it sees enough "seen it" flags, the first peer deletes the file and its flags. Every file, of course, has a unique name. That quickly mounts up to hundreds of thousands, perhaps millions, of deleted files, each leaving a zombie behind.
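To put a rough number on that, here is a back-of-envelope sketch. It assumes every one of the other peers writes exactly one uniquely named flag file per data file, and that each deleted file (data or flag) leaves one zombie record; the function name and the four-peer example are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def zombies_per_day(peers, files_per_second=1):
    """Estimate zombie records created per day in the workflow above.

    Assumption: each data file is matched by one flag file from every
    other peer, and every deleted file leaves exactly one zombie record.
    """
    records_per_file = 1 + (peers - 1)  # the data file plus its flags
    return files_per_second * SECONDS_PER_DAY * records_per_file

print(zombies_per_day(peers=4))
```

Under those assumptions a four-peer swarm creates several hundred thousand zombies a day, so "millions" is only a few days away.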

Within a few days BTSync dies; before that it uses 100% CPU continuously.

This is because the zombies are stored in the main peers.dat file, where they bloat it and slow down all processing. As they only need to be consulted when a (potential) new file appears, IMO they should be saved in a separate, indexed zombie.dat file, not in the main database.
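A minimal sketch of what I mean, with the zombies held in their own keyed table that is only touched when a peer offers a file we don't have. This is not how BTSync is implemented; the class, the method names and the timestamp comparison are all my own assumptions, and it ignores updates to files that are still live.

```python
class ShareIndex:
    """Hypothetical model of one peer's view of a share."""

    def __init__(self):
        self.live = {}     # path -> content hash of files currently present
        self.zombies = {}  # path -> deletion time, kept apart from live data

    def add_local(self, path, content_hash):
        self.live[path] = content_hash

    def delete_local(self, path, now):
        # Move the entry out of the live table so normal processing
        # never has to walk over deleted files.
        if self.live.pop(path, None) is not None:
            self.zombies[path] = now

    def should_accept(self, path, remote_mtime):
        """Decide whether a file announced by a peer should be fetched."""
        if path in self.live:
            return False  # already have it (updates ignored in this sketch)
        deleted_at = self.zombies.get(path)  # the only zombie lookup
        if deleted_at is not None and remote_mtime <= deleted_at:
            return False  # stale copy of something already deleted
        return True
```

The point of the split is that day-to-day work only scans `live`, and its size tracks the files actually on disk, however many zombies have piled up.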

It would also be nice if there were a way to manually clear the zombie list short of shutting down and resetting all peers in the swarm at the same time.

NB: There are other ways of working that don't need explicit zombie records, but they all have this differentiation between current data and historical data that's being stored on the off chance that an old peer will reconnect.

In all cases the historical data has no other use, so should not be in the main database, but must be kept until either all the peers that have ever connected have reported in or the last peers are given up for dead.
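That retention rule can be sketched as below. The function and parameter names are hypothetical, and "given up for dead" is modelled as a simple timeout on when each peer was last seen, which is an assumption on my part.

```python
def can_purge(acked_by, known_peers, last_seen, now, dead_after):
    """Return True when a historical record no longer needs to be kept.

    acked_by    -- set of peers that have reported in since the delete
    known_peers -- every peer that has ever connected to the share
    last_seen   -- peer -> time it was last heard from
    dead_after  -- silence (same units as 'now') after which a peer is
                   given up for dead
    """
    for peer in known_peers:
        if peer in acked_by:
            continue  # this peer already knows about the delete
        if now - last_seen[peer] >= dead_after:
            continue  # silent for too long, given up for dead
        return False  # still alive and not yet told: keep the record
    return True
```

For example, with peers a, b and c where only a has reported in and b was seen recently, the record has to stay; once b reports in and c has been silent longer than `dead_after`, it can go.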

BTW: how long do zombie records stay in the current BTSync database?


I was thinking about this in the car: can we please have two timers?

One for deletes done by this node and one for deletes done by some other node.

The idea is that there are two cases when a delete might get reversed.

  1. The file is still currently being copied round the network so the IHAVEs haven't stopped rippling around.
  2. A formerly connected node rejoins the network with an old image of the share.

In case one it's good for everyone to know that the file isn't wanted, so we need the zombie records to hang around everywhere until this stage has finished. Once it's over, most of the zombie records are not needed any more; we just need one 'on the off chance' that somebody comes back. For the second case, the logical peer to hold the long-term zombies is the one that believed it was important to delete the files in the first place.

The trade-off should be that if a peer leaves the network then its old deletes won't get given to any returning nodes; that's two unlikely events together. IMO that's a reasonable trade-off for a fast clean-up.
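Roughly, the two timers would work like this. The names and both timeout values are placeholders I've picked for illustration, not anything BTSync has today.

```python
# Assumed values: the short timer only has to outlast the IHAVEs rippling
# round the network; the long one covers an old peer rejoining.
REMOTE_DELETE_TTL = 60 * 60            # one hour
LOCAL_DELETE_TTL = 30 * 24 * 60 * 60   # thirty days

def zombie_expired(deleted_at, deleted_by, self_id, now):
    """Return True if this node may drop the zombie record."""
    if deleted_by == self_id:
        ttl = LOCAL_DELETE_TTL   # we did the delete, so we hold it long term
    else:
        ttl = REMOTE_DELETE_TTL  # someone else's delete, forget it quickly
    return now - deleted_at >= ttl
```

So two hours after node A deletes a file, node B can already forget the zombie while node A keeps it until its own long timer runs out.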

PS: Yes it is, Urist. I liked the name myself :)

