Metacache & new user experiences.



I've been trying (for 2 days) to sync a large and unwieldy folder set between two Win8/Win7 computers on the same network.

There are about 400k files, in only 60GB of space. It's old directories full of subversion remnants, massive data export dumps in single files, things like that.

It has been impossible with BTsync. Fair enough, it was a tough (unusual) task.

-> metacache! NOT a great idea.

Basically, the entire problem was exacerbated by the need to have hundreds of thousands of files in your metacache structure. Rather than sharding them as you have done, it might be a better idea to merge them into file blocks. This would cut out much of the resource-intensive MFT / filesystem work needed to sync anything large.

I'm not sure, but it seems like you maintain a separate cache item for each item we need to sync. That's doubling the effort required.
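The "merge them into file blocks" idea can be sketched roughly as follows. This is only an illustration, not how Sync's metacache actually works: the names `pack_records` and `read_record`, the file names, and the payloads are all hypothetical. Many small per-file records are appended into one blob, and a small index remembers where each record starts and how long it is, so the filesystem sees one object instead of hundreds of thousands.

```python
import io


def pack_records(records):
    """Pack {name: bytes} into one blob plus an index of (offset, length).

    Hypothetical helper: one block replaces many tiny cache files.
    """
    blob = io.BytesIO()
    index = {}
    for name, payload in records.items():
        # Remember where this record starts and how many bytes it spans.
        index[name] = (blob.tell(), len(payload))
        blob.write(payload)
    return blob.getvalue(), index


def read_record(blob, index, name):
    """Fetch a single record back out of the packed block."""
    offset, length = index[name]
    return blob[offset:offset + length]


# Made-up example records standing in for per-file cache entries.
records = {"a.txt": b"hash-a", "b.txt": b"hash-bb"}
blob, index = pack_records(records)
```

Here the block lives in memory for brevity; written to disk, it would be a single file plus its index rather than one file per synced item.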

Just my two cents, I think the overall idea is great - can't wait to get to use it.

Cheers,

Frank.

Edit: Sorry I have no debug logs for you, most things ended in a forced close - I noticed these things also:

-Some files were missing, but still trying to sync (cache out of sync?)

-Massive swings in memory use: over 2GB at certain times, then back down to 500MB

-Out of memory exception at one point, didn't catch it.


I'm totally with you on this point! Why the metacache information can't be stored in a database (e.g. SQLite) instead of hundreds of thousands of tiny individual files on your HDD, I have no idea! Surely a database storage solution would be far more efficient?
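As a minimal sketch of the database idea, assuming Python's built-in `sqlite3` module: the table name `metacache`, its columns, and the helper functions below are hypothetical and are not taken from Sync itself. Each synced file becomes one row in a single database file rather than one tiny file on disk.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a single-file cache database. Schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metacache ("
        " rel_path TEXT PRIMARY KEY,"
        " size INTEGER NOT NULL,"
        " mtime REAL NOT NULL,"
        " content_hash TEXT NOT NULL)"
    )
    return conn


def upsert_entries(conn, entries):
    """Insert or update many (rel_path, size, mtime, content_hash) rows at once."""
    with conn:  # one transaction for the whole batch
        conn.executemany(
            "INSERT OR REPLACE INTO metacache VALUES (?, ?, ?, ?)", entries
        )


def lookup(conn, rel_path):
    """Return (size, mtime, content_hash) for a path, or None if unknown."""
    return conn.execute(
        "SELECT size, mtime, content_hash FROM metacache WHERE rel_path = ?",
        (rel_path,),
    ).fetchone()


conn = open_cache()
upsert_entries(conn, [("a/b.txt", 10, 1.0, "abc"), ("c.txt", 5, 2.0, "def")])
```

Batching the writes in one transaction is the main design point here: it avoids a separate filesystem operation per cached item.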


Sync can currently support up to 1M files. However, that requires significant memory. We understand how we could improve this, but we are not there yet.

Memory usage for 400k files might be in the range of 1GB, which is a lot. We are working on improving it.
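To put that figure in perspective, here is a back-of-the-envelope calculation. It assumes "1GB" means 1 GiB and that memory grows linearly with file count; neither assumption is confirmed anywhere in this thread, and the function names are made up for illustration.

```python
GIB = 1024 ** 3  # assumption: "1GB" is read as 1 GiB


def bytes_per_file(total_bytes, file_count):
    """Average memory attributed to each tracked file."""
    return total_bytes / file_count


def projected_gib(per_file_bytes, file_count):
    """Projected total memory in GiB, assuming linear scaling (an assumption)."""
    return per_file_bytes * file_count / GIB


# 1 GiB spread over 400k files is a bit under 2.7 KB per file.
per_file = bytes_per_file(1 * GIB, 400_000)

# Under the linear-scaling assumption, 1M files would need about 2.5 GiB.
at_one_million = projected_gib(per_file, 1_000_000)
```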

Please also note that there is virtual memory and real memory. Memory tools usually show you virtual memory.


Thanks for the reply - glad to hear you are working on it:

Update: I removed most of the content, down to around 50k files, and it worked just fine (after I manually purged the cache).

FYI: I don't think it can actually support 1M files in the real world. A lot went wrong (memory-wise) around the 300k mark, and each time it froze things got worse (as the cache became out of sync in various ways). I have 8GB of RAM, and memory usage was way past 1GB of vmem, up to 2GB at times. Either way: page faults galore, disk thrashing, environment fail etc. Stopping the transfers by switching off the other client helped up to a point. The sync never succeeded (the most it ever transferred was 2GB), and that was after 2 days and nights of trying.

As @kos13 mentioned:

You absolutely need to use one of the many open source document stores (lighter, able to handle larger sets, faster, and more and more reliable) or RDBMS storage methods for this. Many are lightweight, tried and tested, and will speed things up immensely. No point reinventing that wheel.

Best of luck, it's a great concept - thanks for your time

Cheers,

Frank.


  • 2 months later...

I do agree that the current solution is not scalable for a huge number of files; I was having problems with that (high disk usage and trouble with the Windows Index Service). However, using a purely file system based store is the most compatible way to write any storage system. I would say that porting to Android is a priority, and leaving everything as a file system based store is the best option for now.

Nevertheless, I do hope we get a faster way to index in the future.


This old thread had been dormant for over 2 months (until your post), and the information in it is now outdated!

Sync hasn't used the old "metacache" system for a while - "metacache" is NOT the "current solution" - it is no longer "file based" and indexing IS now "scalable" and much faster for a "huge number of files"... plus there IS an Android app!!

Please try to avoid digging up obsolete threads that are no longer relevant, and find links to the latest versions of Sync (including the Android app) in this thread.


This topic is now closed to further replies.