Linux/Unix symlink



Hi RomanZ,

These are my opinions, so this is only a point of view. It's up to BitTorrent to decide whether following symlinks is useful or not. From my point of view they are useful (my home directory on my Linux system is simply a symlink to a different filesystem, because the root disk was too small).


"people want symlinks to point outside of the sync folder (main usage scenario - accumulating all backup data in one folder)."

Yes, this is one of my needs.

- The rsync option --copy-unsafe-links could be a solution; maybe it's not necessary to implement the safe option as well.

This tells rsync to copy the referent of symbolic links that point outside the copied tree



1) Symlinks are ruining tree-shape of the directory structure. They can wrap directory structure into itself.

-- If a user uses symlinks and creates a subdirectory that recurses into itself (if I understand the statement "wrap directory structure into itself" correctly), I think this is not a problem for BTSync: users have to fix the problems they create, it's not a BTSync problem. Maybe BTSync could check when this happens and give a warning or refuse to sync the directory (some Linux commands give such a warning, maybe find or similar, I don't remember).
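A minimal sketch of the kind of check suggested above, assuming a POSIX shell. The function name link_wraps_tree is hypothetical and nothing here reflects how BTSync works internally. It flags a directory symlink whose resolved target is the link's own parent directory or one of its ancestors, which is the case where following the link recurses forever. (GNU find reports a "File system loop detected" warning for this situation when it is run with -L.)

```shell
# Hypothetical helper: succeed (exit 0) if symlink $1 points at its own
# parent directory or at any ancestor of that directory.
link_wraps_tree() (
    link=$1
    # Only directory symlinks can wrap the tree into itself.
    [ -L "$link" ] && [ -d "$link" ] || exit 1
    # Physical (symlink-free) path of the directory that holds the link.
    parent=$(cd "$(dirname "$link")" && pwd -P) || exit 1
    # Physical path of the directory the link resolves to.
    target=$(cd "$link" && pwd -P) || exit 1
    # Everything lives under /, so a link to / always wraps.
    [ "$target" = / ] && exit 0
    case "$parent/" in
        "$target"/*) exit 0 ;;   # target is the parent or an ancestor of it
        *)           exit 1 ;;
    esac
)
```

A caller could then warn or skip, for example: link_wraps_tree "$path" && echo "warning: $path loops back into its own tree".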


2) Symlinks can drive BTSync out of directory (and store in general much more information than user expect)

-- Again, if a user creates a symlink that points to an external path/storage/directory/filesystem or similar, from my point of view it's not a problem. I do this when I need it.


3) Symlinks can bring BTSync to some network location which contains more info than expected

-- I don't understand this one (my English is not perfect :( )


4) Symlinks can bring BTSync to a filesystem which is not compatible with current FS. 

-- This could be a problem, but I don't know the full BTSync implementation. Again it's a user decision, but if I understand correctly it could lead to some problems in BTSync. Permissions or some other type of problem? Maybe a warning or a refusal to sync could resolve it?


5) <probably some other peculiarities which i'm missing right now, but you also can guess>

Yes, I agree this could make implementation, testing and so on more complex, so again: it's useful for me, but this is only my point of view.


My 0.0002 cents :)!

  • 4 weeks later...

I have managed to hack together a script which manages this quite well. It should work on any Unix platform, but on Mac OS you'll need to compile the hlink and hunlink programs in order for it to work. If you are not on Mac OS, you'll need to edit the script to use "ln -d" to hard link the directories (note that GNU ln documents -d as likely to fail because of system restrictions, even for the superuser).


I am using this to manage the music I wish to sync to my phone, as I don't think the manual selection in the android app is very good.

#!/bin/bash
cd "/Volumes/Marceline/MusicSync/" || exit 1

# first unlink directories which have been removed from the list
for artist in *; do
    if [ -d "$artist" ]; then
        if ! grep -Fxq -- "$artist" "links"; then
            echo "Unlinking $artist"
            hunlink "$artist"
        fi
    fi
done

# link any new ones
while IFS= read -r artist; do
    if [ ! -d "$artist" ]; then
        echo "Linking $artist"
        hlink "../Music/$artist" "$artist"
    fi
done < links

It needs a file, in this case named links, containing a list of the directories you want linked up, one per line. Example:

Bright Eyes
Bryan John Appleby
Conor Oberst & The Mystic Valley Band
Death Cab For Cutie
Edward Sharpe & The Magnetic Zeros
Feist & Ben Gibbard

It seems to work pretty well for me but obviously I have no guarantees it'll work for you. 



This addresses a request of mine: I would like to have only a subset of all my MP3s on my mobile.

  • 4 months later...

Agreed! At the end of the day, sync isn't like DropBox or OneDrive (SkyDrive), etc that limit you to only syncing one central folder - in these instances, I can understand why SymLinks would be very useful to allow you to sync the content of other folders too!


But given that BitTorrent Sync allows you to select and sync ANY folder, surely the need for using SymLinks with Sync is far less than if you're using DropBox/OneDrive, etc, as you can add all the folders you wish to Sync themselves directly?!


I suspect that's probably why - although better handling SymLinks is in the development "queue" - it's currently not considered the "highest priority" for the team.


In my situation I have a fileserver with a single hidden "master" folder with all my media in it. I then have a custom setup to map these files into different categorised folders in a public folder elsewhere on the disk.


For example

+ .media (hidden)
----+ music
        + artist 1
        + artist 2
        + artist 3
+ media
----+ playlists
--------+ rock
             + artist 1
             + artist 3
--------+ other
             + artist 1 song 1
             + artist 2

It allows me to reference the same file in multiple different locations without having to physically copy the file (I have a much more complicated structure than the one shown above, which allows me to group alphabetically, by genre, etc.).


So my ideal scenario would be that I sync a 'playlist' folder with my phone and it syncs the files at the end of the symlinks, so my phone always has the most up-to-date copy of the folder.


I'm by no means an expert with Unix systems, so I wouldn't be able to comment on specific workarounds or functionality, but that's how I (as a layman) would expect it to work.


(sorry for such a late reply, follow topic was turned off by default)


@scruffyfox & @all,


There are several peculiarities around going over symlinks as over folders. Let me make a list.

1) Symlinks are ruining tree-shape of the directory structure. They can wrap directory structure into itself.


2) Symlinks can drive BTSync out of directory (and store in general much more information than user expect)


How do you expect (or find useful) for BTSync to deal with such issues?


OK, I'll take a shot at explaining. Details, details.





To begin with, we may wish to process symlinks in their own separate 2nd pass, after the existing main part of the sync task has completed normally. That way, if symlinks aren't globally enabled in the settings, there is no performance penalty or run-time change whatsoever, and the existing program code stays unchanged, which avoids introducing new bugs into the existing (good, and working) software.


Another early thing you can do (while still releasing betas) is to include this feature only in the Advanced Settings tab, always with red text underneath it saying that the symlink feature is currently experimental. This allows some users to begin using it and to help with testing even if you are not 100% done. Obviously, from what you are saying, there may be many possible error cases when symlinks point to unsupported filesystems or network shares, and you simply cannot fully test and predict the resulting behaviour. Those are not clearly defined situations.


Anyway. Let us assume that in the btsync program (all platforms) we have, coming from the main() side, some recursion mechanism that examines each new or modified file and then decides what btsync is supposed to do with it. Symlink handling can eventually be part of that main sync task. For the initial beta version, as suggested above for stability, we might instead put all the new symlink work in a separate "after-task", processed as its own full recursion loop once everything else has been synced for that one sync folder.


To save doing everything twice, the first main() loop could note the existence of each symlink by adding it to a new, separate list or array to be processed at the end (this uses more memory). Note that at this stage no symlinks have been followed whatsoever, so no circular-symlink situation exists yet to worry about. That is the top-down angle, and it already gives us some ideas. Of course, it is something else entirely to actually go into the existing source code and work out which of the existing loops and functions need to be modified or added to. All of that is the main structural component supporting the new symlinking task, drilling from the top downwards.
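The two-pass idea can be sketched in a POSIX shell as follows. This is only an illustration of the control flow, not btsync's code: the function names collect_symlinks and process_symlinks and the use of a list file are assumptions. find does not follow symlinks by default, so the first pass records each link without descending through it. The sketch assumes readlink is available and that no file name contains a newline.

```shell
# Hypothetical first pass: record every symlink under sync folder $1 in
# list file $2. find does not follow links by default, so nothing behind
# a link is visited here.
collect_symlinks() {
    find "$1" -type l > "$2"
}

# Hypothetical second pass: walk the recorded list after the main sync.
# Here each link is only reported together with its stored target text;
# a real implementation would decide how to sync it at this point.
process_symlinks() {
    while IFS= read -r link; do
        printf '%s -> %s\n' "$link" "$(readlink "$link")"
    done < "$1"
}
```

Keeping the list outside the sync folder avoids the list file being picked up by the sync itself.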



The 2nd component comes from the reverse side. This is a bottom-up perspective.


Let us imagine we must write one small function that operates on a single symlink. The function takes as its argument one symlink path that we know resides inside the sync folder. Its purpose is to expand (or glob) and resolve all of the paths and files represented behind that symlink.


This is where each symlink must be processed and resolved according to some dynamic criteria. Either the symlink is a full path (static), or it is relative (dynamic). If the symlink is dynamic and its path depth is not sufficient to break out of the enclosing sync folder (for example path/to/symlink --> ../../dir), then we should probably not alter the symlink in any way, as it is self-contained. So instead just copy the symlink file verbatim ^1. Or, since the file it points to is already synced, make a local copy of that file's blob / hash id on the target machine ^2.


If the path resolves outside the enclosing sync folder, then we could just do the same thing anyway, and treat any resulting lack of synced files entirely as user error ^3, as is done in many other situations on POSIX systems.


If the symlink is an absolute path, same again. Does it point to a file that has already been synced? If yes, then we may translate the symlink path to another absolute path on the target system ^4. Then either the target file / blob / hash has already been transferred and it is made as a hard copy, or the symlink is altered to point to the correct file ^5.


Or (for full-path symlinks) we could instead just literally resolve every full-path symlink to its target file, always ^6. That is also useful in some situations on POSIX-compliant operating systems.
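The cases above (a link stored as a relative path versus an absolute path, resolving inside versus outside the sync folder) can be told apart with a small function. This is a sketch under assumptions: the name classify_link and its output labels are made up for illustration, readlink is assumed to be available, the sync folder is assumed not to be /, and only the first hop of a link chain is classified.

```shell
# Hypothetical helper: classify symlink $2 relative to sync folder $1.
# Prints <relative|absolute>-<inside|outside|dangling>.
classify_link() (
    root=$(cd "$1" && pwd -P) || exit 1
    link=$2
    raw=$(readlink "$link") || exit 1     # the stored link text, unresolved
    case $raw in
        /*) kind=absolute; full=$raw ;;
        *)  kind=relative; full=$(dirname "$link")/$raw ;;
    esac
    # A link whose target does not exist cannot be placed anywhere.
    [ -e "$link" ] || { echo "$kind-dangling"; exit 0; }
    # Physical location of the target (or of its directory, for a file).
    if [ -d "$full" ]; then where=$full; else where=$(dirname "$full"); fi
    dir=$(cd "$where" && pwd -P) || exit 1
    case "$dir/" in
        "$root"/*) echo "$kind-inside" ;;
        *)         echo "$kind-outside" ;;
    esac
)
```

With this split, relative-inside corresponds to the self-contained case ^1, relative-outside to ^3, and the two absolute results to the cases discussed under ^4 to ^6.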


So either we are clever about it and dynamically determine which symlinks to copy literally with an if-already-synced algorithm ^1-6, or we always copy all symlinks literally as symlinks (no interpretation whatsoever) ^7, or we always copy all symlinks as files (no interpretation whatsoever) ^8.


So, if you want global symlinking options, I would suggest presenting the user with no more than 3 possible options: 1) "always copy symlinks as symlinks", 2) "automatically resolve symlinks as best as possible", and 3) "always copy symlinks as files".


Option 2) is a combination of ^1 + ^3 + ^6 above (or, if you want to be really fancy, ^4 instead of ^6). However, for a Windows sync target, option 2) can't be translated onto its native NTFS filesystem safely, so the sync must fall back to 3) instead.
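As a rough local illustration of the three options, here is a sketch using plain cp in a POSIX shell. The function name copy_link and the mode names are hypothetical, and a real sync would transfer data between machines rather than call cp. The auto mode follows the ^1 + ^3 + ^6 combination: links stored as relative paths are copied verbatim, links stored as absolute paths are replaced by the data they point to.

```shell
# Hypothetical helper: copy_link MODE SRC DST
#   as-symlink  option 1): copy the link itself
#   auto        option 2): relative links verbatim, absolute links as data
#   as-file     option 3): copy whatever the link points to
copy_link() (
    mode=$1 src=$2 dst=$3
    case $mode in
        as-symlink) cp -R -P "$src" "$dst" ;;     # -P: never follow links
        as-file)    cp -R -L "$src" "$dst" ;;     # -L: always follow links
        auto)
            case $(readlink "$src") in
                /*) cp -R -L "$src" "$dst" ;;     # absolute: resolve (^6)
                *)  cp -R -P "$src" "$dst" ;;     # relative: verbatim (^1, ^3)
            esac ;;
        *) echo "unknown mode: $mode" >&2; exit 2 ;;
    esac
)
```

A relative link copied verbatim may dangle on the target if the file it refers to was not synced, which is the "user error" case ^3.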


If you want to give the user finer-grained control than the global application setting, then I would suggest presenting the same 3 options again, but on a per-sync-folder basis. Anything more than that may be too complex for users to understand.



3rd, we have to consider how to ensure that the system is not confused by the differing files and does not sync back and overwrite the symlinks with regular files coming from the sync target. A variety of schemes may be employed, depending entirely on the mechanism(s) already used by the existing implementation. For example, the last sync time stamp might be used to determine staleness. Or the blob / hash may need to be overlaid with a new mask that does not collide (to mark it as a symlink). Or there could be a separate symlink metadata attribute, since the regular matching algorithm would no longer work for those files. It is hard to know what sort of work would be involved there, but this is also a very important part of the implementation.
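One way to picture the "mask that does not collide" idea is to tag each fingerprint with the kind of object it came from. The sketch below is only an illustration: the function name fingerprint and the tag strings are assumptions, and cksum (a CRC, not a cryptographic hash) merely stands in for whatever blob hash the real implementation uses.

```shell
# Hypothetical helper: print a type-tagged fingerprint for path $1.
# A symlink is fingerprinted by its stored target text, a regular file by
# its content; the tag keeps the two from ever comparing equal.
fingerprint() {
    if [ -L "$1" ]; then
        printf 'symlink:%s\n' "$(readlink "$1" | cksum)"
    elif [ -f "$1" ]; then
        printf 'file:%s\n' "$(cksum < "$1")"
    fi
}
```

With this scheme, a symlink and a regular file whose content happens to equal the link's target text produce the same checksum but different fingerprints, so a matcher comparing fingerprints would not mistake one for the other.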



4th, we have to consider how to resolve circular symlinks when hard-copying files. The situation only occurs for user settings 2) and 3): how do we resolve the list of files contained in the path of a symlinked folder that contains other symlinked folders within it? Again there are a variety of solutions, depending on how you wish to deal with the matter. However, if the pre-existing file syncing mechanism is well written and is blob / hash based, then at least no additional files should need to be transferred across the network, because a circular reference just contains the same files that were already synced. The best solution is probably to detect any cycle immediately and not copy anything past depth 1. Then there are no duplicate files whatsoever. If the target machine is POSIX, then at the first repetition inside the cycle you may just copy the literal symlink file; otherwise copy nothing at all. However, I recommend placing a consistent error file instead: something like a small template text file that says "This file could not be copied due to a circular dependency". If you wish, you may include supplementary information in the text file so that the user can manually trace or reconstruct where the path would have pointed to (that original folder is guaranteed to exist by the end of the same task). Again, you only need to put that at the first repeated depth (depth = 1) of the cycle. It would then only ever replace a symlink itself.
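The placeholder approach can be sketched as a recursive copy in a POSIX shell. This is an assumption-laden illustration, not the btsync algorithm: the function name copy_resolved is hypothetical, the copy is local, dotfiles are skipped by the * glob, and only true cycles (a directory reached again while it is still being copied) are detected, not directories that are merely reachable twice.

```shell
# Hypothetical helper: copy_resolved SRC DST SEEN
# Copies directory SRC to DST with all symlinks resolved. SEEN is a
# newline-separated list of physical directories on the current path;
# pass "" at the top level.
copy_resolved() (
    src=$1 dst=$2 seen=$3
    here=$(cd "$src" && pwd -P) || exit 1
    # Already copying this directory further up: write the error file in
    # place of the symlink that led back here, and stop descending.
    if printf '%s\n' "$seen" | grep -Fxq -- "$here"; then
        printf '%s\n%s\n' \
            'This file could not be copied due to a circular dependency.' \
            "It would have pointed to: $here" > "$dst"
        exit 0
    fi
    mkdir -p "$dst"
    for entry in "$src"/*; do
        [ -e "$entry" ] || continue     # dangling link or empty directory
        name=${entry##*/}
        if [ -d "$entry" ]; then
            copy_resolved "$entry" "$dst/$name" "$seen
$here"
        else
            cp "$entry" "$dst/$name"    # non-recursive cp follows links
        fi
    done
)
```

Because a real directory can only reappear on the path through a symlink, the error file only ever takes the place of a symlink, as described above.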


So that is a feature spec document. Assign relevant text labels to these different kinds / categories of features and problems to overcome. The group marked Nrd = {1st, 2nd, 3rd, 4th}. The group marked ^N = { ^1, ^2…. ^8 }. And the group marked N) = { 1), 2), 3) }.


All of that together would definitely be 1 major feature. But after some early investigation of the code, it can first be broken down into several smaller chunks. No harm in that. It would probably make for some interesting technical challenges.



This topic is now archived and is closed to further replies.