binarybrian

[Solved] File Names With Korean Characters Are Skipped As Invalid Utf8


In BitTorrent Sync versions 1.4.75 and 1.4.83, files whose names contain Korean characters do not sync. The log files on both machines indicate that the files are skipped due to an "invalid UTF8 file name".

This problem did not exist in version 1.3.109, and it seems to be unique to Korean characters. I've tried both Chinese and Japanese, and files with those characters in their names synchronize properly.

 

System Information:

Gentoo Linux

Kernel Version 3.14.14

File System: ext4

 

cat sync.log |grep -i version

version: 1.4.83

 

The relevant section of the sync.log file is shown below.

[20140918 22:35:30.401] SF[9811]: UpdatePeersStat
[20140918 22:35:30.401] SF[9811] [6F79]: up:0 down:0
[20140918 22:35:30.401] ScheduledTask:UpdatePeersStat invoked:timer reason:FinishStateSync
[20140918 22:35:35.989] SyncFolderNotify: "감사합니다.txt", event = "IN_CREATE"
[20140918 22:35:35.989] [OnNotifyFileChange] "/mnt/teragon/K-Sync/test/감사합니다.txt", source = "NULL"
[20140918 22:35:35.989] SyncFolderNotify: "감사합니다.txt", event = "IN_ATTRIB"
[20140918 22:35:35.989] [OnNotifyAttrChange] "/mnt/teragon/K-Sync/test/감사합니다.txt", source = "NULL"
[20140918 22:35:35.989] [OnNotifyFileChange] "/mnt/teragon/K-Sync/test/감사합니다.txt", source = "NULL"
[20140918 22:35:35.989] SyncFolderNotify: "감사합니다.txt", event = "IN_CLOSE_WRITE"
[20140918 22:35:35.989] [OnNotifyFileChange] "/mnt/teragon/K-Sync/test/감사합니다.txt", source = "NULL"
[20140918 22:35:36.647] SyncFolderScanner: Processing watch item "/mnt/teragon/K-Sync/test/감사합니다.txt", now = 1411101336, time = 0
[20140918 22:35:36.648] SyncFolderScanner: Posting update event for file "/mnt/teragon/K-Sync/test/감사합니다.txt"
[20140918 22:35:36.648] SyncFolderScanner: Skiping file '/mnt/teragon/K-Sync/test/감사합니다.txt' with invalid UTF8 file name
[20140918 22:35:36.648] FC[9811]: attr changed - rocessing folder /mnt/teragon/K-Sync/test/감사합니다.txt 1411101335
[20140918 22:35:36.648] [OnNotifyFileChange] "/mnt/teragon/K-Sync/test/감사합니다.txt", source = "NULL"
[20140918 22:35:37.648] SyncFolderScanner: Processing watch item "/mnt/teragon/K-Sync/test/감사합니다.txt", now = 1411101337, time = 1411101327
[20140918 22:35:37.648] SyncFolderScanner: Posting update event for file "/mnt/teragon/K-Sync/test/감사합니다.txt"
[20140918 22:35:37.648] SyncFolderScanner: Skiping file '/mnt/teragon/K-Sync/test/감사합니다.txt' with invalid UTF8 file name
[20140918 22:35:42.146] SyncFolderNotify: "감사합니다.txt", event = "IN_CLOSE_WRITE"
[20140918 22:35:42.146] [OnNotifyFileChange] "/mnt/teragon/K-Sync/test/감사합니다.txt", source = "NULL"
[20140918 22:35:42.648] SyncFolderScanner: Processing watch item "/mnt/teragon/K-Sync/test/감사합니다.txt", now = 1411101342, time = 1411101333
[20140918 22:35:42.649] FC[9811]: file changed - processing file "/mnt/teragon/K-Sync/test/감사합니다.txt"
[20140918 22:35:43.649] SyncFolderScanner: Processing watch item "/mnt/teragon/K-Sync/test/감사합니다.txt", now = 1411101343, time = 1411101333
[20140918 22:35:43.649] SyncFolderScanner: Posting update event for file "/mnt/teragon/K-Sync/test/감사합니다.txt"
[20140918 22:35:43.649] SyncFolderScanner: Skiping file '/mnt/teragon/K-Sync/test/감사합니다.txt' with invalid UTF8 file name
[20140918 22:35:48.024] SyncFolderNotify: "감사합니다.txt", event = "IN_MODIFY"
[20140918 22:35:48.024] [OnNotifyFileChange] "/mnt/teragon/K-Sync/test/감사합니다.txt", source = "NULL"
[20140918 22:35:48.024] SyncFolderNotify: "감사합니다.txt", event = "IN_MODIFY"
[20140918 22:35:48.024] [OnNotifyFileChange] "/mnt/teragon/K-Sync/test/감사합니다.txt", source = "NULL"
[20140918 22:35:48.024] SyncFolderNotify: "감사합니다.txt", event = "IN_CLOSE_WRITE"
[20140918 22:35:48.024] [OnNotifyFileChange] "/mnt/teragon/K-Sync/test/감사합니다.txt", source = "NULL"
[20140918 22:35:48.649] SyncFolderScanner: Processing watch item "/mnt/teragon/K-Sync/test/감사합니다.txt", now = 1411101347, time = 1411101341
[20140918 22:35:48.650] FC[9811]: file changed - processing file "/mnt/teragon/K-Sync/test/감사합니다.txt"
[20140918 22:35:49.650] SyncFolderScanner: Processing watch item "/mnt/teragon/K-Sync/test/감사합니다.txt", now = 1411101348, time = 1411101341
[20140918 22:35:50.650] SyncFolderScanner: Processing watch item "/mnt/teragon/K-Sync/test/감사합니다.txt", now = 1411101349, time = 1411101341
[20140918 22:35:51.650] SyncFolderScanner: Processing watch item "/mnt/teragon/K-Sync/test/감사합니다.txt", now = 1411101351, time = 1411101341
[20140918 22:35:51.650] SyncFolderScanner: Posting update event for file "/mnt/teragon/K-Sync/test/감사합니다.txt"
[20140918 22:35:51.650] SyncFolderScanner: Skiping file '/mnt/teragon/K-Sync/test/감사합니다.txt' with invalid UTF8 file name
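For anyone who wants to pull the affected paths out of a longer sync.log, here is a minimal Python sketch. It is not part of BitTorrent Sync; the function names `skipped_files` and `unique_skipped_paths` are made up for illustration, and it assumes the message wording is exactly what appears in the excerpt above (including the "Skiping" spelling). It searches the whole text rather than going line by line, so it also works if the log entries have been pasted together on one line.

```python
import re

# Pattern for the scanner message quoted in the log excerpt. The spelling
# "Skiping" is how the message appears in sync.log, so it is matched verbatim.
SKIP_RE = re.compile(
    r"\[(\d{8} \d{2}:\d{2}:\d{2}\.\d{3})\] SyncFolderScanner: "
    r"Skiping file '(.+?)' with invalid UTF8 file name"
)


def skipped_files(log_text):
    """Return (timestamp, path) pairs for every skipped-file message."""
    return [(m.group(1), m.group(2)) for m in SKIP_RE.finditer(log_text)]


def unique_skipped_paths(log_text):
    """Return the distinct skipped paths in first-seen order."""
    seen = []
    for _, path in skipped_files(log_text):
        if path not in seen:
            seen.append(path)
    return seen


if __name__ == "__main__":
    # Two entries copied from the log excerpt in this post.
    sample = (
        "[20140918 22:35:36.648] SyncFolderScanner: Skiping file "
        "'/mnt/teragon/K-Sync/test/감사합니다.txt' with invalid UTF8 file name\n"
        "[20140918 22:35:37.648] SyncFolderScanner: Skiping file "
        "'/mnt/teragon/K-Sync/test/감사합니다.txt' with invalid UTF8 file name\n"
    )
    for path in unique_skipped_paths(sample):
        print(path)
```

To run it against a real log, read the file (for example with `open("sync.log", encoding="utf-8", errors="replace").read()`) and pass the text to `unique_skipped_paths`.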

Steps to reproduce:

1. From the web UI on two different machines choose a folder to synchronize.

2. Go to https://translate.google.com/?hl=en#en/ko/Thank%20You to generate some Korean characters.

3. Open a terminal and cd to the shared folder on Linux machine Alice.

4. Type "touch 감사합니다.txt"

5. Type "touch hello.txt"

6. Wait 5 minutes and observe on Linux machine Bob that the 'hello.txt' file has arrived but that the file with Korean characters has not been received.

7.  Open the sync.log on Alice and look for the line "Skiping file '감사합니다.txt' with invalid UTF8 file name".

8.  Bonus observation: 'Skiping' is spelled wrong in the log file.
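The two `touch` commands from the steps above can also be scripted. The following is a rough Python sketch under a few assumptions: the helper name `create_test_files` is hypothetical, and the demo writes into a temporary directory as a stand-in for the real shared folder. It writes the names as explicit UTF-8 bytes so the result does not depend on the shell's locale, then lists the folder as raw bytes so the on-disk names can be inspected.

```python
import os
import tempfile

KOREAN_NAME = "감사합니다.txt"  # file name from step 4
ASCII_NAME = "hello.txt"        # file name from step 5


def create_test_files(folder):
    """Create both empty test files and return the folder's entries as bytes."""
    folder_b = os.fsencode(folder)
    for name in (KOREAN_NAME, ASCII_NAME):
        # Use a bytes path so the name on disk is exactly the UTF-8 encoding.
        path = os.path.join(folder_b, name.encode("utf-8"))
        with open(path, "ab"):
            pass  # opening in append mode creates the file if it is missing
    return sorted(os.listdir(folder_b))


if __name__ == "__main__":
    # A temporary directory stands in for the shared folder (an assumption).
    with tempfile.TemporaryDirectory() as shared_folder:
        for raw_name in create_test_files(shared_folder):
            # Show the raw bytes next to the decoded name.
            print(repr(raw_name), "->", raw_name.decode("utf-8"))
```

If the decode in the last line succeeds for the Korean name, the name on disk is well-formed UTF-8, which is what the expected result below relies on.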

 

Expected result:  Files with Korean characters are not skipped.


Hi all,

 

Could you please run the following command in a console and send me the output?

ls | iconv --from utf8

I suspect that you might have incorrectly encoded filenames in your folder. Please send the output to syncapp@bittorrent.com and mention this topic and my name in the message subject.
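If `iconv` is not handy, roughly the same check can be done in Python. This is only a sketch of the idea, not an official tool: the names `is_valid_utf8` and `invalid_utf8_names` are hypothetical. It reads the directory entries as raw bytes and reports the ones that do not decode as strict UTF-8. The EUC-KR example in the demo is just an illustration of a legacy Korean encoding that fails the check; nothing in this thread establishes that it is the cause of the problem.

```python
import os


def is_valid_utf8(raw_name):
    """Return True if the raw bytes decode as strict UTF-8."""
    try:
        raw_name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def invalid_utf8_names(folder="."):
    """Return entries of `folder` (as bytes) whose names are not valid UTF-8."""
    raw_names = os.listdir(os.fsencode(folder))
    return sorted(n for n in raw_names if not is_valid_utf8(n))


if __name__ == "__main__":
    # The same name in UTF-8 and in EUC-KR (illustrative assumption only).
    name = "감사합니다.txt"
    print("UTF-8 bytes valid: ", is_valid_utf8(name.encode("utf-8")))
    print("EUC-KR bytes valid:", is_valid_utf8(name.encode("euc-kr")))

    # Report any badly encoded names in the current directory.
    for raw_name in invalid_utf8_names("."):
        print("not valid UTF-8:", repr(raw_name))
```

An empty report from `invalid_utf8_names` means every name in the folder is well-formed UTF-8, which corresponds to `ls | iconv --from utf8` finishing without an error.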

 

Thanks!


Hi, I have exactly the same problem on Ubuntu 14.04.

 

BTSync Version 1.4.83 (1.4.83)

 

On Mac OS X and Windows there is no issue with Korean (Hangul); it only occurs on Ubuntu, with Korean file and folder names.

 

Please fix it for Korean Linux users! I'm very annoyed by it.

 

I don't understand why there is no problem with Japanese and Chinese, only with Korean.


akntk@umi:~/.btsync$ 
akntk@umi:~/.btsync$ cat sync.log |grep -i version
version: 1.4.83
[20141001 04:29:03.672] Loading config file version 1.4.83
akntk@umi:~/.btsync$ ls | iconv --from utf8
18D4AD97B00AD6FEE292B8474E45DD3D7B3191E1.db
18D4AD97B00AD6FEE292B8474E45DD3D7B3191E1.db-shm
18D4AD97B00AD6FEE292B8474E45DD3D7B3191E1.db-wal
btsync-gui.log
history.dat
history.dat.old
settings.dat
settings.dat.old
sync.dat
sync.dat.old
sync.lng
sync.log
sync.log.bak
webui.zip
akntk@umi:~/.btsync$ 

uname -a
Linux umi 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux


@RomanZ Thanks for the confirmation. I downgraded all of my machines (Mac, Linux, and Windows) to 1.3.109. The older version has shown no problems so far.


@RomanZ

FYI, with version 1.4.83 the Korean issue occurs not only on Linux, but also on Windows and Mac. Thanks for your time and effort.


After upgrading to 1.4.91 on my x64 Ubuntu Linux box, I get the same error messages, and none of my files containing Hungarian UTF-8 characters is synced to other devices/computers. Reverting to 1.4.82 fixes the problem.


@all

This is a known issue. It affects all files with non-ASCII characters in the filename, on Linux computers only. Please expect fix in next build (sorry - no ETA).
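Until the fix arrives, it may help to know which files in a synced folder fall into the affected group. Here is a small Python sketch (Python 3.7 or later) that lists every file and folder name containing non-ASCII characters. The function name `non_ascii_paths` is hypothetical, and the folder passed to it in the demo is just the current directory, used as a placeholder for the real sync folder.

```python
import os


def non_ascii_paths(root):
    """Return paths under `root` whose file or folder name is not pure ASCII."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # str.isascii() is False as soon as one character is outside ASCII.
            if not name.isascii():
                found.append(os.path.join(dirpath, name))
    return sorted(found)


if __name__ == "__main__":
    # "." is a placeholder; point this at the synced folder instead.
    for path in non_ascii_paths("."):
        print(path)
```

Names printed by this script are candidates for being skipped on the affected versions; plain ASCII names such as hello.txt are not listed.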


Confirmed the same problem with Cyrillic:

[20141030 14:21:52.291] SyncFolderScanner: Skiping file '/home/bomfunk/Docs/bomsync/русский тест.txt' with invalid UTF8 file name

Using only Linux x64 systems to sync files with Russian names. The version is the latest available - 1.4.93.

I didn't experience this problem before the last few updates; until then the application had seemed pretty reliable.

 

Hoping that the fix will come soon.


@all

As has now been mentioned numerous times, this is a known issue affecting files with non-ASCII characters in their filenames on Linux computers only.

 

As RomanZ indicated just yesterday, "Please expect fix in next build (sorry - no ETA)"
