Download files with non-english names

yoavs
Posts: 8
Joined: Tue Feb 14, 2006 10:00 pm

Download files with non-english names

Post by yoavs »

Hi,
1. I'm trying to download files with non-English, right-to-left names (e.g. انتشارات.htm). The file was previously uploaded correctly with the Core FTP client.
The download fails as follows:

TYPE A
200 Type set to A.
PASV
227 Entering Passive Mode (127,0,0,1,7,86).
RETR انتشارات.htm
Connect socket #12876 to 127.0.0.1, port 1878...
550 انتشارات.htm: The filename, directory name, or volume label syntax is incorrect.
انتشارات.htm - 0 bytes transferred
Transfer time: 00:00:01

2. Other file names that were already on the FTP server, and were not uploaded by the Core FTP client, are not even displayed correctly. Bear in mind that my task is to handle files that reach the FTP server from various sources.

Yoavs

Post by yoavs »

CP, thanks for the quick reply!

I've selected View -> Encoding -> Unicode, but it made no difference.
I'm using Windows XP SP2 with the appropriate regional settings (right-to-left support).

I tried the upload with both Hebrew and Arabic file names from the Core FTP client, with success. The download fails.

I think you can reproduce it by uploading files with names like:
انتشارات.htm - in Arabic; or
קובץ.html - in Hebrew

and see for yourself.
Regards,
Yoavs

Post by yoavs »

Hi,
Here is the listing for the Arabic files:

---------- 1 owner group 17088 Dec 12 2005 ????????.htm
d--------- 1 owner group 0 Feb 17 12:46 ????????_files
-rwxrwxrwx 1 owner group 17088 Feb 17 23:06 انتشارات.htm
-rwxrwxrwx 1 owner group 0 Dec 13 2005 EOF.TXT

The first two entries were not uploaded by Core FTP (but their names are Arabic too).
The third file was uploaded by Core FTP. After the successful upload I cannot download it again.
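
One possible reading of the "????????" entries, sketched in Python: if a name passes through a single-byte code page that lacks its letters, each letter is replaced by "?". This is an assumption about what happens on the server side, not something confirmed in this thread; cp1256 (Arabic) and cp1252 (Western) are stand-ins for whatever code page is really in use, and as_seen_through is a made-up helper name.

```python
# Illustration only: the code pages below are assumptions, not the
# server's confirmed configuration.
def as_seen_through(name: str, codepage: str) -> str:
    """Round-trip a file name through one single-byte code page."""
    return name.encode(codepage, errors="replace").decode(codepage)

arabic = "انتشارات.htm"
hebrew = "קובץ.html"

print(as_seen_through(arabic, "cp1256"))  # Arabic letters survive an Arabic code page
print(as_seen_through(hebrew, "cp1256"))  # Hebrew letters are replaced by '?'
print(as_seen_through(arabic, "cp1252"))  # Arabic letters are replaced by '?'
```

The last call yields eight question marks followed by ".htm", the same shape as the first entry in the listing.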

Regards Yoavs

how to unicode the FTP

Post by yoavs »

Well, it's a problem...
After some discussions with Microsoft (whose FTP server we use), it turns out that there is support for only one additional Unicode language (the one you set in the Unicode section of the regional settings).
So if you have to handle file names in more than one language, you have to think of another solution.
The best solution we found is to "hex" the file/dir names before download and to "deHex" them after download. This way the language of the file/dir name no longer matters. The only pitfall is that hex is a two-digit representation, so only names of up to 128 chars can be transformed, while Windows allows up to 255 chars.
I hope this helps those who have the same problem.
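
A minimal Python sketch of the "hex"/"deHex" idea, assuming UTF-8 as the byte encoding (the post does not say which encoding was actually used). The names hex_name, dehex_name and fits_limit are made up for illustration and are not part of any FTP client.

```python
MAX_NAME_LENGTH = 255  # Windows name limit mentioned in the post

def hex_name(name: str) -> str:
    """Encode a file or directory name as plain ASCII hex digits."""
    # Assumption: UTF-8 is used to turn the name into bytes.
    return name.encode("utf-8").hex()

def dehex_name(hexed: str) -> str:
    """Restore the original name from its hex form."""
    return bytes.fromhex(hexed).decode("utf-8")

def fits_limit(name: str) -> bool:
    """True if the hexed form of the name still fits in MAX_NAME_LENGTH."""
    return len(hex_name(name)) <= MAX_NAME_LENGTH

original = "קובץ.html"
hexed = hex_name(original)
print(hexed)                          # ASCII-only, safe for any code page
print(dehex_name(hexed) == original)  # the round trip restores the name
```

Every byte becomes two hex digits, which is where the length pitfall comes from. Under the UTF-8 assumption each Arabic or Hebrew letter takes two bytes, so it costs four hex digits, and fits_limit can be used to check a name before renaming it.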

Note: with other FTP servers there may be other solutions, such as using a different default code page on the server (e.g. IBM).

Yoavs