Simply put, I can't download files hosted on my web server if they have special characters in the filename, because I get a 404. [Image: screenshot of the 404 error page]

If I create a file called olá.txt I can't seem to find the correct URL to download it. I've tried every form of the URL I can think of:

mydomain.com/olá.txt 
mydomain.com/ol%C3%A1.txt

and I always get a 404 from Apache Tomcat 7.0.3, but if I change the file name to ola.txt everything is fine.

I've added AddDefaultCharset utf-8 to the httpd.conf but I still have the issue.

I mean, it should be possible to download files with names containing non-ASCII characters, right?

Update: My server.xml has:

<Connector URIEncoding="UTF-8" compressableMimeType="text/javascript,text/css" 
     compression="on" compressionMinSize="2048" connectionTimeout="20000"
     noCompressionUserAgents="gozilla, travista" port="8080"
     protocol="HTTP/1.1" redirectPort="8443"/>

Update:

echo -n olá | od -An -tx1 =  6f 6c c3 a1
echo $LANG = en_US.UTF-8

locale:

LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE=en_US.UTF-8
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
out_sid3r

2 Answers


You might need to add this to the <Connector ... /> tag in your server.xml for Tomcat:

URIEncoding="UTF-8"
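
For context, Tomcat 7 decodes request URIs as ISO-8859-1 unless told otherwise, which is why this attribute matters. Here is a minimal sketch in plain Java of how the same escaped path decodes under each charset. The class and method names are made up for illustration, and `URLDecoder` only approximates what the connector does internally:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Illustration only: UriDecodingDemo and decode() are hypothetical names,
// and URLDecoder stands in for Tomcat's own URI decoding.
final class UriDecodingDemo {

    // Decode a percent-encoded path using the named charset.
    static String decode(String escaped, String charsetName) {
        try {
            return URLDecoder.decode(escaped, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String escaped = "ol%C3%A1.txt";
        // Bytes C3 A1 read as UTF-8 form one character (a with acute accent).
        System.out.println(decode(escaped, "UTF-8"));
        // The same bytes read as ISO-8859-1 form two characters (U+00C3, U+00A1),
        // so the server would look for a file name that does not exist.
        System.out.println(decode(escaped, "ISO-8859-1"));
    }
}
```

If the second form is what the server ends up looking for, a 404 is the expected result.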

More info:

How to get UTF-8 working in Java webapps?

utf-8 url problem


I've been having trouble reproducing this on my end. I did a clean install of Tomcat 7.0.26 on Ubuntu 12.04.4 LTS, created /var/lib/tomcat7/webapps/ROOT/testé.txt, and successfully served that file to my browser at the URL http://localhost:8080/testé.txt.

This is my connector tag in /etc/tomcat7/server.xml:

<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           URIEncoding="UTF-8"
           redirectPort="8443" />

I can't say why yours isn't working at this time, but I can at least confirm that serving files with UTF-8 encoded names with tomcat7 is possible.
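
In case it helps to compare requests, here is a small sketch of the escaped form a client would normally send for such a name. It uses only `java.net.URI`; the class name `EscapedUrlDemo` and the helper `escapedUrl` are hypothetical, and the host and port are simply the ones from my test setup:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustration only: EscapedUrlDemo and escapedUrl() are hypothetical names.
final class EscapedUrlDemo {

    // Build an http URL and return it with non-ASCII characters
    // percent-encoded as UTF-8 bytes.
    static String escapedUrl(String host, int port, String path) {
        try {
            URI uri = new URI("http", null, host, port, path, null, null);
            return uri.toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid URL parts", e);
        }
    }

    public static void main(String[] args) {
        // "\u00e9" is e with acute accent; it is escaped as %C3%A9.
        System.out.println(escapedUrl("localhost", 8080, "/test\u00e9.txt"));
    }
}
```

A request for that escaped path only matches the file if the server decodes the path as UTF-8 and the name on disk is stored as UTF-8 too.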

awiseman
  • (question updated) I think it's encoding, I mean at least I have it on the server.xml and the 404 error image gives the utf8 encoded name, right? – out_sid3r Mar 28 '14 at 13:29
  • edited answer to respond to the new info in your question. hopefully this is just a simple typo fix for ya. – awiseman Mar 28 '14 at 13:34
  • Sorry, but the XML had the correct tag; it was my mistake when pasting it here – out_sid3r Mar 28 '14 at 15:53
  • Sorry I haven't been able to help you yet. What Linux distro are you running on? What do you get from `echo $LANG`? What tool (browser or otherwise) are you using to make the HTTP request? – awiseman Mar 28 '14 at 22:53
  • the xml code I updated my question with is inside a , should it be a direct child of the xml element declaration? – out_sid3r Mar 31 '14 at 17:19

The problem might have nothing to do with Tomcat or URL encoding, and it could actually be a problem with the encoding of the FTP connection (or whatever you are using to send the files to the remote host).

If the client and server encodings differ, you'd send a file you see as "testé", and asking for it back from the same source would of course return "testé". But on the file system the file name might be encoded differently (even if LANG is properly set).

Try creating the file from Tomcat and requesting it in both UTF-8 and URL-encoded forms. If it works, then try looking at the file name from your FTP client.
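
A rough sketch of that check, assuming you can run a small Java class as the same user and with the same environment as Tomcat (ideally from inside the webapp itself). `FileNameBytes` and `hex` are names I made up. `sun.jnu.encoding` is a JDK-internal property that, as far as I know, reflects the charset the JVM uses for file names, so treat its value as a hint only:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustration only: FileNameBytes and hex() are hypothetical names.
final class FileNameBytes {

    // Hex dump of a name's UTF-8 bytes, as the JVM sees the name.
    static String hex(String name) {
        StringBuilder sb = new StringBuilder();
        for (byte b : name.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Directory to inspect; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));

        // Create a test file from inside the JVM ("\u00e9" is e with acute accent).
        Path created = dir.resolve("test\u00e9.txt");
        if (!Files.exists(created)) {
            Files.createFile(created);
        }

        // List every entry together with the bytes of its name.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                System.out.println(name + " -> " + hex(name));
            }
        }
    }
}
```

If the file uploaded by the client and the file created here show different bytes for what looks like the same name, the mismatch is in how the name was written to disk rather than in the URL handling.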

LSerni
  • I've actually tested by creating the files through "vi": vi testé.txt and I can't download it but if I do: mv testé.txt teste.txt it works...do you think the issue might be the same? – out_sid3r Apr 04 '14 at 12:35
  • Such a test might be inconclusive. You have to create it with Tomcat. Even a simple execution of `touch /full/path/to/testé.txt`, if done by the Tomcat process, could tell us whether this is the case. **I was able to reproduce the same symptoms on my system** but I had to use PuTTY on Windows 7 on the client and a Linux OpenSuSE 13.1 on the server. – LSerni Apr 04 '14 at 14:03
  • Tomcat seems to be running as the root user, at least when I do: ps auxwww | grep -v grep | grep tomcat it gives root 2321 ... Tomcat Servlet Container – out_sid3r Apr 05 '14 at 18:22
  • Okay. But can you try and create the file *from the webapp*? That would be a conclusive check, by verifying where the file is created and how it appears named. – LSerni Apr 05 '14 at 20:12