
I'm using Windows 10 and curl 7.52.1. When I POST data to a web service, curl does not encode the characters as UTF-8 (I need to display pt-BR characters such as àáçÇãõ).

Yes, I have already checked this, without success.

If I set the code page with chcp 65001, the error persists. Changing to chcp 1252 solved the problem only partially.

For example, if I run echo Administração >> test.txt without changing the code page, the file contains Administra‡Æo.

After changing to chcp 65001 I get Administração.

After changing to chcp 1252 I finally get Administração.
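
The Administra‡Æo result is consistent with the same bytes being read under two different code pages. Here is a minimal sketch in JavaScript (Node.js), assuming the console wrote the file in OEM code page 850 and the file was then viewed as Windows-1252. The two lookup tables are hand-written for illustration and cover only the two bytes involved:

```javascript
// Hypothetical, hand-written excerpts of two code page tables.
// Only the two bytes relevant to "ção" are included.
const cp850 = { 0x87: "ç", 0xc6: "ã" };  // OEM code page (assumed console default)
const cp1252 = { 0x87: "‡", 0xc6: "Æ" }; // Windows-1252 (ANSI)

// Decode a byte array with a lookup table, falling back to ASCII.
function decodeWith(table, bytes) {
  return bytes.map((b) => table[b] ?? String.fromCharCode(b)).join("");
}

// "Administração" as the console would write it under code page 850.
const bytes = [...Buffer.from("Administra", "ascii"), 0x87, 0xc6, 0x6f];

console.log(decodeWith(cp850, bytes));  // Administração
console.log(decodeWith(cp1252, bytes)); // Administra‡Æo
```

The bytes on disk never change; only the table used to interpret them does, which is why switching code pages changes what is shown.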

But with curl, nothing changes.

I've tried setting a Content-Type header, with no luck:

curl -X POST -H "Content-Type: text/plain; charset=UTF-8" --data-ascii "name=Administração" http://localhost:8084/ws/departments

I get the following output:

{"holder":{"entities":[{"name":"Administra��o","dateReg":"Dec 29, 2016 2:05:33 PM"}],"sm":{}},"message":{"text":""},"status":-1}
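
The two replacement characters (��) are what a UTF-8 decoder typically emits for bytes that are not valid UTF-8. Here is a small Node.js sketch, assuming curl forwarded the console's single-byte encoding (Latin-1/Windows-1252 bytes 0xE7 and 0xE3 for ç and ã) while the server decoded the body as UTF-8:

```javascript
// Bytes for "name=Administração" if ç and ã are sent as single
// Latin-1 bytes (an assumption about what the console handed to curl).
const latin1Body = Buffer.from("name=Administração", "latin1");

// Decoding those bytes as UTF-8 replaces each invalid byte with U+FFFD.
const asUtf8 = new TextDecoder("utf-8").decode(latin1Body);
console.log(asUtf8); // name=Administra��o

// The same text encoded as UTF-8 uses two bytes per accented letter.
const utf8Body = new TextEncoder().encode("name=Administração");
console.log(latin1Body.length, utf8Body.length); // 18 20
```

The Content-Type header only declares an encoding. It does not make curl convert the bytes it was given.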

I have also checked that the WS accepts the character encoding. When I run this (in jQuery):

$.ajax({
     url:"http://localhost:8084/ws/departments",
     type:"POST",
     data: {name: "Administração"},
     success: function(data, textStatus, xhr){
       console.log(data);
     }
});

I get the expected output:

{"holder":{"entities":[{"name":"Administração","dateReg":"Dec 29, 2016 2:03:17 PM"}],"sm":{}},"message":{"text":""},"status":-1}
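
One likely reason the jQuery call works: when `data` is a plain object, jQuery serializes it as `application/x-www-form-urlencoded` and percent-encodes non-ASCII characters as UTF-8 bytes, so the request body is pure ASCII. The sketch below reproduces that encoding with the standard `URLSearchParams` API (not jQuery itself). Reusing the resulting string with curl's `--data` option is a suggestion under that assumption, not something tested against this web service:

```javascript
// Build the form body the way a URL-encoded form submission would.
const body = new URLSearchParams({ name: "Administração" }).toString();
console.log(body); // name=Administra%C3%A7%C3%A3o

// Because the body contains only ASCII characters, the console code page
// no longer matters when it is typed on a command line, for example:
//   curl -X POST --data "name=Administra%C3%A7%C3%A3o" http://localhost:8084/ws/departments
// (curl sends --data as application/x-www-form-urlencoded by default.
// Inside a .bat file each % would need to be written as %%.)
```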

I don't know what else I can try to solve this. Could you please help me?

Thanks in advance.

UPDATE

As suggested by @Dekel, I also tried sending an external file with --data-binary (the content of test.txt is name=Administração):

curl -i -X POST -H "Content-Type: text/plain; charset=UTF-8" --data-binary "@test.txt" http://localhost:8084/ws/departments

I still get this unusual output:

{"holder":{"entities":[{"name":"Administra��o","dateReg":"Dec 29, 2016 2:41:27 PM"}],"sm":{}},"message":{"text":""},"status":-1}

UPDATE 2

@Phylogenesis suggested using charset=ISO-8859-1. I noticed that even though the response shows Administração, a closer check on the server side confirms that the WS is receiving the exact letter, in this case ç.

Andrew Ribeiro
  • Use binary data (and make sure the file's encoding is correct) http://stackoverflow.com/questions/6408904/send-post-request-with-data-specified-in-file-via-curl – Dekel Dec 29 '16 at 16:15
  • What happens if you use `charset=ISO-8859-1`? – Phylogenesis Dec 29 '16 at 16:21
  • @Dekel, echoing to a file is just an example, in fact I would like to POST data as a html form (x-www-form-urlencoded). – Andrew Ribeiro Dec 29 '16 at 16:33
  • I didn't mean to echo to a file. Use the file as the input (not as output). This way you have full control on the encoding (and you are not tied to the encoding of your console). – Dekel Dec 29 '16 at 16:34
  • @Phylogenesis, running `curl -X POST -H "Content-Type: text/plain; charset=ISO-8859-1" --data-ascii "name=Administração" http://localhost:8084/ws/departments` I get: `{"holder":{"entities":[{"name":"Administração","dateReg":"Dec 29, 2016 2:34:19 PM"}],"sm":{}},"message":{"text":""},"status":-1}`. The character changed but still not showing the exact word, in this case `ç`. – Andrew Ribeiro Dec 29 '16 at 16:36
  • @Dekel, I will try it and give a feedback. – Andrew Ribeiro Dec 29 '16 at 16:37
  • @AndrewRibeiro how did you save the test.txt file? What is the encoding of its content? – Dekel Dec 29 '16 at 16:57
  • @Dekel, just create using standard windows file creation: mouse right click > new text file > type name=Administração > save file > close. – Andrew Ribeiro Dec 29 '16 at 17:05
  • And how do you know that the content of the file is encoded in UTF8? :) Open the file in a *real* text editor (you can use notepad++ for example), just don't use windows' notepad, and check the file's encoding. – Dekel Dec 29 '16 at 17:06
  • btw, you can use https://httpbin.org/ for your tests. POST anything you want to `https://httpbin.org/post` and you will receive the exact data you just posted. – Dekel Dec 29 '16 at 17:07
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/131826/discussion-between-andrew-ribeiro-and-dekel). – Andrew Ribeiro Dec 29 '16 at 17:14

1 Answer


After a discussion with @Dekel and a suggestion from @Phylogenesis, I was able to solve the problem, partially but effectively. There are two ways:

  • declaring charset=ISO-8859-1 in the Content-Type header
  • saving the data in a UTF-8 encoded file and sending it with --data-binary

With charset=ISO-8859-1 the server receives the correct letter, even though the response data from the server is still displayed incorrectly in the console.

I use: curl -i -X POST -H "Content-Type: text/plain; charset=ISO-8859-1" --data-ascii "name=Administração" http://localhost:8084/ws/departments
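
Why this first option helps can be sketched in Node.js, assuming the console (after chcp 1252) hands curl one byte per accented letter and the server answers in UTF-8. Both are assumptions about this setup, not something confirmed above:

```javascript
// Assumed request bytes for "ção" when typed in a chcp 1252 console.
const requestBytes = Buffer.from([0xe7, 0xe3, 0x6f]);

// Read as ISO-8859-1, as the header declares: one byte per character.
const received = requestBytes.toString("latin1");
console.log(received); // ção

// If the server replies in UTF-8, a console that reads the reply as a
// single-byte code page shows two characters per accented letter.
const shown = Buffer.from("ç", "utf8").toString("latin1");
console.log(shown); // Ã§
```

In this reading the header matches the bytes actually sent, so the server stores the right text, while the garbled response is only a display issue on the client side.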

The second way is to save all the content you want to POST in a UTF-8 encoded file. I used Notepad++ > Format > Convert to UTF-8 (Without BOM).

Then run: curl -i -X POST -H "Content-Type: text/plain; charset=UTF-8" --data-binary "@test.txt" http://localhost:8084/ws/departments
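
For anyone without Notepad++, the file can also be produced and checked from a script. This is a minimal Node.js sketch, not the method used above. The temporary directory path is a hypothetical location chosen for illustration:

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical location; any path works as long as curl reads the same file.
const file = path.join(os.tmpdir(), "test.txt");

// writeFileSync with "utf8" writes the string as UTF-8 and adds no BOM.
fs.writeFileSync(file, "name=Administração", "utf8");

// Read the raw bytes back to verify the encoding before posting.
const raw = fs.readFileSync(file);
const hasBom = raw[0] === 0xef && raw[1] === 0xbb && raw[2] === 0xbf;
console.log(hasBom); // false

// The last five bytes should be ç (c3 a7), ã (c3 a3) and o (6f).
console.log(raw.subarray(-5).toString("hex")); // c3a7c3a36f
```

Checking the bytes directly avoids guessing what encoding an editor used when saving the file.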

Andrew Ribeiro