6

I have a Python script that I call from PHP with the exec function. The script prints a string (a Unicode string, according to isinstance), which PHP then echoes. The problem is that if the string contains any special characters (such as the degree symbol), nothing is output. I assume I need to change something about the encoding, but I'm not sure what to do or why.

EDIT: To get an idea of how I am calling exec, please see the following code snippet:

$tables = shell_exec('/s/python-2.6.2/bin/python2.6 getWikitables.py '.$title);

Python properly outputs the string when I call getWikitables.py by itself.

EDIT: It definitely seems to be something either on the Python end, or in transmitting the results. When I run strlen on the returned values in PHP, I get 0. Can exec only accept a certain type of encoding?

cryptic_star

4 Answers

12

Try setting the LANG environment variable immediately before executing the Python script per http://php.net/shell-exec#85095:

shell_exec(sprintf(
  // No semicolon after the assignment: "VAR=value command" puts VAR in that command's environment
  'LANG=en_US.utf-8 /s/python-2.6.2/bin/python2.6 getWikitables.py %s',
  escapeshellarg($title)
));

(sprintf() is used here only to make the long command string a little easier to follow.)

You might also, or instead, need to do this before calling shell_exec(), per http://php.net/shell-exec#78279:

$locale = 'en_US.utf-8';
setlocale(LC_ALL, $locale);
putenv('LC_ALL='.$locale);
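To check whether these settings actually reach the Python process, the script can report what it sees. The following is only a diagnostic sketch: `describe_io_encoding` is a hypothetical helper name, not part of getWikitables.py, and the comment about Python 2 describes typical behavior rather than something verified on this setup.

```python
import locale
import os
import sys


def describe_io_encoding():
    """Collect the encoding-related settings this process actually sees."""
    return {
        # Typically None under Python 2 when stdout is a pipe and
        # PYTHONIOENCODING is not set; otherwise a codec name.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        # Encoding derived from the locale environment (LANG, LC_ALL, LC_CTYPE).
        "preferred_encoding": locale.getpreferredencoding(False),
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
    }


# Hypothetical usage inside the script: write the ASCII-safe repr to stderr.
# sys.stderr.write(repr(describe_io_encoding()) + "\n")
```

Since shell_exec() returns only stdout, appending `2>&1` to the command is one way to see this report (and any Python traceback) in the returned string.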
2

I had a similar issue and solved it as follows. I don't understand why it is necessary, since I thought everything was already being processed as UTF-8. Calling my Python script on the command line worked, but calling it with exec (shell_exec) via PHP and Apache did not.

According to a PHP forum entry, this is needed when you want to use escapeshellarg(), which can otherwise strip non-ASCII characters from the argument:

setlocale(LC_CTYPE, "en_US.UTF-8");

It needs to be called before escapeshellarg() is executed. Also, it was necessary to set a certain Python environment variable before the exec command (found an unrelated hint here):

putenv("PYTHONIOENCODING=utf-8");

My Python script evaluated the arguments like this:

sys.argv[1].decode("utf-8")

(Hint: That was required because I use a library to convert some Arabic texts.)
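The decoding step can be written so that it tolerates both byte-string and text arguments. A minimal sketch, where `arg_to_text` is a hypothetical helper name and UTF-8 is assumed to be the encoding PHP used when passing the argument:

```python
import sys


def arg_to_text(arg, encoding="utf-8"):
    """Return a command-line argument as text.

    Python 2 passes arguments as byte strings, so they must be decoded;
    Python 3 has already decoded them, so they are returned unchanged.
    """
    if isinstance(arg, bytes):
        return arg.decode(encoding)
    return arg


# Hypothetical usage inside the script:
# title = arg_to_text(sys.argv[1])
```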

So finally, I could imagine that the original question could be solved this way:

setlocale(LC_CTYPE, "en_US.UTF-8");
putenv("PYTHONIOENCODING=utf-8");
$tables = shell_exec('/s/python-2.6.2/bin/python2.6 getWikitables.py ' .
          escapeshellarg($title));

I cannot say anything about the return value, though. In my case I could output it to the browser directly without any problems.
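A possible explanation for the empty return value in the question, offered as an assumption rather than something confirmed in this thread: when stdout is a pipe, Python 2 does not pick an output encoding, so printing a Unicode string that contains non-ASCII characters raises UnicodeEncodeError, and shell_exec() does not capture the traceback written to stderr. If setting PYTHONIOENCODING from PHP is not an option, the script can encode its output explicitly. A minimal sketch, where `write_utf8` is a hypothetical helper name:

```python
import io
import sys


def write_utf8(text, stream=None):
    """Write text as UTF-8 bytes, independent of locale settings."""
    if stream is None:
        # Python 3 exposes the byte stream as sys.stdout.buffer;
        # Python 2's sys.stdout accepts byte strings directly.
        stream = getattr(sys.stdout, "buffer", sys.stdout)
    stream.write(text.encode("utf-8"))
    stream.write(b"\n")
    stream.flush()


# Example: the degree sign becomes the two bytes 0xC2 0xB0.
buf = io.BytesIO()
write_utf8(u"25\u00b0C", buf)
```

The PHP side then receives UTF-8 bytes regardless of which locale Apache was started with.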

Spent many, many hours to find that out... One of the situations when I hate my job ;-)

robsch
0

This worked for me

setlocale(LC_CTYPE, "en_US.UTF-8");
putenv("PYTHONIOENCODING=utf-8");
$tables = shell_exec('/s/python-2.6.2/bin/python2.6 getWikitables.py ' .
      escapeshellarg($title));
-1

In PHP, you can use functions such as utf8_encode() or utf8_decode() to solve your problem.

David Rodrigues
  • Neither worked - is it possible the problem is in transmitting from Python to php and I need to do something on the Python side? – cryptic_star May 19 '11 at 22:39
  • Maybe, or it could be the PHP output charset. Try `header('Content-type: text/html; charset=utf-8');` and tell me if you get a different result. – David Rodrigues May 19 '11 at 22:43
  • You need to take a look at PHP's Multibyte String library. It can tell you the encoding of the returned string, so you can convert it to UTF-8 and make it _visible_. – David Rodrigues May 20 '11 at 01:34