I found out that there is no use of new String(command.getBytes(),"utf-8").
This isn't accurate. The example below runs the same command through exec() with two different character sets (US-ASCII and UTF-8), and the output is clearly affected by the character set.
This program:
- takes a single input parameter,
- runs touch to create two files in /tmp/charset-test/, using that input value in each filename,
- further, if the input is a UTF-8 (non-ASCII) value, it should create a file with that UTF-8 value in the filename.
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetTest {
    public static void main(String[] args) throws IOException {
        String input = args[0];
        System.out.println("input: " + input);

        Charset[] charsets = {StandardCharsets.US_ASCII, StandardCharsets.UTF_8};
        for (Charset charset : charsets) {
            String command = "touch /tmp/charset-test/" + input + "-" + charset.toString() + ".txt";
            System.out.println("command: " + command);

            // this is identical to your code, but:
            // - use Charsets instead of "utf-8" so I can iterate; "utf-8" also works
            // - skip assigning to "Process p"
            // note: getBytes() with no argument encodes using the JVM's default charset
            Runtime.getRuntime().exec(new String[]{
                "bash", "-c", new String(command.getBytes(), charset)
            });
        }
    }
}
If I run with the ASCII input "simple", it creates two files, one for each charset: "simple-US-ASCII.txt" and "simple-UTF-8.txt". This isn't all that interesting, but it shows that both charsets work normally with basic (ASCII) input.
% rm /tmp/charset-test/*.txt && java CharsetTest.java simple
input: simple
command: touch /tmp/charset-test/simple-US-ASCII.txt
command: touch /tmp/charset-test/simple-UTF-8.txt
% ls /tmp/charset-test
simple-US-ASCII.txt simple-UTF-8.txt
If the input changes to "我", then the ASCII charset handling results in the same "garbled" output you describe ("���-US-ASCII.txt"), whereas the UTF-8 version looks good ("我-UTF-8.txt"):
% rm /tmp/charset-test/*.txt && java CharsetTest.java 我
input: 我
command: touch /tmp/charset-test/我-US-ASCII.txt
command: touch /tmp/charset-test/我-UTF-8.txt
% ls /tmp/charset-test
我-UTF-8.txt ���-US-ASCII.txt
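The three replacement characters in the ASCII filename can be reproduced without exec() at all. The sketch below is only an illustration: it assumes the JVM default charset in the run above was UTF-8 (so it passes StandardCharsets.UTF_8 to getBytes() explicitly), and RoundTripSketch and roundTrip are hypothetical names that are not part of your code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RoundTripSketch {
    // Hypothetical helper: encode as UTF-8 (assumed to be the default charset
    // in the run above), then decode the bytes with the given charset.
    static String roundTrip(String s, Charset decodeWith) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String input = "\u6211"; // 我, which is three bytes in UTF-8: E6 88 91
        Charset[] charsets = {StandardCharsets.US_ASCII, StandardCharsets.UTF_8};
        for (Charset cs : charsets) {
            String decoded = roundTrip(input, cs);
            // US-ASCII cannot decode bytes above 0x7F, so each of the three
            // bytes becomes U+FFFD; UTF-8 decodes them back to one character.
            System.out.println(cs + ": equals input = " + decoded.equals(input)
                    + ", length = " + decoded.length());
        }
    }
}
```

With US-ASCII the decoded string has length 3 and no longer equals the input; with UTF-8 it has length 1 and matches. That lines up with the three "�" characters in the listing above.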
All of this to say: your code looks fine; it's doing the right thing by passing the charset to the Runtime.exec() call. I can't say what the proper solution would be, but the problem is likely something in the environment (not your code).
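One way to start looking at the environment is to print what the JVM itself thinks its encodings are. This is a rough diagnostic sketch, not a fix: EnvCharsetCheck and describe are hypothetical names, sun.jnu.encoding is an implementation-specific property that may be absent on some JVMs, and I'm assuming LANG and LC_ALL are the relevant locale variables on your system.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EnvCharsetCheck {
    // Hypothetical helper: collect the charset-related settings visible to the JVM.
    // Values may be null when a property or variable is not set.
    static Map<String, String> describe() {
        Map<String, String> info = new LinkedHashMap<>();
        // Charset used by getBytes() and new String(bytes) when none is given
        info.put("defaultCharset", Charset.defaultCharset().name());
        info.put("file.encoding", System.getProperty("file.encoding"));
        // Implementation-specific; may be null
        info.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        // Locale variables inherited from the shell that launched the JVM
        info.put("LANG", System.getenv("LANG"));
        info.put("LC_ALL", System.getenv("LC_ALL"));
        return info;
    }

    public static void main(String[] args) {
        describe().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

If the default charset reported here is not UTF-8 on the machine where the filenames come out garbled, that would be a reasonable first thing to investigate.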