3

My input string uses '~|~' as the delimiter.

For example: String s = "1~|~Vijay~|~25~|~Pune"; when I split it with "~\\|~" in Java, it works fine.

String[] sa = s.split("~\\|~", -1);
for (String str : sa) {
    System.out.println(str);
}

I get the following output:

1
Vijay
25
Pune

When I run the same program and pass the delimiter as a command-line argument ('~\\|~'), it does not parse the string properly and gives the output below.

1
|
Vijay
|
25
|
Pune

Is anyone else facing the same issue? Please comment.

jps
Vijay_Shinde

3 Answers

5

You only need a single backslash when running it from the command line. In Java source code you need two because, inside a string literal, a backslash either escapes the next character or starts an escape sequence, so a second backslash is needed for the first one to be interpreted literally. On the command line, pass:

~\|~
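
To make the difference concrete, here is a minimal sketch of what the string literal from the question actually contains at run time. The class name EscapeDemo and the helpers regexFromLiteral and split are made up for illustration; only String.split comes from the question.

```java
import java.util.Arrays;

public class EscapeDemo {
    // Hypothetical helper: the regex exactly as the compiled program stores it.
    static String regexFromLiteral() {
        return "~\\|~"; // the source has two backslashes, the String holds one
    }

    // Splits the sample input from the question with the given regex.
    static String[] split(String regex) {
        return "1~|~Vijay~|~25~|~Pune".split(regex, -1);
    }

    public static void main(String[] args) {
        String regex = regexFromLiteral();
        // Prints: ~\|~ has 4 characters
        System.out.println(regex + " has " + regex.length() + " characters");
        // Prints: [1, Vijay, 25, Pune]
        System.out.println(Arrays.toString(split(regex)));
    }
}
```

So the four-character string ~\|~ is what the regex engine needs to see, whether it comes from a literal or from args[0].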
Luis Colorado
Unmitigated
3

Please add System.out.println("[" + args[i] + "]"); to see what Java actually receives from the command line. The \ character is special to the shell, and so are | and ~ (the last one can expand to your home directory, which could be a problem).
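
As a rough sketch of that check, a tiny standalone program could look like this. The class name ShowArgs and the helper bracket are hypothetical, chosen only for this example.

```java
public class ShowArgs {
    // Hypothetical helper: wraps an argument in brackets so that stray
    // backslashes or whitespace at either end are easy to spot.
    static String bracket(String arg) {
        return "[" + arg + "]";
    }

    public static void main(String[] args) {
        // Print every argument exactly as the JVM received it from the shell.
        for (int i = 0; i < args.length; i++) {
            System.out.println(i + ": " + bracket(args[i]));
        }
    }
}
```

In a POSIX-style shell, running java ShowArgs '~\\|~' should show both backslashes inside the brackets, because single quotes pass everything through unchanged.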

You need to pass:

java foo_bar '~\|~'

(Java still needs a single \ here to escape the vertical bar in the regular expression. You are no longer writing a string literal for the Java compiler, but the string that the literal represents internally, so the \ does not need to be doubled. Because it is inside single quotes, the shell passes it to the Java program unchanged.) Any quoting (single or double quotes) suffices to avoid ~ expansion.

If you are passing

java foo_bar '~\\|~'

the shell will not treat the \ as an escape character (it is inside single quotes), so both backslashes reach the program, which is equivalent to this String literal:

String sa[] = s.split("~\\\\|~", -1); /* two escapes mean a literal backslash */

(note that the vertical bar is no longer escaped, so it acts as the regex alternation operator again)

...which is quite different. This time the pattern means: split on either a ~\ sequence (that is, a ~ followed by a backslash) or a single ~ character. Since no ~ in the input is followed by a backslash, the second alternative is always used. You should get:

1
|
Vijay
|
25
|
Pune

which is the output you posted.
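
The two cases can be compared side by side without a shell. The following is only a sketch: the class name ArgRegexDemo and the helper split are invented here, and the two strings stand in for what args[0] would contain after each single-quoted invocation.

```java
import java.util.Arrays;

public class ArgRegexDemo {
    static final String INPUT = "1~|~Vijay~|~25~|~Pune";

    // Splits the sample input with the given regex, keeping trailing empties.
    static String[] split(String regex) {
        return INPUT.split(regex, -1);
    }

    public static void main(String[] args) {
        // Stand-in for args[0] after: java foo_bar '~\|~'  (one backslash)
        String oneBackslash = "~\\|~";
        // Stand-in for args[0] after: java foo_bar '~\\|~' (two backslashes)
        String twoBackslashes = "~\\\\|~";

        // Prints: [1, Vijay, 25, Pune]
        System.out.println(Arrays.toString(split(oneBackslash)));
        // Prints: [1, |, Vijay, |, 25, |, Pune]
        System.out.println(Arrays.toString(split(twoBackslashes)));
    }
}
```

With two backslashes the | is an unescaped alternation, so every lone ~ becomes a split point and the | characters survive as separate fields.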

Luis Colorado
1

You don't have to escape:

import java.util.Arrays;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        Pattern p = Pattern.compile(args[0], Pattern.LITERAL);
        final String[] result = p.split("1~|~Vijay~|~25~|~Pune");
        Arrays.stream(result).forEach(System.out::println);
    }
}

Running:

javac Main.java
java Main "~|~"

Output:

1
Vijay
25
Pune

Here args[0] is equal to ~|~ (no escaping). The trick is the pattern flag Pattern.LITERAL, which treats every character, including |, as an ordinary character and ignores any metacharacter meaning.
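
A closely related variant, not part of the answer above, is to keep String.split and wrap the delimiter with Pattern.quote. The sketch below assumes that approach; the class name QuoteDemo, the helper splitLiteral and the fallback delimiter are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class QuoteDemo {
    // Hypothetical helper: splits on the delimiter taken literally.
    // Pattern.quote makes regex metacharacters such as | lose their meaning;
    // the -1 limit keeps trailing empty fields, as in the question's code.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        // Fall back to the question's delimiter when no argument is given.
        String delimiter = args.length > 0 ? args[0] : "~|~";
        // With the delimiter ~|~ this prints: [1, Vijay, 25, Pune]
        System.out.println(Arrays.toString(splitLiteral("1~|~Vijay~|~25~|~Pune", delimiter)));
    }
}
```

Either way the argument is passed unescaped, for example java QuoteDemo "~|~", and only shell quoting matters.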

Silviu Burcea
  • This answer would be correct if it addressed the actual problem. The OP is asking why the parameters from the command line are not working. If the shell is changing the string before passing the parameter to the Java program, there's nothing in your answer that points to the actual problem. – Luis Colorado Jul 08 '20 at 03:47
  • @LuisColorado Thanks for feedback. Whilst I haven't explained why passing `~\\|~` doesn't work as expected, I think I'm offering a solution to the problem, which is splitting a text by a delimiter passed when invoking from command line. On top of that, I have explained how my solution works. I believe it's the more appropriate way to split a text by an arbitrary delimiter because you could have anything there and it won't be interpreted as meta chars. With that said, no, I won't explain why double escaping doesn't work, that is probably answered on SO already. – Silviu Burcea Jul 08 '20 at 08:15
  • Sorry, the question was not _give me a working solution to this_, but _why is this not working?_ What if the OP is forced to use a regexp (e.g. `" *, *"`) in the future? You propose a completely different approach, but the goal is to answer the question, not to give a solution and move on to another task. If you don't know the answer to the question, then it is better to give no response at all, as this is a knowledge-sharing place **to learn**, not a _"try and see if"_ place. I appreciate your interest in justifying your answer, but IMHO, you have not addressed the problem. – Luis Colorado Jul 10 '20 at 06:49
  • @LuisColorado the question is how to split on a separator, which clearly states the intent: give me an array of strings which are delimited by these chars. It doesn't say a word about RegExp, and OP wouldn't have faced this issue if he was using just `~~~` as separator. It's just bad luck that Java doesn't have a `String#split` method which accepts a delimiter, not a RegExp. I've also explained how my solution works, which offers a **learning** opportunity (did you know about `Pattern.LITERAL`?). Your 'what if OP is forced to use a RegExp in the future?' is a DIFFERENT question, although similar. – Silviu Burcea Jul 10 '20 at 07:25
  • @LuisColorado and BTW, I know why his double-escaped command-line arg doesn't work, but he didn't ask why it doesn't work. He asked how to split on delimiter, which is exactly what my answer addresses, that's why I haven't covered an explanation for, in my opinion, not the right approach. You've done a good job explaining that part and I'm sure OP would upvote and accept your answer if he wanted the explanation. I've added a 'thanks' on your answer because I believe it brings value to the question. – Silviu Burcea Jul 10 '20 at 07:30
  • I'm not going to start a flame war, you want to be right... then you're right. The question says that. But just below, the OP says that **he is already getting it, with a Java string literal, what he doesn't get is when he uses a command line argument.** This has nothing to do with using `Pattern.LITERAL` (which, BTW, I know perfectly) but `Pattern.LITERAL` just converts your matching pattern into a plain string search, and that's completely out of scope here. Sorry for having been so clear. Next time I'll downvote your answer, instead of making a helping comment. – Luis Colorado Jul 11 '20 at 12:16