I'm trying to use a regex to split a string wherever a tab occurs.

I used this:

    String line = scan.nextLine();
    String Splitted[] = line.split("\t");

but it doesn't work, so currently I'm using this (which works for me):

    String line = scan.nextLine();
    String Splitted[] = line.split("\\s\\s\\s\\s");

Does anyone have an idea why I can't use the "\t" regex?

  • You should use \\t instead of \t. Additionally, you can use \\t+. – Mustofa Rizwan Oct 05 '17 at 09:40
  • Then your string has no tab symbol in it. There is no need to use `"\\t"`; `"\t"` is a valid regex pattern matching a single literal tab. `"\\s\\s\\s\\s"` most probably works because there are 4 consecutive spaces inside the input lines. – Wiktor Stribiżew Oct 05 '17 at 09:40
  • `"abc\tdef".split("\t")` works perfectly well - there probably is no tab (`\t`) inside your string – user85421 Oct 05 '17 at 09:50
  • Instead of `\t`, you can use `\\s+`. – Benjamin Oct 05 '17 at 09:51
  • Your regexp should work, so the problem is probably your input. Tabs should be printed as `9` in this loop. Are they? `for (int i = 0; i < line.length(); i++) { System.out.println((int) line.charAt(i)); }` –  Oct 05 '17 at 10:05
  • [Read this QA](https://stackoverflow.com/q/3762347/4101906). I suggest using `\\s{4,8}` instead of `\t`, `\\t`, or `\u0009`, because some text editors automatically replace tabs with 4, 6, or 8 spaces. – Rahmat Waisi Oct 09 '17 at 20:14
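
The check suggested in the comments above (print each character's numeric code and look for `9`) could be wrapped up roughly like this. It is only a sketch: the class name `WhitespaceInspector` and the helper `charCodes` are made up for illustration and do not come from the question.

```java
public class WhitespaceInspector {
    // Hypothetical helper: returns the numeric code of every character
    // in the line, separated by single spaces.
    static String charCodes(String line) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append((int) line.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A real tab shows up as 9, a space as 32.
        System.out.println(charCodes("a\tb")); // prints 97 9 98
        System.out.println(charCodes("a b"));  // prints 97 32 98
    }
}
```

If the output for a real input line contains runs of `32` and no `9`, the line holds spaces rather than tabs, which would explain why `split("\t")` finds nothing to split on.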

1 Answer

Yes, \t is a valid regex. In a Java string literal the backslash is an escape character, so "\t" puts an actual tab character into the pattern, while "\\t" passes the regex escape \t on to the regex engine; both match a single tab. But since you are processing user input, you never know what this "tab" really consists of (it could be a tab character or 4 spaces). So maybe you should just split at (\\t|\\s{2,}) - beware, this is written as a Java string literal, hence the double backslashes.

EDIT: In my answer above I assume you don't want to split at single whitespace characters as well. Is that right? If you do want to split at single whitespace characters too, you could simply use \\s+ instead.
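
To make the difference between these patterns concrete, here is a minimal sketch. The class name `TabSplitDemo`, the helper `splitOnTabOrWideGap`, and the sample strings are invented for illustration; only the patterns themselves come from the discussion above.

```java
import java.util.Arrays;

public class TabSplitDemo {
    // Hypothetical helper: splits on a single tab, or on a run of
    // two or more whitespace characters.
    static String[] splitOnTabOrWideGap(String line) {
        return line.split("\\t|\\s{2,}");
    }

    public static void main(String[] args) {
        String tabbed = "abc\tdef";
        // "\t" is a real tab character in the pattern; "\\t" is the
        // regex escape for a tab. Both split at the tab.
        System.out.println(Arrays.toString(tabbed.split("\t")));  // prints [abc, def]
        System.out.println(Arrays.toString(tabbed.split("\\t"))); // prints [abc, def]

        // Four spaces followed later by a single space.
        String spaced = "abc    def ghi";
        // The single space is kept, the four-space gap is a separator.
        System.out.println(Arrays.toString(splitOnTabOrWideGap(spaced))); // prints [abc, def ghi]
        // \\s+ splits at every run of whitespace, including single spaces.
        System.out.println(Arrays.toString(spaced.split("\\s+")));        // prints [abc, def, ghi]
    }
}
```

Which pattern fits depends on whether the fields themselves may contain single spaces: if they can, the first pattern keeps them intact, otherwise \\s+ is the simpler choice.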

bkis
  • *EDIT: I suspect you don't want to split at single whitespaces too, is that right? In case you do, you could really just use \\s+ instead.* So you think \\s+ won't detect a single whitespace? – Mustofa Rizwan Oct 05 '17 at 10:35
  • @RizwanM.Tuman You got it wrong. I added bold fonts to make clear what I mean. Anyway, thanks for the hint - my words were a little misleading. – bkis Oct 05 '17 at 10:41
  • My answer explains the possible problems in this case and gives a working solution. Why is it being downvoted? @D.Drozd If this solved your problem and could help other users with the same issue, I'd be glad if you marked this answer as accepted. – bkis Oct 05 '17 at 14:04