
I'm trying to write a regex in Java that rejects strings with excessive whitespace between words, or with whitespace at the beginning or end.

I have this regex so far:

^[\\S].*[\\S]$

That fails when there is whitespace at the beginning or end of a line.

But what about excessive whitespace between words?

I want this line to fail:

"test    test"

But not this:

"test test"

I tried this:

^[\\S].*(?![\\s]{2,}).*[\\S]$

But it didn't work.

Jonas P.
  • Why don't you just trim the string before sending it to your method? – Luiggi Mendoza Jun 09 '15 at 16:37
  • Do you want strings with excessive whitespace to fail, or do you want to trim the whitespace within lines so that they do not fail? – woemler Jun 09 '15 at 17:41
  • @willOEM The regex should fail when the String starts with whitespace/ends with whitespace/has 2 or more whitespaces between words. – Jonas P. Jun 09 '15 at 18:01

4 Answers

For the failure case, just check for:

\s{2,}

i.e. a whitespace character repeated 2 or more times. If there is a match, fail the verification.
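
A minimal sketch of that check in Java, assuming it is combined with the question's first regex to cover the leading and trailing cases. The class and method names (`DoubleWhitespaceCheck`, `hasExcessiveWhitespace`, `isValid`) are made up for illustration:

```java
import java.util.regex.Pattern;

final class DoubleWhitespaceCheck {
    // Finds a run of two or more whitespace characters anywhere in the input.
    private static final Pattern DOUBLE_WS = Pattern.compile("\\s{2,}");
    // The question's first regex: a non-whitespace character at both ends
    // (so it needs at least two characters to match).
    private static final Pattern EDGES = Pattern.compile("^\\S.*\\S$");

    // Hypothetical helper: true if the input contains 2+ consecutive whitespace characters.
    static boolean hasExcessiveWhitespace(String input) {
        return DOUBLE_WS.matcher(input).find();
    }

    // Hypothetical helper: edges must be non-whitespace and no double whitespace may occur.
    static boolean isValid(String input) {
        return EDGES.matcher(input).matches() && !hasExcessiveWhitespace(input);
    }

    public static void main(String[] args) {
        System.out.println(isValid("test test"));    // true
        System.out.println(isValid("test    test")); // false
        System.out.println(isValid(" test"));        // false
    }
}
```

Note that `find()` is used for the failure pattern because it only has to occur somewhere in the string, while `matches()` requires the whole string to match.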

anubhava

This should match a run of two or more spaces that has a non-whitespace character before and after it:

[^\s]([ ]{2,})[^\s]
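
As a rough sketch, this pattern could be used with `find()`, and the capturing group gives the run of spaces that was found. The names `SpaceRunFinder` and `firstRunLength` are hypothetical. On its own this pattern does not catch leading or trailing whitespace, so it would need a separate check for those:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SpaceRunFinder {
    // Group 1 captures the run of two or more spaces between two non-whitespace characters.
    private static final Pattern SPACE_RUN = Pattern.compile("[^\\s]([ ]{2,})[^\\s]");

    // Hypothetical helper: length of the first such run, or 0 if there is none.
    static int firstRunLength(String input) {
        Matcher m = SPACE_RUN.matcher(input);
        return m.find() ? m.group(1).length() : 0;
    }

    public static void main(String[] args) {
        System.out.println(firstRunLength("test  test")); // 2
        System.out.println(firstRunLength("test test"));  // 0
        System.out.println(firstRunLength("  test"));     // 0 (leading spaces are not detected)
    }
}
```
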
guy_sensei

After reading this answer https://stackoverflow.com/a/1240365/2136936

I came up with the following regex, which is exactly what I wanted:

^[\\S](?!.*\\s{2,}).*[\\S]$

One thing I don't understand though, is why it doesn't work this way: ^[\\S].*(?!\\s{2,}).*[\\S]$

Jonas P.
  • Your second alternative does not work because it only requires that *at least one position* in the string is not followed by two whitespaces (the `.*` is outside of the negation), while the first one requires that two whitespaces appear nowhere after the first character. – mihi Jun 11 '15 at 17:28
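
The difference between the two lookahead placements can be seen with a small test. This is only a sketch; the class and constant names (`LookaheadPlacement`, `ANCHORED`, `FLOATING`) are made up for illustration:

```java
import java.util.regex.Pattern;

final class LookaheadPlacement {
    // Lookahead right after the first character: its own .* scans the rest of the string,
    // so any run of 2+ whitespace characters makes the match fail.
    static final Pattern ANCHORED = Pattern.compile("^[\\S](?!.*\\s{2,}).*[\\S]$");
    // Lookahead after a backtracking .*: the engine only needs ONE position
    // that is not followed by 2+ whitespace characters, which almost always exists.
    static final Pattern FLOATING = Pattern.compile("^[\\S].*(?!\\s{2,}).*[\\S]$");

    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String bad = "test  test"; // two spaces between the words
        System.out.println(matches(ANCHORED, bad));         // false
        System.out.println(matches(FLOATING, bad));         // true (wrongly accepted)
        System.out.println(matches(ANCHORED, "test test")); // true
    }
}
```
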

You can use the following regex for it (assuming you do not want to match the empty string):

"^\\S++(\\s\\S++)*+$"

This matches first a nonzero number of non-whitespace characters, and then any number of repetitions (possibly zero) of a single whitespace character followed by one or more non-whitespace characters.

Instead of the non-backtracking (possessive) ++ and *+ operators you can also use the normal + and * operators, with the same result; note that the performance of the non-backtracking operators will be a lot better if the string is long (several kilobytes).
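
A short usage sketch of this regex, assuming it is applied with `matches()` to a single-line string. The names `SingleSpaceValidator` and `isValid` are hypothetical:

```java
import java.util.regex.Pattern;

final class SingleSpaceValidator {
    // Words of non-whitespace characters separated by exactly one whitespace character.
    // The possessive quantifiers (++ and *+) never give back what they matched.
    private static final Pattern VALID = Pattern.compile("^\\S++(\\s\\S++)*+$");

    // Hypothetical helper: true if the input has no leading, trailing,
    // or repeated whitespace (and is not empty).
    static boolean isValid(String input) {
        return VALID.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("test test"));  // true
        System.out.println(isValid("test  test")); // false
        System.out.println(isValid(" test"));      // false
        System.out.println(isValid("test "));      // false
        System.out.println(isValid(""));           // false
    }
}
```

Unlike the lookahead version, this one also accepts a single-character string such as "a", since it does not require separate first and last characters.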

mihi