I need to validate a line of text using a regex.

I can verify the line's content, but I am not able to check whether the line length is some exact number.

Here is an example:

system WAS SP2 NA5 na10-xyz cpu,core-2,idle 1360294220 0

For the above line, I need to check that there are 8 strings, each separated by a single whitespace character.

I have designed a regex for checking the above string:

\w+(\s\w+){3}(\s(\w|\W)+){2}\s\d{10}\s\d+

So what do I need to add to check the string length?

Here is what I have looked at so far:

Regex Java Total String Length

Regex to match words of a certain length

EDIT:

I don't want to use the split(" ") function, as I will be checking more than 1 million lines with the regex. So if I can merge the length check into the regex, it would be better in my opinion.

EDIT 2:

With my current regex, a line with more than 8 elements is still accepted instead of being rejected, as in this example:

system WAS aa aa SP2 NA5 na10-xyz cpu,core-2,idle 1360294220 0

Here "aa aa" are extra strings which I added, and the line is still accepted by the current regex.

mihir6692
  • What do you mean by “string length”? The number of characters or the number of white-space separated tokens? Honestly, I wouldn't use a regex to test either. – 5gon12eder Nov 30 '15 at 06:54
  • I know I can use `string.split(" ");`, but if I can do it with a regex it would be better. Is there any reason or benefit for not using a regex to check string length? @5gon12eder – mihir6692 Nov 30 '15 at 06:57
  • @mihir6692 Readability. Splitting on spaces and then counting the non-empty Strings in the split array is far more readable than a regexp, and better for maintenance purposes. – neomega Nov 30 '15 at 06:58
  • @neomega But I have millions of lines, so I thought I would add length checking to the regex too. I don't know if it's the right approach. I have also edited my question to avoid confusion. – mihir6692 Nov 30 '15 at 07:00
  • @mihir6692 don't try to optimize beforehand. Write something simple that works and if it's slow, then optimize it. I'm doing things similar to this and it takes like 30 seconds to handle 600 million lines with file writing. – neomega Nov 30 '15 at 07:04
  • What you seem to be looking for is someone to say this: "Regex is not the tool to use for counting things. Ever." Regex is for string matching not String parsing. – JamesENL Nov 30 '15 at 07:05
  • @neomega OK. I will check the string length separately, then apply the regex if the length criterion is satisfied. – mihir6692 Nov 30 '15 at 07:06
  • @JamesENL Thanks ... i got it.... :) – mihir6692 Nov 30 '15 at 07:07
  • Just replace `(\s(\w|\W)+){2}` with `(\s\S+){2}`. Using regex is perfectly valid for a task to validate a count of specific sequences. You do not have to count the number of symbols matched with `\w+` and output that number, that would be impossible with regex. – Wiktor Stribiżew Nov 30 '15 at 07:51
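
The replacement suggested in the last comment can be checked with a short Java sketch. It assumes the whole line must match (hence `matches()` rather than `find()`); the class name `TokenCountDemo` and the constant names are made up for illustration. The original pattern accepts the line with the extra "aa aa" tokens because `(\w|\W)+` matches any character, spaces included, so a single group can swallow several tokens. `\S+` cannot cross a whitespace boundary.

```java
import java.util.regex.Pattern;

class TokenCountDemo {
    // Pattern from the question: (\w|\W)+ matches any character, including spaces.
    static final Pattern ORIGINAL =
            Pattern.compile("\\w+(\\s\\w+){3}(\\s(\\w|\\W)+){2}\\s\\d{10}\\s\\d+");

    // Variant from the comment: \S+ stops at the next whitespace character.
    static final Pattern FIXED =
            Pattern.compile("\\w+(\\s\\w+){3}(\\s\\S+){2}\\s\\d{10}\\s\\d+");

    public static void main(String[] args) {
        String good = "system WAS SP2 NA5 na10-xyz cpu,core-2,idle 1360294220 0";
        String extra = "system WAS aa aa SP2 NA5 na10-xyz cpu,core-2,idle 1360294220 0";

        System.out.println(ORIGINAL.matcher(good).matches());  // true
        System.out.println(ORIGINAL.matcher(extra).matches()); // true (extra tokens slip through)
        System.out.println(FIXED.matcher(good).matches());     // true
        System.out.println(FIXED.matcher(extra).matches());    // false
    }
}
```

Compiling each `Pattern` once and reusing it for every line avoids recompiling the expression on each of the million-plus lines.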

1 Answer

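It's really simple. If every word, including the last one, is followed by a space (that is, the line ends with a trailing space), this regex will match:

^([^ ]+ ){8}$

And if the line must end without a trailing space:

^([^ ]+ ){7}[^ ]+$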

([^ ]) = any character except a space

([^ ]+) = a word that does not contain any spaces

([^ ]+ ) = a word followed by a space

([^ ]+ ){7} = 7 words, each followed by a space
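
As a rough illustration, the second pattern could be used from Java as sketched below. The class name `EightTokenCheck` and the helper `hasEightTokens` are hypothetical names chosen for this example, and the sketch assumes tokens are separated by exactly one space character. Note that `[^ ]` excludes only the space character, so tabs would count as part of a word.

```java
import java.util.regex.Pattern;

class EightTokenCheck {
    // 7 "word + space" groups followed by a final word, with no trailing space.
    // Compiled once so it can be reused for every line.
    private static final Pattern EIGHT_TOKENS = Pattern.compile("^([^ ]+ ){7}[^ ]+$");

    // Hypothetical helper: true when the line has exactly 8 single-space-separated tokens.
    static boolean hasEightTokens(String line) {
        return EIGHT_TOKENS.matcher(line).matches();
    }

    public static void main(String[] args) {
        // 8 tokens: accepted
        System.out.println(hasEightTokens(
                "system WAS SP2 NA5 na10-xyz cpu,core-2,idle 1360294220 0"));       // true
        // 10 tokens: rejected
        System.out.println(hasEightTokens(
                "system WAS aa aa SP2 NA5 na10-xyz cpu,core-2,idle 1360294220 0")); // false
    }
}
```

This only checks the token count. To validate the content of each field as well, the count-based pattern can be combined with the field-specific pattern from the question.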

Mohammad Jafar Mashhadi