2

The regular expression you gave, ^(?:\b\w+\b[\s\r\n]*){1,250}$, limits the input to 250 words across multiple lines, but it only works when the text contains no special characters.

What should I do if I need to limit the number of words in text that also contains special characters? For example:

--> Hi! i need help with regular expression, please help me. <--
user812786
  • Exact duplicate of http://stackoverflow.com/questions/557695/limit-the-number-of-words-in-a-response-with-a-regular-expression? – Hamish Smith Jul 30 '09 at 22:40

4 Answers

6

The simplest approach is to group the word characters, and limit those groups to a specific range (1-250):

^\W*(\w+(\W+|$)){1,250}$
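As a rough illustration, here is how that pattern might be applied in Java (the question is about C#, but the regex syntax is the same for this pattern). The class name WordLimitRegex and the method names are made up for this sketch, and the upper bound is passed in as a parameter rather than hard-coded to 250:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class WordLimitRegex {
    // Builds the pattern from the answer with a configurable upper bound.
    static Pattern atMostWords(int maxWords) {
        return Pattern.compile("^\\W*(\\w+(\\W+|$)){1," + maxWords + "}$");
    }

    // True when the text contains between 1 and maxWords runs of word characters.
    static boolean withinLimit(String text, int maxWords) {
        return atMostWords(maxWords).matcher(text).matches();
    }

    public static void main(String[] args) {
        String sample = "--> Hi! i need help with regular expression, please help me. <--";
        System.out.println(withinLimit(sample, 250)); // true: the sample has 10 words
        System.out.println(withinLimit(sample, 9));   // false: 10 words exceed a limit of 9
    }
}
```

The leading \W* is what lets the text start with punctuation such as "-->", and the lower bound of 1 means an empty string is rejected.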
Justin Ludwig
3

I am not familiar with C# so I will describe the regex.

Method 1:

You are basically looking for this:

(\b[^\s]+\b){1,250}

In Java regex syntax:

\s is any whitespace character.

[^\s]+ is a sequence of non-whitespace characters.

\b is a word boundary.

You can translate the regex to C#.
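As written, the Method 1 pattern is not anchored and has nothing that consumes the whitespace between tokens, so it seems to need some adaptation before it can validate a whole string. One possible adaptation, sketched in Java, anchors the pattern and requires each run of non-whitespace characters to be followed by whitespace or the end of the input. This variant drops the \b anchors and is an assumption of this sketch, not the answer's exact regex; the class and method names are hypothetical:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class TokenLimitRegex {
    static boolean withinLimit(String text, int maxTokens) {
        // \S+ is equivalent to [^\s]+; every token must be followed by
        // whitespace or by the end of the input.
        Pattern p = Pattern.compile("^\\s*(?:\\S+(?:\\s+|$)){1," + maxTokens + "}$");
        return p.matcher(text).matches();
    }

    public static void main(String[] args) {
        String sample = "--> Hi! i need help with regular expression, please help me. <--";
        System.out.println(withinLimit(sample, 12)); // true: 12 whitespace-delimited tokens
        System.out.println(withinLimit(sample, 11)); // false
    }
}
```

Note that this counts "-->" and "<--" as tokens, because anything between whitespace is treated as a word.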

Method 2:

Tokenize the input text into whitespace-delimited words. In Java, this is done by:

String[] tokens = inputString.split("\\s+");

where the regex is \s+

Now you can check the length of the array and implement your own logic to reject any words beyond the 250th.
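A minimal sketch of Method 2 in Java follows. The class SplitWordLimit and its methods are hypothetical names, and the choice to truncate (rather than reject) over-long input is just one way to implement the "logic" mentioned above. The input is trimmed first because splitting a string with leading whitespace would otherwise produce an empty first token:

```java
import java.util.Arrays;

// Hypothetical helper, not part of any library.
final class SplitWordLimit {
    // Splits on runs of whitespace; an empty or blank input yields no tokens.
    static String[] tokenize(String input) {
        String trimmed = input.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Keeps at most maxWords tokens, re-joined with single spaces.
    static String truncate(String input, int maxWords) {
        String[] tokens = tokenize(input);
        if (tokens.length <= maxWords) {
            return input; // already within the limit, return unchanged
        }
        return String.join(" ", Arrays.copyOfRange(tokens, 0, maxWords));
    }

    public static void main(String[] args) {
        System.out.println(tokenize("  Hi! i need   help\nplease ").length); // 5
        System.out.println(truncate("one two three four", 2));               // one two
    }
}
```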

Method 3:

Define a pattern to capture whitespace as a 'capturing group'.

(\s+)

Now you can count the number of matches with your pattern matcher in a while loop. This is essentially the same as Method 2, but without creating the array of tokens.
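Method 3 could look roughly like this in Java. The class and method names are hypothetical, and the sketch assumes the word count is one more than the number of whitespace runs in the trimmed input:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class WhitespaceCounter {
    private static final Pattern WHITESPACE = Pattern.compile("(\\s+)");

    // Counts whitespace-delimited words without building a token array.
    static int countWords(String input) {
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        Matcher m = WHITESPACE.matcher(trimmed);
        int words = 1; // n separators between words means n + 1 words
        while (m.find()) {
            words++;
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(countWords("Hi! i need help") <= 250); // true
    }
}
```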

hashable
1

A bit late to answer, but none of the solutions here worked for me. This did:

^([a-zA-Z0-9]+[^a-zA-Z0-9]*){1,8}$

where {1,8} defines how many words you want
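For comparison, here is a small Java sketch that exercises this pattern (the class and method names are hypothetical). One thing it shows is that the pattern requires the text to start with an alphanumeric character, so input that begins with punctuation, like the sample in the question, is rejected regardless of its word count:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class AlnumWordLimit {
    static boolean withinLimit(String text, int maxWords) {
        Pattern p = Pattern.compile("^([a-zA-Z0-9]+[^a-zA-Z0-9]*){1," + maxWords + "}$");
        return p.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(withinLimit("one two, three!", 8));   // true: 3 words
        System.out.println(withinLimit("a b c d e f g h i", 8)); // false: 9 words
        System.out.println(withinLimit("--> Hi", 8));            // false: starts with punctuation
    }
}
```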

MichaelD
  • Can you be more specific as to why Justin Ludwig's didn't work? I tried it in a jsfiddle and it works well for me, but if you have a test case that fails it would be very helpful. http://jsfiddle.net/7PKW7/ – hofnarwillie Feb 05 '14 at 15:05
  • This is the only answer that worked for me too. I'm using it in Umbraco. And just in case someone wants it, here's the modified version which includes symbols: ^([^\s]*[\s]*){1,8}$ – Owen Nov 11 '14 at 15:50
0

You can use the {a,b} quantifiers on any expression, like so:

.{1,256}
[\d\w_?]{1,567}
(0x)?[0-9A-F]{1,}

So, in your case, you could use:

^(?:\b\w+\b[_!?\s\r\n]*){1,250}$

where _!? can be replaced with whichever special characters you want to allow between words.
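A sketch of that idea in Java, with the allowed special characters passed in as a string. The class name, the method name, and the `specials` parameter are hypothetical, and the sketch assumes the caller only passes characters that are safe inside a character class (characters such as -, ], ^ and \ would need escaping first):

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class SpecialCharWordLimit {
    // specials: characters allowed between words, inserted verbatim into the class.
    static boolean withinLimit(String text, String specials, int maxWords) {
        Pattern p = Pattern.compile(
            "^(?:\\b\\w+\\b[" + specials + "\\s\\r\\n]*){1," + maxWords + "}$");
        return p.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(withinLimit("Hi! i need help?", "_!?", 250)); // true
        System.out.println(withinLimit("Hi, there", "_!?", 250));        // false: ',' is not allowed
        System.out.println(withinLimit("Hi, there", "_!?,", 250));       // true once ',' is added
    }
}
```

Because the pattern starts with \b\w+, text that begins with a special character still does not match, even if that character is in the allowed set.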

Lucas Jones