I have a string like this:

"red blue green Dark Grey purple"

and I want a function that outputs:

"red, blue, green, Dark Grey, purple"

I want consecutive words that start with an uppercase letter to be treated as a single tag. The same applies when 3, 4, 5 or more capitalized words appear one after another.

Another example:

"lemon orange apple Delicious Black Berry" ==> "lemon, orange, apple, Delicious Black Berry"

1 Answer

Yeah, it's a bad question, but it seemed like a good challenge to me. You can use this regular expression:

/(([A-Z][a-z]* ?)+?(?= |$))|([a-z]+)/

And the PHP code:

$string = 'red blue green Dark Grey purple';
preg_match_all('/(([A-Z][a-z]* ?)+?(?= |$))|([a-z]+)/', $string, $ans);

print_r($ans[0]); /*  Array
                   *  (
                   *    [0] => red
                   *    [1] => blue
                   *    [2] => green
                   *    [3] => Dark Grey
                   *    [4] => purple
                   *  )                     
                   */

$string = 'lemon orange apple Delicious Black Berry';
preg_match_all('/(([A-Z][a-z]* ?)+?(?= |$))|([a-z]+)/', $string, $ans);

print_r($ans[0]); /*  Array
                   *  (
                   *    [0] => lemon
                   *    [1] => orange
                   *    [2] => apple
                   *    [3] => Delicious Black Berry
                   *  ) 
                   */
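
The question asks for a function that returns a comma-separated string rather than an array. A minimal sketch of one way to wrap the code above, assuming PHP 7 or later for the type declarations; the function name `tagify` is hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper (not from the original answer): split the input into
// tags with the pattern shown above, then join the tags with ", ".
function tagify(string $input): string
{
    $pattern = '/(([A-Z][a-z]* ?)+?(?= |$))|([a-z]+)/';
    preg_match_all($pattern, $input, $matches);

    // $matches[0] holds the full text of every match, in order.
    return implode(', ', $matches[0]);
}

echo tagify('red blue green Dark Grey purple'), "\n";
// red, blue, green, Dark Grey, purple
echo tagify('lemon orange apple Delicious Black Berry'), "\n";
// lemon, orange, apple, Delicious Black Berry
```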

Explanation of the regular expression: it has two alternatives. The first, `(([A-Z][a-z]* ?)+?(?= |$))`, looks for runs of words that start with a capital letter. The second, `([a-z]+)`, looks for lowercase words. I will explain them in that order, starting with the capital-letter part.

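  • `(([A-Z][a-z]* ?)+?(?= |$))` Explanation
  • `([A-Z][a-z]* ?)+?` - matches a capital letter, followed by zero or more lowercase letters, followed by an optional space. The group must match at least once, and `+?` is lazy: it repeats as few times as the rest of the pattern allows.
  • `(?= |$)` - a lookahead: it consumes nothing, but requires the next character to be a space or the end of the string. After `Dark ` the next character is `G`, so the lookahead fails and the lazy group has to repeat. After `Dark Grey` (with the optional space left unmatched) the next character is a space, so the match ends there.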

If the first alternative does not match at the current position, the engine then tries the second:

  • `([a-z]+)` Explanation
  • `([a-z]+)` - matches one or more lowercase letters, as many as possible
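
To see what the lookahead contributes, it may help to run the same pattern with the lookahead removed. The sketch below does that and, for comparison, also tries a greedy variant that needs no lookahead. The greedy variant is my own assumption about an equivalent pattern, not something the answer proposes, and the variable names are illustrative:

```php
<?php
$string = 'red blue green Dark Grey purple';

// The pattern from the answer: lazy repetition plus a lookahead.
preg_match_all('/(([A-Z][a-z]* ?)+?(?= |$))|([a-z]+)/', $string, $withLookahead);

// The same pattern without the lookahead: the lazy group stops after a
// single capitalized word and keeps its trailing space.
preg_match_all('/(([A-Z][a-z]* ?)+?)|([a-z]+)/', $string, $withoutLookahead);

// Assumed alternative: one capitalized word, then greedily any number of
// " Capitalized" continuations. No lookahead is needed here.
preg_match_all('/[A-Z][a-z]*(?: [A-Z][a-z]*)*|[a-z]+/', $string, $greedy);

var_export($withLookahead[0]);    // 'red', 'blue', 'green', 'Dark Grey', 'purple'
echo "\n";
var_export($withoutLookahead[0]); // 'red', 'blue', 'green', 'Dark ', 'Grey ', 'purple'
echo "\n";
var_export($greedy[0]);           // 'red', 'blue', 'green', 'Dark Grey', 'purple'
echo "\n";
```

Without the lookahead, `Dark` and `Grey` come back as two separate matches, each with a trailing space, which is why the lookahead is needed when the repetition is lazy.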

Enjoy. :)

Aust
  • Thanks, it works. Now I'd like to output a string with these values, separated by commas. How can I do that? – gdgsgshshffhs Sep 26 '12 at 17:38
  • You can use [`implode`](http://php.net/manual/en/function.implode.php). Use it like this: `$ans = implode(', ', $ans[0]);` – Aust Sep 26 '12 at 17:40
  • I found an error: if I use characters like "ã, ç, á...", it doesn't work. – gdgsgshshffhs Sep 26 '12 at 17:49
  • @CainãMaturo - Haha well it really depends on what you're trying to do. If you know that only a certain set of characters will be in the string, just add those characters in their correct places. i.e. change all `[a-z]` to `[a-zãçá]` and if you have any capital letters that are unique, stick those in the `[A-Z]` portions. If you don't know all of the characters, [then maybe this page will help you.](http://stackoverflow.com/questions/150033/regular-expression-to-match-non-english-characters) – Aust Sep 26 '12 at 18:05
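
For the accented-character case raised in the comments, one possible approach is to replace the ASCII ranges with Unicode letter properties and add the `u` modifier, so that PCRE reads the pattern and the subject as UTF-8 instead of as single bytes. The following is a sketch under those assumptions, not the answer author's solution: it assumes UTF-8 input and a UTF-8 source file, and the function name `tagifyUnicode` is hypothetical.

```php
<?php
// Hypothetical helper: same structure as the answer's pattern, but with
// \p{Lu} (any uppercase letter) and \p{Ll} (any lowercase letter) in place
// of [A-Z] and [a-z]. The "u" modifier enables UTF-8 matching.
function tagifyUnicode(string $input): string
{
    $pattern = '/(?:\p{Lu}\p{Ll}* ?)+?(?= |$)|\p{Ll}+/u';

    // preg_match_all returns false on failure, e.g. for invalid UTF-8 input.
    if (preg_match_all($pattern, $input, $matches) === false) {
        return '';
    }

    return implode(', ', $matches[0]);
}

echo tagifyUnicode('maçã limão Água Doce verde'), "\n";
// maçã, limão, Água Doce, verde
```

Listing the extra characters by hand, as suggested in the comment above, remains an option when the set of characters is known in advance. In that case the `u` modifier is still worth adding so that multibyte characters inside the brackets are treated as whole characters.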