
Let's say I have the following string:

getPasswordLastChangedDatetime

How can I split it at the capital letters so that I get:

get
Password
Last
Changed
Datetime
ryanzec

7 Answers


If you only care about ASCII characters:

$parts = preg_split("/(?=[A-Z])/", $str);


The (?=...) construct is called a lookahead; it asserts what follows the current position without consuming any characters.
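As a quick check of the lookahead split, here is a minimal sketch using the string from the question. The `PREG_SPLIT_NO_EMPTY` flag is an addition, not part of the answer; it is assumed here so that an input starting with a capital letter would not yield an empty first element.

```php
<?php
// Sample input taken from the question.
$str = 'getPasswordLastChangedDatetime';

// Split at every position that is followed by an ASCII capital letter.
// The lookahead matches a position only, so no characters are lost.
// PREG_SPLIT_NO_EMPTY (an addition in this sketch) drops empty pieces.
$parts = preg_split('/(?=[A-Z])/', $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts);
```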

This works if each part contains a capital letter only at its beginning. It gets more complicated with names like getHTMLString, where a run of capitals should stay together. That case can be handled with:

$parts = preg_split("/((?<=[a-z])(?=[A-Z])|(?=[A-Z][a-z]))/", $str);
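To see the difference between the two patterns, the following sketch runs both on getHTMLString. The variable names and the `PREG_SPLIT_NO_EMPTY` flag are illustrative additions.

```php
<?php
$str = 'getHTMLString';

// Simple pattern: splits before every capital, so the acronym is broken up.
$simple = preg_split('/(?=[A-Z])/', $str, -1, PREG_SPLIT_NO_EMPTY);

// Extended pattern: split between a lowercase letter and a capital,
// or before a capital that is followed by a lowercase letter.
$pattern = '/((?<=[a-z])(?=[A-Z])|(?=[A-Z][a-z]))/';
$acronymAware = preg_split($pattern, $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($simple);
print_r($acronymAware);
```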


Felix Kling

I asked this a little too soon. I found this:

preg_replace('/(?!^)[[:upper:]]/',' \0',$test);
ryanzec
  • I would add a **+**: `preg_replace('/(?!^)[[:upper:]]+/',' \0',$test);` to be able to get a good match on "getMyAPI". (That is, if you want the API as one word). – johnhaggkvist Jul 04 '11 at 15:00
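A small sketch comparing the two replacements, with and without the `+` quantifier, on the "getMyAPI" example from the comment. The variable names are illustrative, and the final `explode` call is an added step for turning the spaced string into an array.

```php
<?php
$test = 'getMyAPI';

// Insert a space before every uppercase letter that is not at the start.
$spaced = preg_replace('/(?!^)[[:upper:]]/', ' \0', $test);

// With "+", a run of uppercase letters is treated as a single match.
$grouped = preg_replace('/(?!^)[[:upper:]]+/', ' \0', $test);

// Added step: split the spaced string into its words.
$words = explode(' ', $grouped);

echo $spaced, "\n";
echo $grouped, "\n";
print_r($words);
```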

For instance:

(?:^|\p{Lu})\P{Lu}*
Artefacto
  • More information can be found here: http://www.regular-expressions.info/unicode.html – Felix Kling Jul 04 '11 at 14:56
  • What do you propose to do with this regex? If you use `preg_split` or `preg_replace`, the characters it matches will be deleted; if you use `preg_match_all`, everything *except* those characters will be deleted. – Alan Moore Jul 04 '11 at 15:58
  • @Alan ? `preg_match_all` deletes nothing, it finds matches. You could e.g. do `preg_match_all('/(?:^|\p{Lu})\P{Lu}*/iu', 'getPasswordLastChangedDatetime', $result)`; then `$result[0]` will have the strings the OP wants. – Artefacto Jul 04 '11 at 16:02
  • You're right, I was reading the regex wrong. And I should have said it would *ignore* the other characters, leaving them out of the results (but of course, it doesn't ignore anything). But now I'm curious about that `i` modifier: it doesn't seem to have any effect in my tests--which is good, since case is the whole point. – Alan Moore Jul 04 '11 at 16:33
  • @Alan I put the `i` accidentally (force of habit). In any case, since the expression is looking up the properties of the characters directly, it has no effect. – Artefacto Jul 04 '11 at 16:36
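The `preg_match_all` call discussed in the comments can be sketched as follows, without the accidental `i` modifier. The second input is a made-up example to show a non-ASCII capital; it assumes the source file is saved as UTF-8 and that PCRE was built with Unicode property support.

```php
<?php
// Each match starts at the beginning of the string or at an uppercase
// letter (\p{Lu}) and continues over non-uppercase characters (\P{Lu}).
$pattern = '/(?:^|\p{Lu})\P{Lu}*/u';

preg_match_all($pattern, 'getPasswordLastChangedDatetime', $result);

// Hypothetical input with a non-ASCII capital letter.
preg_match_all($pattern, 'getÉtatCivil', $unicodeResult);

print_r($result[0]);
print_r($unicodeResult[0]);
```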

There is no need for an overcomplicated solution. This does it:

preg_replace('/([A-Z])/',"\n".'$1',$string);

This doesn't take care of acronyms, of course.

dynamic
preg_split('@(?=[A-Z])@', 'asAs')
azat
  • FYI, there are no quantifiers and no dots in this regex, so the `U` (ungreedy quantifiers) and `s` (dot matches all) modifiers aren't doing any good. – Alan Moore Jul 04 '11 at 15:47

Use this: [a-z]+|[A-Z][a-z]* or \p{Ll}+|\p{Lu}\p{Ll}*
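The answer gives only the patterns; one way to apply them, assumed here, is `preg_match_all`, which collects every match instead of splitting. Note that characters matching neither alternative, such as digits, would simply be left out of the result.

```php
<?php
$str = 'getPasswordLastChangedDatetime';

// ASCII version: a run of lowercase letters, or one capital
// followed by any number of lowercase letters.
preg_match_all('/[a-z]+|[A-Z][a-z]*/', $str, $asciiMatches);

// Unicode version of the same idea; needs the "u" modifier.
preg_match_all('/\p{Ll}+|\p{Lu}\p{Ll}*/u', $str, $unicodeMatches);

print_r($asciiMatches[0]);
print_r($unicodeMatches[0]);
```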

Kirill Polishchuk
preg_split("/(?<=[a-z])(?=[A-Z])/", $password);
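A short sketch of how this pattern behaves; the sample inputs other than the question's string are illustrative. Because the lookbehind requires a lowercase letter before the split point, a run of capitals stays attached to the word that follows it, and a leading capital does not produce an empty first element.

```php
<?php
// Split only where a lowercase letter is directly followed by a capital.
$pattern = '/(?<=[a-z])(?=[A-Z])/';

$fromQuestion = preg_split($pattern, 'getPasswordLastChangedDatetime');
$withAcronym  = preg_split($pattern, 'getHTMLString');
$leadingUpper = preg_split($pattern, 'GetPassword');

print_r($fromQuestion);
print_r($withAcronym);
print_r($leadingUpper);
```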
Bob Vale