I managed to implement a function that converts camel case to words, by using the solution suggested by @ridgerunner in this question:

Split camelCase word into words with php preg_match (Regular Expression)

However, I also want to handle embedded abbreviations like this:

'hasABREVIATIONEmbedded' translates to 'Has ABREVIATION Embedded'

I came up with this solution:

    <?php 

    function camelCaseToWords($camelCaseStr)
    {

        // Convert: "TestASAPTestMore" to "TestASAP TestMore"

        $abreviationsPattern = '/' . // Match position between an UPPERCASE run and the next capitalized word.
            '(?<=[A-Z])' . // Position is after an uppercase letter,
            '(?=[A-Z][a-z])' . // and before an uppercase letter that is followed by a lowercase letter.
            '/x';
        $arr = preg_split($abreviationsPattern, $camelCaseStr);
        $str = implode(' ', $arr);

        // Convert "TestASAP TestMore" to "Test ASAP Test More"
        $camelCasePattern = '/' . // Match position between camelCase "words".
            '(?<=[a-z])' . // Position is after a lowercase,
            '(?=[A-Z])' . // and before an uppercase letter.
            '/x';

        $arr = preg_split($camelCasePattern, $str);
        $str = implode(' ', $arr);

        $str = ucfirst(trim($str));
        return $str;
    }

    $inputs = array(
    'oneTwoThreeFour',
    'StartsWithCap',
    'hasConsecutiveCAPS',
    'ALLCAPS',
    'ALL_CAPS_AND_UNDERSCORES',
    'hasABREVIATIONEmbedded',
    );

    echo "INPUT";

    foreach($inputs as $val) {
        echo "'" . $val . "' translates to '" . camelCaseToWords($val). "'\n";
    }

The output is:

    INPUT'oneTwoThreeFour' translates to 'One Two Three Four'
    'StartsWithCap' translates to 'Starts With Cap'
    'hasConsecutiveCAPS' translates to 'Has Consecutive CAPS'
    'ALLCAPS' translates to 'ALLCAPS'
    'ALL_CAPS_AND_UNDERSCORES' translates to 'ALL_CAPS_AND_UNDERSCORES'
    'hasABREVIATIONEmbedded' translates to 'Has ABREVIATION Embedded'

It works as intended.

My question is: can I combine the two regular expressions $abreviationsPattern and $camelCasePattern so I can avoid running the preg_split() function twice?

stou
  • Why do you use the `x` modifier when you then avoid all whitespace by concatenation and use PHP comments instead of regex comments? The whole point of `x` is that you can pass in your expression as one multi-line string, with `#...` comments. – Martin Ender Jun 28 '13 at 11:43
  • I was not aware of the meaning of the `x` modifier. I copied the pattern from the source mentioned and expanded from there. Thanks for the info. – stou Jun 28 '13 at 13:18
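
To illustrate the comment above, here is a minimal sketch of the question's first pattern written as a single multi-line string, so that the `x` modifier actually has whitespace and `#` comments to ignore. The sample input comes from the question's own code comment. One caveat when writing patterns this way: the comment text must not contain the `/` delimiter.

```php
<?php
// The abbreviation pattern from the question, rewritten as one
// multi-line string with regex-level comments (requires the x modifier).
$abreviationsPattern = '/
    (?<=[A-Z])       # position is after an uppercase letter,
    (?=[A-Z][a-z])   # and before an uppercase letter followed by a lowercase one
    /x';

$parts = preg_split($abreviationsPattern, 'TestASAPTestMore');
echo implode(' ', $parts), "\n"; // TestASAP TestMore
```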

1 Answer

These are always fun puzzles to solve; I've narrowed down the cases to just two:

  1. Detect words that start with a capital followed by lowercase letters (but not preceded by a word boundary or start of the subject): `(?<!\b)[A-Z][a-z]+`

  2. Detect transitions from lowercase to uppercase: `(?<=[a-z])[A-Z]`

    function camelFix($str)
    {
        return preg_replace_callback('/(?<!\b)[A-Z][a-z]+|(?<=[a-z])[A-Z]/', function($match) {
            return ' '. $match[0];
        }, $str);
    }
    

It works for the inputs you have given; it might fail cases that I have not foreseen :)
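
As a quick way to check that claim, the following sketch runs `camelFix()` over the input list from the question. The function is copied from the answer so the snippet runs on its own, and the `ucfirst()` wrapper is an addition to match the capitalisation of the question's output; it is not part of `camelFix()` itself.

```php
<?php
// camelFix() copied from the answer so this snippet is self-contained.
function camelFix($str)
{
    return preg_replace_callback('/(?<!\b)[A-Z][a-z]+|(?<=[a-z])[A-Z]/', function ($match) {
        return ' ' . $match[0];
    }, $str);
}

// Inputs taken from the question.
$inputs = array(
    'oneTwoThreeFour',
    'StartsWithCap',
    'hasConsecutiveCAPS',
    'ALLCAPS',
    'ALL_CAPS_AND_UNDERSCORES',
    'hasABREVIATIONEmbedded',
);

foreach ($inputs as $val) {
    // ucfirst() is added here only to mirror the question's output format.
    echo "'" . $val . "' translates to '" . ucfirst(camelFix($val)) . "'\n";
}
```

Inputs with digits or other non-letter characters are not covered by this list and would be worth adding before relying on the pattern.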

Ja͢ck
  • Thanks. Works like a charm. I knew there was a better solution than what I could come up with :-) Had to use create_function() to allow running on PHP 5.2, and wrapped it in a call to ucfirst(). – stou Jun 28 '13 at 13:13
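
Since the callback only prepends a space to each match, one possible alternative for PHP 5.2 is a plain `preg_replace()` with the `$0` whole-match reference, which needs neither a closure nor `create_function()` (deprecated in PHP 7.2 and removed in PHP 8.0). This is a sketch rather than the code the commenter actually used, and the function name `camelFixNoCallback` is made up for illustration.

```php
<?php
// Hypothetical name; same pattern as the answer, but without a callback.
function camelFixNoCallback($str)
{
    $pattern = '/(?<!\b)[A-Z][a-z]+|(?<=[a-z])[A-Z]/';

    // ' $0' inserts a space followed by the whole match.
    // Single quotes keep PHP from trying to interpolate $0 as a variable.
    return ucfirst(preg_replace($pattern, ' $0', $str));
}

echo camelFixNoCallback('hasABREVIATIONEmbedded'), "\n"; // Has ABREVIATION Embedded
```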