13

How can I swap around / toggle the case of the characters in a string, for example:

$str = "Hello, My Name is Tom";

After I run the code I get a result like this:

$newstr = "hELLO, mY nAME IS tOM";

Is this even possible?

mickmackusa
tarnfeld

9 Answers

69

If your string is ASCII only, you can use XOR:

$str = "Hello, My Name is Tom";

print strtolower($str) ^ strtoupper($str) ^ $str;

Outputs:

hELLO, mY nAME IS tOM
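
To see why this works, it may help to look at the intermediate mask. The following is only a minimal sketch, assuming ASCII-only input; the function name swapCaseXor is made up for illustration:

```php
<?php
// Hypothetical helper, assuming ASCII-only input.
function swapCaseXor(string $str): string
{
    // For each byte: 0x20 where the byte is an ASCII letter, 0x00 elsewhere.
    $mask = strtolower($str) ^ strtoupper($str);

    // XOR with 0x20 flips the case bit; XOR with 0x00 leaves the byte unchanged.
    return $str ^ $mask;
}

// The mask for "a1 B": letters give 20, the digit and the space give 00.
echo bin2hex(strtolower('a1 B') ^ strtoupper('a1 B')), "\n"; // 20000020
echo swapCaseXor('Hello, My Name is Tom'), "\n";             // hELLO, mY nAME IS tOM
```

Splitting the one-liner into a mask step and a flip step changes nothing about the result; it only makes the two XORs visible.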
Mike
  • Very cool. strtolower($str) ^ strtoupper($str) will return a string with 0x20 where the characters are alpha and 0 for any other character. Then xor with the original string uses 0x20 to flip the case, while the 0 chars leave the non-alpha characters unchanged. – xtempore Jul 22 '17 at 02:41
  • @Mike seems like this works with most of the UTF-8 characters when I use multibyte functions, can you confirm? – Dawid Zbiński Feb 28 '19 at 06:57
12

OK I know you've already got an answer, but the somewhat obscure strtr() function is crying out to be used for this ;)

$str = "Hello, My Name is Tom";
echo strtr($str, 
           'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz',
           'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ');
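
strtr() also accepts a single array of replacement pairs, which is one way to extend this approach to multi-byte letters. A minimal sketch, assuming the source file and the input are UTF-8; swapCaseStrtr and its $extraPairs parameter are hypothetical names, and the caller has to list every non-ASCII pair explicitly:

```php
<?php
// Hypothetical helper: swap case with strtr() and an array of replacement pairs.
function swapCaseStrtr(string $str, array $extraPairs = []): string
{
    $upper = range('A', 'Z');
    $lower = range('a', 'z');

    // 'A' => 'a', ..., 'a' => 'A', ..., plus any caller-supplied multi-byte pairs.
    $pairs = array_combine($upper, $lower)
           + array_combine($lower, $upper)
           + $extraPairs;

    // With an array argument, strtr() never re-replaces text it has already substituted.
    return strtr($str, $pairs);
}

echo swapCaseStrtr('Hello, My Name is Tom'), "\n"; // hELLO, mY nAME IS tOM

// Assumed example pairs for a few Polish letters.
$polish = ['Ż' => 'ż', 'ż' => 'Ż', 'ó' => 'Ó', 'Ó' => 'ó', 'ł' => 'Ł', 'Ł' => 'ł'];
echo swapCaseStrtr('Żółw', $polish), "\n"; // żÓŁW
```

The trade-off is that the mapping table has to be maintained by hand, so this probably suits a small, known alphabet better than arbitrary Unicode text.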
user272563
  • If you want to deal with multi-byte UTF-8 characters, you'll need to use strtr($str, $substitutions_array). This is actually the means I use to strip accents from all letters of a UTF8 string. – user272563 Feb 14 '10 at 01:37
  • One clear advantage of this answer is that it is a non-regex-based, non-bitwise, single-function solution. Other techniques may be less "dev team" friendly. – mickmackusa May 07 '21 at 02:46
6

Very similar in function to the answer by Mark.

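$str = "Hello, My Name is Tom";
echo preg_replace_callback(
    '/[a-z]/i',
    function ($matches) {
        // XOR with a space (0x20) flips the case bit of an ASCII letter
        return $matches[0] ^ ' ';
    },
    $str
);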

Explanation by @xtempore:

'a' ^ ' ' returns A. It works because A is 0x41 and a is 0x61 (and likewise for all A-Z), and because a space is 0x20. By xor-ing you are flipping that one bit. In simple terms, you are adding 32 to upper case letters making them lower case and subtracting 32 from lower case letters making them upper case.

mickmackusa
Leigh
  • How does this one work? `'a' ^ ' '` seems to return `0` for me. – Sukima Jun 28 '14 at 23:40
  • 'a' ^ ' ' returns 'A'. It works because 'A' is 0x41 and 'a' is 0x61 (and likewise for all A-Z), and because ' ' is 0x20. By xor-ing you are flipping that one bit. In simple terms, you are adding 32 to upper case letters making them lower case and subtracting 32 from lower case letters making them upper case. – xtempore Jul 22 '17 at 02:38
6

A fast way is with a bitmask, with no string functions or regex needed. PHP offers the same bitwise operators as C (AND, OR, XOR, NOT), so flipping a single bit is straightforward:

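function swapCase($string) {
    for ($i = 0; $i < strlen($string); $i++) {
        // Square brackets: the $string{$i} syntax was removed in PHP 8.0
        $char = ord($string[$i]);
        // Only touch ASCII letters: A-Z (65-90) or a-z (97-122)
        if (($char > 64 && $char < 91) || ($char > 96 && $char < 123)) {
            $string[$i] = chr($char ^ 32);
        }
    }
    return $string;
}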

This is what changes it:

$string[$i] = chr($char ^ 32);

We take the character at position $i in $string and XOR (^) its integer value with 32, which flips the 6th bit (value 32) from 1 to 0 or from 0 to 1.

Every ASCII letter is exactly 32 away from its opposite-case counterpart, which is part of what makes ASCII's design so clever. Since 32 is a power of 2 (2^5), that difference is a single bit, so one XOR is enough to flip the case. To get the ASCII value of a letter, use the built-in PHP function ord():

ord('A') // 65
ord('a') // 97
// 97 - 65 = 32

So you loop through the string using strlen() as the bound in the for loop, which makes it run exactly once per character (byte) of the string. If the character at position $i is a letter (A-Z (65-90) or a-z (97-122)), it is swapped for its lowercase or uppercase counterpart using the bitmask.

Here's how the bitmask works:

0100 0001 // 65 (uppercase A)
0010 0000 // 32 (bitmask of 32)
--------- // XOR means: we put a 1 if the bits are different, a 0 if they are the same.
0110 0001 // 97 (lowercase a)

We can reverse it:

0110 0001 // 97 (a)
0010 0000 // Bitmask of 32
---------
0100 0001 // 65 (A)

No need for str_replace or preg_replace: we just flip one bit to add or subtract 32 from the ASCII value of the character, and the case is swapped. The 6th bit (6th from the right) determines whether a letter is uppercase or lowercase. If it's a 0, the letter is uppercase; if it's a 1, the letter is lowercase. Changing the bit from a 0 to a 1 adds 32, giving the lowercase chr() value, and changing it from a 1 to a 0 subtracts 32, turning a lowercase letter uppercase.

swapCase('userId'); // USERiD
swapCase('USERiD'); // userId
swapCase('rot13'); // ROT13
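
The role of the 32 bit can be checked directly. This is just an illustrative sketch using ord(), chr() and printf(); the choice of sample letters is arbitrary:

```php
<?php
// Print code point, binary form, the state of the 32 bit, and the flipped letter.
foreach (['A', 'a', 'Z', 'z'] as $letter) {
    $code = ord($letter);
    printf(
        "%s  %3d  %08b  case bit: %d  flipped: %s\n",
        $letter,
        $code,
        $code,
        ($code & 32) >> 5,   // 0 for uppercase, 1 for lowercase
        chr($code ^ 32)      // same letter in the opposite case
    );
}
```

Masking with & 32 only reads the bit, while ^ 32 toggles it, which is why the same mask serves for both directions.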

We can also have a function that swaps the case on a particular character:

// $i = position in string
function swapCaseAtChar($string, $i) {
    $char = ord($string[$i]);
    // Flip the case bit only if the byte is an ASCII letter
    if (($char > 64 && $char < 91) || ($char > 96 && $char < 123)) {
        $string[$i] = chr($char ^ 32);
    }
    return $string;
}

echo swapCaseAtChar('iiiiiiii', 0); // Iiiiiiii
echo swapCaseAtChar('userid', 4); // userId

// Numbers are no issue
echo swapCaseAtChar('12345qqq', 7); // 12345qqQ
mwieczorek
3

You'll need to iterate through the string testing the case of each character, calling strtolower() or strtoupper() as appropriate, adding the modified character to a new string.
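
One possible reading of that description, as a rough sketch: it assumes single-byte (ASCII) input and the default "C" locale, and the function name swapCaseLoop is hypothetical. The ctype functions are used here for the case test, which is my choice rather than something the answer specifies:

```php
<?php
// Hypothetical helper: build a new string one byte at a time.
function swapCaseLoop(string $str): string
{
    $out = '';
    $length = strlen($str);

    for ($i = 0; $i < $length; $i++) {
        $ch = $str[$i];
        if (ctype_upper($ch)) {
            $out .= strtolower($ch);   // uppercase letter: lower it
        } elseif (ctype_lower($ch)) {
            $out .= strtoupper($ch);   // lowercase letter: raise it
        } else {
            $out .= $ch;               // anything else is copied unchanged
        }
    }

    return $out;
}

echo swapCaseLoop('Hello, My Name is Tom'), "\n"; // hELLO, mY nAME IS tOM
```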

Ignacio Vazquez-Abrams
2

I know this question is old - but here's my 2 flavours of a multi-byte implementation.

Multi function version: (mb_str_split function found here):

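// PHP 7.4+ ships a native mb_str_split(); only define this fallback on older versions
if (!function_exists('mb_str_split')) {
    function mb_str_split( $string ) {
        # Split at all positions not after the start: ^
        # and not before the end: $
        return preg_split('/(?<!^)(?!$)/u', $string );
    }
}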

function mb_is_upper($char) {
   return mb_strtolower($char, "UTF-8") != $char;
}

function mb_flip_case($string) {
   $characters = mb_str_split($string);
   foreach($characters as $key => $character) {
       if(mb_is_upper($character))
           $character = mb_strtolower($character, 'UTF-8');
       else
           $character = mb_strtoupper($character, 'UTF-8');

       $characters[$key] = $character;
   }
   return implode('',$characters);
}

Single function version:

function mb_flip_case($string) {
    $characters = preg_split('/(?<!^)(?!$)/u', $string );
    foreach($characters as $key => $character) {
        if(mb_strtolower($character, "UTF-8") != $character)
            $character = mb_strtolower($character, 'UTF-8');
        else
            $character = mb_strtoupper($character, 'UTF-8');

        $characters[$key] = $character;
    }
    return implode('',$characters);
}
Community
Larpon
  • `preg_split()` has a `PREG_SPLIT_NO_EMPTY` flag available. Empty glue is the default value for `implode()` and doesn't need to be declared. – mickmackusa May 29 '21 at 00:28
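
The two suggestions in that comment could be applied roughly as follows. This is a sketch only, assuming the mbstring extension is loaded and the input is valid UTF-8; the camel-case name mbFlipCase is chosen here to avoid clashing with the functions above:

```php
<?php
// Hypothetical variant: split with PREG_SPLIT_NO_EMPTY and rely on implode()'s default glue.
function mbFlipCase(string $string): string
{
    // An empty /u pattern matches between every code point; NO_EMPTY drops the empty edge pieces.
    $characters = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);

    foreach ($characters as $key => $character) {
        $lower = mb_strtolower($character, 'UTF-8');
        // If lowercasing changed the character it was uppercase, so keep the lowered form;
        // otherwise uppercase it.
        $characters[$key] = ($lower !== $character)
            ? $lower
            : mb_strtoupper($character, 'UTF-8');
    }

    return implode($characters);
}

echo mbFlipCase('aaAAąAŚĆżź'), "\n"; // AAaaĄaśćŻŹ
```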
2

The following scripts support UTF-8 characters like "ą", etc.

  • PHP 7.1+

    $before = 'aaAAąAŚĆżź';
    $after = preg_replace_callback('/./u', function (array $char) {
        [$char] = $char;
    
        return $char === ($charLower = mb_strtolower($char))
        ? mb_strtoupper($char)
        : $charLower;
    }, $before);
    
  • PHP 7.4+

    $before = 'aaAAąAŚĆżź';
    $after = implode(array_map(function (string $char) {
        return $char === ($charLower = mb_strtolower($char))
        ? mb_strtoupper($char)
        : $charLower;
    }, mb_str_split($before)));
    

$before: aaAAąAŚĆżź

$after: AAaaĄaśćŻŹ

KsaR
  • If the point of a regex technique is to make function-based replacements on a string, then `preg_match_all()` is less appropriate/direct versus `preg_replace_callback()`. – mickmackusa May 07 '21 at 02:43
  • @mickmackusa, Right, corrected. Thank you. – KsaR May 08 '21 at 02:38
1

I suppose a solution might be to use something like this :

$str = "Hello, My Name is Tom";
$newStr = '';
$length = strlen($str);
for ($i=0 ; $i<$length ; $i++) {
    if ($str[$i] >= 'A' && $str[$i] <= 'Z') {
        $newStr .= strtolower($str[$i]);
    } else if ($str[$i] >= 'a' && $str[$i] <= 'z') {
        $newStr .= strtoupper($str[$i]);
    } else {
        $newStr .= $str[$i];
    }
}
echo $newStr;

Which gets you :

hELLO, mY nAME IS tOM


i.e. you :

  • loop over each character of the original string
  • if it's between A and Z, you put it to lower case
  • if it's between a and z, you put it to upper case
  • else, you keep it as-is

The problem is that this will probably not work nicely with special characters such as accented letters :-(


And here is a quick proposal that might (or might not) work for some other characters :

$str = "Hello, My Name is Tom";
$newStr = '';
$length = strlen($str);
for ($i=0 ; $i<$length ; $i++) {
    if (strtoupper($str[$i]) == $str[$i]) {
        // Putting to upper case doesn't change the character
        // => it's already in upper case => must be put to lower case
        $newStr .= strtolower($str[$i]);
    } else {
        // Putting to upper changes the character
        // => it's in lower case => must be transformed to upper case
        $newStr .= strtoupper($str[$i]);
    }
}
echo $newStr;

An idea, now, would be to use mb_strtolower and mb_strtoupper : it might help with special characters, and multi-byte encodings...

Pascal MARTIN
0

For a multibyte/unicode-safe solution, I'd probably recommend mutating/toggling the case of each letter based on which capture group contains a letter. This way you don't have to make a multibyte-base check after matching a letter with regex.

Code: (Demo)

$string = 'aaAAąAŚĆżź';
echo preg_replace_callback(
         '/(\p{Lu})|(\p{Ll})/u',
         function($m) {
             return $m[1]
                 ? mb_strtolower($m[1])
                 : mb_strtoupper($m[2]);
         },
         $string
     );
// AAaaĄaśćŻŹ

See this answer about how to match letters that might be multibyte.

mickmackusa