
Let's say I have a string like A10-53 c and I want an output array that holds the type of each character.

Something like:

Array
(
    [0] => string
    [1] => number
    [2] => number
    [3] => symbol
    [4] => number
    [5] => number
    [6] => space
    [7] => string
);

I could make something like:

$str = 'A10-53 c';
$arr = str_split($str);

$str_length = count($arr);
$res = array();
for($i=0; $i<$str_length; $i++){
    if(ctype_space($arr[$i])){
        $res[$i] = 'space';
    }elseif(is_numeric($arr[$i])){
        $res[$i] = 'number';
    }elseif(is_string($arr[$i])){
        $regex = preg_match('/[^a-zA-Z\d]/', $arr[$i]);
        if($regex){
            $res[$i] = 'symbol';
        }else{
            $res[$i] = 'string';
        }
    }else{
        $res[$i] = "N/A";
    }
}
print_r($res);

Is there something better than this method?

mickmackusa
Saroj Shrestha

1 Answer


You don't actually need regex for this. The ctype_ functions will do the job for single-byte characters, and the resulting code is very easy to read (I cannot promise the same of my regex snippet).

Code: (Demo) (an implementation using your sample input)

$tests = ['A', 'z', '+', '0', '8', '*', ' '];

foreach ($tests as $test) {
    echo "\n{$test}: ";
    if (ctype_alpha($test)) {
        echo 'letter';
    } elseif (ctype_digit($test)) {
        echo 'number';
    } elseif (ctype_space($test)) {
        echo 'space';
    } else {
        echo 'symbol';
    }
}
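To get the array the question asks for, the same checks can be wrapped in a function that walks the string. This is only a minimal sketch: the function name classifyChars() is hypothetical, the labels follow this answer ('letter' rather than the question's 'string'), and it assumes single-byte input because str_split() splits by byte.

```php
<?php
// Hypothetical helper (not part of the original answer):
// returns one label per byte of $str.
function classifyChars(string $str): array
{
    // Guard: before PHP 8.2, str_split('') returns [''] rather than [].
    if ($str === '') {
        return [];
    }

    $result = [];
    foreach (str_split($str) as $char) {
        if (ctype_alpha($char)) {
            $result[] = 'letter';
        } elseif (ctype_digit($char)) {
            $result[] = 'number';
        } elseif (ctype_space($char)) {
            $result[] = 'space';
        } else {
            // Anything that is not a letter, digit, or whitespace
            $result[] = 'symbol';
        }
    }
    return $result;
}

print_r(classifyChars('A10-53 c'));
```

For 'A10-53 c' this should produce letter, number, number, symbol, number, number, space, letter, in that order.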

And here is a demonstration of a regex that accommodates multibyte characters as well.

It is a minor adjustment to another answer of mine.

To convert your input string to an array, just call str_split() (or mb_str_split() if the string may contain multibyte characters).

Code: (Demo)

$lookup = ['symbol', 'letter', 'number', 'space'];
$tests = ['A', 'z', '+', '0', 'ǻ', 'Ͱ', ' ', '₉'];

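foreach ($tests as $test) {
    // The last key of $out is the number of the capture group that matched
    // (1 = letter, 2 = number, 3 = space); no match at all means symbol (0).
    // Note: array_key_last() requires PHP 7.3 or later.
    $index = preg_match('~(\pL)|(\pN)|(\pZ)~u', $test, $out) ? array_key_last($out) : 0;
    echo "{$test}: {$lookup[$index]}\n";
}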

Output

A: letter
z: letter
+: symbol
0: number
ǻ: letter
Ͱ: letter
 : space
₉: number
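Applied to a whole string, the regex approach might look like the sketch below. The function name classifyCharsUnicode() is hypothetical, and the sketch assumes PHP 7.4 or later with the mbstring extension (for mb_str_split()) and UTF-8 input.

```php
<?php
// Hypothetical helper (not part of the original answer):
// returns one label per character of a UTF-8 string.
function classifyCharsUnicode(string $str): array
{
    $lookup = ['symbol', 'letter', 'number', 'space'];
    $result = [];

    // mb_str_split() splits by character instead of by byte.
    foreach (mb_str_split($str, 1, 'UTF-8') as $char) {
        // The last key of $out is the number of the capture group that matched;
        // when nothing matches, fall back to index 0 ('symbol').
        $index = preg_match('~(\pL)|(\pN)|(\pZ)~u', $char, $out)
            ? array_key_last($out)
            : 0;
        $result[] = $lookup[$index];
    }
    return $result;
}

print_r(classifyCharsUnicode('ǻ₉ +Ͱ'));
```

For 'ǻ₉ +Ͱ' this should produce letter, number, space, symbol, letter, which matches the per-character output listed above.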

P.S. If you do not need to support multibyte characters, this pattern will do: ~([a-z])|(\d)|( )~i (note that its third group matches only a literal space, not tabs or newlines).

mickmackusa