30

Using PHP, I'd like to convert a string containing a Roman numeral into its integer representation, because I need to do calculations with the values.

Wikipedia on Roman numerals

It would suffice to only recognize the basic Roman numeral characters, like:

$roman_values=array(
    'I' => 1,
    'V' => 5,
    'X' => 10,
    'L' => 50,
    'C' => 100,
    'D' => 500,
    'M' => 1000,
);

That means the highest possible number is 3999 (MMMCMXCIX). I will use N to represent zero; apart from that, only positive integers are supported.

I cannot use the PEAR library for Roman numbers.

I found this great question on SO on how to test whether the string contains a valid Roman numeral:

How do you match only valid roman numerals with a regular expression?

What would be the best way of coding this?

Community
kapa
  • 1
    Why can't you use the PEAR library? Surely you could at least look at the code? It's under the same license as PHP. – Andrew Aylett Jun 07 '11 at 13:11
  • Because PEAR is not widely available; for example, it cannot be installed in a PHP command-line environment, and it is not allowed for security reasons :) – publikz.com Jun 07 '11 at 13:14
  • @stereofrog The PEAR Package Manager is not installed on the server and I don't have rights to install it. And to be honest, it is not really worth it for this one simple task. – kapa Jun 07 '11 at 13:17

14 Answers

46

How about this:

$romans = array(
    'M' => 1000,
    'CM' => 900,
    'D' => 500,
    'CD' => 400,
    'C' => 100,
    'XC' => 90,
    'L' => 50,
    'XL' => 40,
    'X' => 10,
    'IX' => 9,
    'V' => 5,
    'IV' => 4,
    'I' => 1,
);

$roman = 'MMMCMXCIX';
$result = 0;

foreach ($romans as $key => $value) {
    while (strpos($roman, $key) === 0) {
        $result += $value;
        $roman = substr($roman, strlen($key));
    }
}
echo $result;

which should output 3999 for the supplied $roman. It seems to work for my limited testing:

MCMXC = 1990
MM = 2000
MMXI = 2011
MCMLXXV = 1975

You might want to do some validation first as well :-)
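One way to do that validation, sketched here with the regular expression from the question the asker linked. The helper name `isValidRoman` is made up for this example, and it only accepts the strict modern form from I (1) to MMMCMXCIX (3999):

```php
<?php
// Hypothetical helper: accepts only canonical numerals from I to MMMCMXCIX.
function isValidRoman(string $roman): bool
{
    // Thousands, hundreds, tens, units; each group allows the subtractive forms.
    $pattern = '/^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$/';

    // The pattern also matches the empty string, so reject that explicitly.
    return $roman !== '' && preg_match($pattern, $roman) === 1;
}

var_dump(isValidRoman('MCMXC')); // bool(true)
var_dump(isValidRoman('MIM'));   // bool(false)
var_dump(isValidRoman('IIII'));  // bool(false)
```

Running the conversion loop only when this check passes keeps malformed input from silently producing a partial result.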

andyb
  • 1
    I like how short your solution is, but you need to add a few items to `$romans`, since, for example, **MIM** and **MDCCCCLXXXXVIIII** both could represent **1999** (because there's not a consensus on what constitutes a *valid* Roman number). – akTed Feb 03 '13 at 06:43
  • @andyb I could really use this snippet in a project that I'm going to release under the MIT license. Is there any chance you'd license your answer under the MIT? I dislike viral licenses, and don't want to use the SE default cc by-sa 3.0. – Schlaus Sep 03 '14 at 23:14
  • 1
    Happy to release under MIT. What's the simplest/best way to do this? – andyb Sep 11 '14 at 19:12
  • 2
    @akTed Actually no, they can't, because: **1.** The tens characters ( _I, X, C, and M_ ) can be repeated up to three times. At 4'th, you need to subtract from the next highest fives character. So "MDCCCCLXXXXVIIII" is invalid number. ( _CCCC should be replaced with CD_ ). **2.** Greater values should not be followed by lower values, so "MIM" is also invalid. 1999 is written as "MCMXCIX". – Alexandru Guzinschi Mar 29 '15 at 18:33
  • @Alexandru Guzinschi As mentioned in my comment, there is no consensus on what is valid. XXXX is valid for 40. There is no single "correct" way to write Roman numerals. – akTed Mar 29 '15 at 22:35
  • That is partially valid for ancient text, when no standard was in place, although some rules have existed since then (_ex: only powers of ten can be repeated_). In modern times there are a few rules in place for dealing with Roman numerals and what I said in my previous comment is part of those rules. Although I learned them in 9th grade, I can still [remember](http://home.hiwaay.net/~lkseitz/math/roman/numerals.shtml) (_most of them_). – Alexandru Guzinschi Mar 30 '15 at 05:16
  • @HenrikPetterson which version of PHP are you using? The above code has specific logic to deal with `IV` and I just tested it locally and it still works for me – andyb Jul 19 '18 at 13:21
10

I am not sure whether you've got ZF (Zend Framework) or not, but in case you (or anyone else reading this) do, here is my snippet:

$number = new Zend_Measure_Number('MCMLXXV', Zend_Measure_Number::ROMAN);
$number->convertTo (Zend_Measure_Number::DECIMAL);
echo $number->getValue();
kapa
akond
  • Zend 2 changed, see [NumberFormat](https://framework.zend.com/manual/2.4/en/modules/zend.i18n.filter.number.format.html) – Peter Krauss Nov 19 '16 at 23:32
10

This is the one I came up with. I added the validity check as well.

class RomanNumber {
    //array of roman values
    public static $roman_values=array(
        'I' => 1, 'V' => 5, 
        'X' => 10, 'L' => 50,
        'C' => 100, 'D' => 500,
        'M' => 1000,
    );
    //values that should evaluate as 0
    public static $roman_zero=array('N', 'nulla');
    //Regex - checking for valid Roman numerals
    public static $roman_regex='/^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$/';

    //Roman numeral validation function - is the string a valid Roman Number?
    static function IsRomanNumber($roman) {
         return preg_match(self::$roman_regex, $roman) > 0;
    }

    //Conversion: Roman Numeral to Integer
    static function Roman2Int ($roman) {
        //checking for zero values
        if (in_array($roman, self::$roman_zero)) {
            return 0;
        }
        //validating string
        if (!self::IsRomanNumber($roman)) {
            return false;
        }

        $values=self::$roman_values;
        $result = 0;
        //iterating through characters LTR
        for ($i = 0, $length = strlen($roman); $i < $length; $i++) {
            //getting value of current char
            $value = $values[$roman[$i]];
            //getting value of next char - null if there is no next char
            $nextvalue = !isset($roman[$i + 1]) ? null : $values[$roman[$i + 1]];
            //adding/subtracting value from result based on $nextvalue
            $result += (!is_null($nextvalue) && $nextvalue > $value) ? -$value : $value;
        }
        return $result;
    }
}
kapa
4

Quick idea: go through the Roman numeral from right to left. If the value of $curr (the symbol further to the left) is smaller than the value of $prev, subtract it from the result; otherwise, add it.

$romanValues=array(
    'I' => 1,
    'V' => 5,
    'X' => 10,
    'L' => 50,
    'C' => 100,
    'D' => 500,
    'M' => 1000,
);
$roman = 'MMMCMXCIX';

// RTL
$arabic = 0;
$prev = null;
for ( $n = strlen($roman) - 1; $n >= 0; --$n ) {
    $curr = $roman[$n];
    if ( is_null($prev) ) {
        $arabic += $romanValues[$roman[$n]];
    } else {
        $arabic += $romanValues[$prev] > $romanValues[$curr] ? -$romanValues[$curr] : +$romanValues[$curr];
    }
    $prev = $curr;
}
echo $arabic, "\n";

// LTR
$arabic = 0;
$romanLength = strlen($roman);
for ( $n = 0; $n < $romanLength; ++$n ) {
    if ( $n === $romanLength - 1 ) {
        $arabic += $romanValues[$roman[$n]];
    } else {
        $arabic += $romanValues[$roman[$n]] < $romanValues[$roman[$n+1]] ? -$romanValues[$roman[$n]] : +$romanValues[$roman[$n]];
    }
}
echo $arabic, "\n";

Some validation of the Roman numeral should also be added, though you said that you have already found out how to do it.

binaryLV
  • Yes, in this case it does matter, as the meaning of the "current letter" depends on the value of the "next letter": if the next letter is smaller than or the same as the current one, add the current value to the result; if the next is larger, subtract the current value from the result. If we go RTL, we store the "next letter" in the `$prev` variable, so it is always accessible, with the exception of the first (right-most) letter, where a basic `is_null($prev)` check is sufficient. If we go LTR, we have to check the value of the next letter as well as the existence of the next letter. – binaryLV Jun 09 '11 at 09:28
  • Keep in mind though that this might work also for *invalid* roman letters, e.g., `IVL` will be treated as `-1-5+50` and result in `44`, which should be written as `XLIV`. Therefore, validation of number's structure should be added, as noted in answer. – binaryLV Jun 09 '11 at 09:32
  • @binaryLV What do you mean by `If we go LTR, we have to check value of next letter as well as existance of next letter`. You're using RTL, but you still run a check on the value of the next letter (ternary). What else would be necessary LTR? – kapa Jun 09 '11 at 09:39
  • That check just has to be done differently. In RTL, you can use the value from the previous loop iteration, as "current" from the *current* iteration will be "previous" in the *next* iteration. In LTR, in every iteration you have to get the value which will be "current" in the *next* iteration, as it is not stored anywhere yet. I've updated the answer with an LTR version of this code. – binaryLV Jun 09 '11 at 09:52
  • @binaryLV Hm, yes, I see. You could save the Roman value as well then, one less array lookup. Nice solution. – kapa Jun 09 '11 at 10:04
3

Credit for this code goes to this blog (btw!): http://scriptsense.blogspot.com/2010/03/php-function-number-to-roman-and-roman.html

<?php

function roman2number($roman){
    $conv = array(
        array("letter" => 'I', "number" => 1),
        array("letter" => 'V', "number" => 5),
        array("letter" => 'X', "number" => 10),
        array("letter" => 'L', "number" => 50),
        array("letter" => 'C', "number" => 100),
        array("letter" => 'D', "number" => 500),
        array("letter" => 'M', "number" => 1000),
        array("letter" => 0, "number" => 0)
    );
    $arabic = 0;
    $state = 0;
    $sidx = 0;
    $len = strlen($roman);

    while ($len >= 0) {
        $i = 0;
        $sidx = $len;

        while ($conv[$i]['number'] > 0) {
            if (strtoupper(@$roman[$sidx]) == $conv[$i]['letter']) {
                if ($state > $conv[$i]['number']) {
                    $arabic -= $conv[$i]['number'];
                } else {
                    $arabic += $conv[$i]['number'];
                    $state = $conv[$i]['number'];
                }
            }
            $i++;
        }

        $len--;
    }

    return($arabic);
}


function number2roman($num,$isUpper=true) {
    $n = intval($num);
    $res = '';

    /*** roman_numerals array ***/
    $roman_numerals = array(
        'M' => 1000,
        'CM' => 900,
        'D' => 500,
        'CD' => 400,
        'C' => 100,
        'XC' => 90,
        'L' => 50,
        'XL' => 40,
        'X' => 10,
        'IX' => 9,
        'V' => 5,
        'IV' => 4,
        'I' => 1
    );

    foreach ($roman_numerals as $roman => $number)
    {
        /*** divide to get matches ***/
        $matches = intval($n / $number);

        /*** assign the roman char * $matches ***/
        $res .= str_repeat($roman, $matches);

        /*** subtract from the number ***/
        $n = $n % $number;
    }

    /*** return the res ***/
    if($isUpper) return $res;
    else return strtolower($res);
}

/* TEST */
echo $s=number2roman(1965,true);
echo "\n and bacK:\n";
echo roman2number($s);


?>
kapa
publikz.com
  • 1
    Without spending too much time trying to grok the algorithm, it appears flawed: it's valid to write 800 as CCM (though generally considered bad style) as well as DCCC. The method should be that any digit followed by a digit of higher numerical value is subtracted from the latter instead of added. – symcbean Jun 07 '11 at 13:23
2

I'm late to the party, but here's mine. Assumes valid Numerals in the string, but doesn't test for a valid Roman number, whatever that is...there doesn't seem to be a consensus. This function will work for Roman numbers like VC (95), or MIM (1999), or MMMMMM (6000).

function roman2dec( $roman ) {
    $numbers = array(
        'I' => 1,
        'V' => 5,
        'X' => 10,
        'L' => 50,
        'C' => 100,
        'D' => 500,
        'M' => 1000,
    );

    $roman = strtoupper( $roman );
    $length = strlen( $roman );
    $counter = 0;
    $dec = 0;
    while ( $counter < $length ) {
        if ( ( $counter + 1 < $length ) && ( $numbers[$roman[$counter]] < $numbers[$roman[$counter + 1]] ) ) {
            $dec += $numbers[$roman[$counter + 1]] - $numbers[$roman[$counter]];
            $counter += 2;
        } else {
            $dec += $numbers[$roman[$counter]];
            $counter++;
        }
    }
    return $dec;
}
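
A quick check of the lenient behaviour described above. To stay self-contained, the sketch restates the same pairwise idea under a made-up name, `lenientRomanToInt`; it is an illustration rather than a replacement for the function above, and it assumes the input contains only the seven valid symbols:

```php
<?php
// Hypothetical restatement of the pairwise approach: a smaller symbol directly
// before a larger one forms a subtractive pair; anything else is simply added.
function lenientRomanToInt(string $roman): int
{
    $values = ['I' => 1, 'V' => 5, 'X' => 10, 'L' => 50, 'C' => 100, 'D' => 500, 'M' => 1000];
    $roman = strtoupper($roman);
    $total = 0;
    for ($i = 0, $n = strlen($roman); $i < $n; $i++) {
        $current = $values[$roman[$i]];
        // Use 0 as the "next" value when there is no next symbol.
        $next = $i + 1 < $n ? $values[$roman[$i + 1]] : 0;
        if ($current < $next) {
            $total += $next - $current; // consume both symbols of the pair
            $i++;
        } else {
            $total += $current;
        }
    }
    return $total;
}

echo lenientRomanToInt('VC'), "\n";     // 95
echo lenientRomanToInt('MIM'), "\n";    // 1999
echo lenientRomanToInt('MMMMMM'), "\n"; // 6000
```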
akTed
1

Whew! Those are quite a few answers, and many of them are code-heavy! How about we define an algorithm for this first, before I give an answer?

The Basics

  • Don't store multi-character Roman numerals such as 'CM' => 900 in an array. If you know that M - C (1000 - 100) equals 900, then you only need to store the values 1000 and 100. You wouldn't store a separate entry such as CMI for 901, would you? Any answer that does this will be less efficient than one that understands the Roman syntax.

The Algorithm

Example: LIX (59)

  • Do a for loop on the numbers, starting at the end of the string of Roman numerals. In our example: We start on "X".
  • Greater-Than-Equal-To Case: If the value we are looking at is the same as or greater than the last value, simply add it to a cumulative result. In our example: $result += $numeral_values["X"].
  • Less-Than Case: If the value we are looking at is less than the previous number, we subtract it from our cumulative result. In our example IX, I is 1 and X is 10, so, since 1 is less than 10, we subtract it, giving us 9.
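
As a rough illustration of those steps, the sketch below walks a numeral from right to left and records each decision. The function name `traceRomanRightToLeft` and the trace format are invented for this example; it compares each symbol against the largest value seen so far, and it assumes the input contains only valid symbols:

```php
<?php
// Hypothetical tracing helper: returns [total, list of "+value"/"-value" steps].
function traceRomanRightToLeft(string $roman): array
{
    $values = ['I' => 1, 'V' => 5, 'X' => 10, 'L' => 50, 'C' => 100, 'D' => 500, 'M' => 1000];
    $result = 0;
    $highest = 0; // largest value seen so far while moving left
    $steps = [];
    for ($i = strlen($roman) - 1; $i >= 0; $i--) {
        $value = $values[$roman[$i]];
        if ($value >= $highest) {
            // Greater-than-or-equal case: add and remember the new high mark.
            $result += $value;
            $highest = $value;
            $steps[] = "+{$value} ({$roman[$i]})";
        } else {
            // Less-than case: this symbol is a subtractive prefix.
            $result -= $value;
            $steps[] = "-{$value} ({$roman[$i]})";
        }
    }
    return [$result, $steps];
}

[$total, $steps] = traceRomanRightToLeft('LIX');
echo implode(', ', $steps), ' = ', $total, "\n"; // +10 (X), -1 (I), +50 (L) = 59
```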

The Demo

Full Working Demo Online

The Code

function RomanNumeralValues() {
    return [
        'I'=>1,
        'V'=>5,
        'X'=>10,
        'L'=>50,
        'C'=>100,
        'D'=>500,
        'M'=>1000,
    ];
}

function ConvertRomanNumeralToArabic($input_roman){
    $input_length = strlen($input_roman);
    if($input_length === 0) {
        return 0; // empty input; $result is not defined yet at this point
    }
    
    $roman_numerals = RomanNumeralValues();
    
    $current_pointer = 1;
    $result = 0;
    
    for($i = $input_length - 1; $i > -1; $i--){ 
        $letter = $input_roman[$i];
        $letter_value = $roman_numerals[$letter];
        
        if($letter_value === $current_pointer) {
            $result += $letter_value;
        } elseif ($letter_value < $current_pointer) {
            $result -= $letter_value;
        } else {
            $result += $letter_value;
            $current_pointer = $letter_value;
        }
    }
    
    return $result;
}

print ConvertRomanNumeralToArabic("LIX");
HoldOffHunger
  • A few problems here: 1) You're returning `$result` before defining it; 2) Roman numeral values are static, so it would make more sense to define them in the method (my opinion). – Nathanael McDaniel Jul 17 '23 at 15:07
1
function romanToInt($s) {
    $array = ["I"=>1,"V"=>5,"X"=>10,"L"=>50,"C"=>100,"D"=>500,"M"=>1000];
    $sum = 0;
    for ($i = 0; $i < strlen($s); $i++){
        $curr = $s[$i];
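        // Guard the lookahead: the last character has no successor.
        $next = isset($s[$i + 1]) ? $s[$i + 1] : null;
        if ($next !== null && $array[$curr] < $array[$next]) {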
            $sum += $array[$next] - $array[$curr];
            $i++;
        } else {
            $sum += $array[$curr];
        }
    }
    return $sum;
}
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Nov 29 '22 at 10:50
0

Just stumbled across this beauty and have to post it all over:

function roman($N)
{
    $c = 'IVXLCDM';
    for ($a = 5, $b = $s = ''; $N; $b++, $a ^= 7)
    {
        for (
            $o = $N % $a, $N = $N / $a ^ 0;

            $o--;

            $s = $c[$o > 2 ? $b + $N - ($N &= -2) + $o = 1 : $b] . $s
        );
    }
    return $s;
}
Daniel
  • 1
    You should post a link where you have stumbled across this :). It seems to be fun, but not a great example of descriptive variable names. – kapa Feb 19 '13 at 12:06
  • Had to format it for readability, the original was in code, but I just searched for `php roman IVXLCDM` and actually found the [original on the PHP manual](http://php.net/manual/en/function.base-convert.php#105414) (that formatting is the same as on our code) a shout out to JR along with 100 internet points! – Daniel Feb 19 '13 at 13:18
  • This does the opposite of what the question asked, no longer works, and is convoluted code to boot. – miken32 May 31 '22 at 02:46
0
function Romannumeraltonumber($input_roman){
  $di=array('I'=>1,
            'V'=>5,
            'X'=>10,
            'L'=>50,
            'C'=>100,
            'D'=>500,
            'M'=>1000);
  $result=0;
  if($input_roman=='') return $result;
  //LTR
  for($i=0;$i<strlen($input_roman);$i++){ 
    $result=(($i+1)<strlen($input_roman) and 
          $di[$input_roman[$i]]<$di[$input_roman[$i+1]])?($result-$di[$input_roman[$i]]) 
                                                        :($result+$di[$input_roman[$i]]);
   }
 return $result;
}
scheruku
0
function rom_to_arabic($number) {

$symbols = array( 
    'M'  => 1000,  
    'D'  => 500, 
    'C'  => 100, 
    'L'  => 50, 
    'X'  => 10, 
    'V'  => 5, 
    'I'  => 1);

$a = str_split($number);

$i = 0;
$temp = 0;
$value = 0;
$q = count($a);
while($i < $q) {

    $thys = $symbols[$a[$i]];
    if(isset($a[$i +1])) {
        $next = $symbols[$a[$i +1]];
    } else {
        $next = 0;
    }

    if($thys < $next) {
        $value -= $thys;
    } else {
        $value += $thys;
    }

    $temp = $thys;
    $i++;
}

return $value;

}
Dariusz Majchrzak
0
function parseRomanNumerals($input)
{
    $roman_val = 0;
    $roman_length = strlen($input);
    $result_roman = 0;
    // Walk the string from right to left, one character at a time.
    for ($x = 1; $x <= $roman_length; $x++) {
        $roman_val_prev = $roman_val;
        $roman_numeral = substr($input, $roman_length - $x, 1);

        switch ($roman_numeral) {
            case "M":
                $roman_val = 1000;
                break;
            case "D":
                $roman_val = 500;
                break;
            case "C":
                $roman_val = 100;
                break;
            case "L":
                $roman_val = 50;
                break;
            case "X":
                $roman_val = 10;
                break;
            case "V":
                $roman_val = 5;
                break;
            case "I":
                $roman_val = 1;
                break;
            default:
                $roman_val = 0;
        }
        // A symbol smaller than its right-hand neighbour is subtracted;
        // equal or larger symbols are added (so III gives 3, not 1).
        if ($roman_val < $roman_val_prev) {
            $result_roman = $result_roman - $roman_val;
        } else {
            $result_roman = $result_roman + $roman_val;
        }
    }
    return $result_roman;
}
Ioannis Kokkinis
0

Define your own schema! (optional)

function rom2arab($rom,$letters=array()){
    if(empty($letters)){
        $letters=array('M'=>1000,
                       'D'=>500,
                       'C'=>100,
                       'L'=>50,
                       'X'=>10,
                       'V'=>5,
                       'I'=>1);
    }else{
        arsort($letters);
    }
    $arab=0;
    foreach($letters as $L=>$V){
        while(strpos($rom,$L)!==false){
            $l=$rom[0];
            $rom=substr($rom,1);
            $m=$l==$L?1:-1;
            $arab += $letters[$l]*$m;
        }
    }
    return $arab;
}

Inspired by andyb's answer

Shad
0

I just wrote this in about 10 minutes. It's not perfect, but it seems to work for the few test cases I've given it. I'm not enforcing which values may be subtracted from which; this is just a basic loop that compares the current letter's value with the next one in the sequence (if it exists) and then adds either the value itself or the difference of the pair to the total:

$roman = strtolower($_GET['roman']);

$values = array(
'i' => 1,
'v' => 5,
'x' => 10,
'l' => 50,
'c' => 100,
'd' => 500,
'm' => 1000,
);
$total = 0;
for($i=0; $i<strlen($roman); $i++)
{
    $v = $values[substr($roman, $i, 1)];
    // Look ahead only when a next character exists.
    $v2 = ($i + 1 < strlen($roman)) ? $values[substr($roman, $i + 1, 1)] : 0;

    if($v2 && $v < $v2)
    {
        $total += ($v2 - $v);
        $i++;
    }
    else
        $total += $v;

}

echo $total;
  • 1
    You should test your code with error reporting on (or suppress it as necessary), it throws a `Notice: Undefined offset: 0` on your `$v2...` line most executions. – akTed Feb 03 '13 at 07:31