I am teaching some neighborhood kids to program in Python. Our first project is to convert a string given as a Roman numeral to the Arabic value.

So we developed a function to evaluate a string that is a Roman numeral. The function takes a string and creates a list that holds the Arabic equivalents and the operations needed to evaluate them to the Arabic value.

For example, if you fed in XI, the function would return [1,'+',10]. If you fed in IX, the function would return [10,'-',1]. Since we need to handle the cases where adjacent values are equal separately, let us ignore the case where the supplied value is XII, as that would return [1,'=',1,'+',10], and the case where the Roman numeral is IIX, as that would return [10,'-',1,'=',1].

Here is the function

def conversion(some_roman):
    roman_dict = {'I':1,'V':5,'X':10,'L':50,'C':100,'D':500,'M':1000}
    arabic_list = []
    for letter in some_roman.upper():
        if len(arabic_list) == 0:
            # First letter: nothing to compare against yet.
            arabic_list.append(roman_dict[letter])
            continue
        previous = arabic_list[-1]
        current_arabic = roman_dict[letter]
        # The list is reversed at the end, so a larger value after a smaller
        # one (IX) reads as 10 - 1, and a smaller after a larger (XI) as 1 + 10.
        if current_arabic > previous:
            arabic_list.extend(['-',current_arabic])
            continue
        if current_arabic == previous:
            arabic_list.extend(['=',current_arabic])
            continue
        if current_arabic < previous:
            arabic_list.extend(['+',current_arabic])
    arabic_list.reverse()
    return arabic_list

The only way I can think of to evaluate the result is to use eval()

Something like:

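def evaluate(some_list):
    # Turn every item into a string, e.g. [10, '-', 1] -> ['10', '-', '1'].
    list_of_strings = [str(item) for item in some_list]
    # join() takes the list of strings itself, not a list wrapped in another list.
    converted_to_string = ''.join(list_of_strings)
    arabic_value = eval(converted_to_string)
    return arabic_value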

I am a little bit nervous about this code because at some point I read that eval is dangerous to use in most circumstances, as it allows someone to introduce mischief into your system. But I can't figure out another way to evaluate the list returned from the first function without having to write a more complex function.

The kids get the conversion function, so even if it looks complicated, they understand the process of Roman numeral conversion and it makes sense. When we have talked about evaluation, though, I can see they get lost. Thus I am really hoping for some way to evaluate the results of the conversion function that doesn't require too much convoluted code.

Sorry if this is warped, I am so . . .

PyNEwbie
  • Can you please provide the expected output and input? – aIKid Dec 17 '13 at 03:17
  • a string that is a Roman numeral return arabic_value – PyNEwbie Dec 17 '13 at 03:19
  • You don't need to create the list of operations. You can just figure out the value of each digit and add them up. – user2357112 Dec 17 '13 at 03:25
  • The answer to the question in the title is yes, for recent versions of Python - use `ast.literal_eval`. It's the wrong tool for this specific problem, but it's very useful in a lot of other cases. But really, don't try to use it for this. – Peter DeGlopper Dec 17 '13 at 03:27
  • I don't think you are correct. Some 9 year olds helped me analyze the conversion process. They are pretty smart and the function above solves every case that we analyzed – PyNEwbie Dec 17 '13 at 03:27
  • It handles every case I just did not share with you what we do with equal signs. I thought I was pretty explicit. I really enjoy this board until people have to show off how smart they are. The reason the nine year olds are helping is because they are learning to program and if you paid attention you would know that just adding the numbers will not give you the solution. – PyNEwbie Dec 17 '13 at 03:32
  • You may wish to look into some existing solutions, eg: http://rosettacode.org/wiki/Roman_numerals/Decode#Python - the first one is too opaque for my taste, but the second looks good. – Peter DeGlopper Dec 17 '13 at 03:41
  • Solution without eval: http://codereview.stackexchange.com/questions/5091/python-function-to-convert-roman-numerals-to-integers-and-vice-versa – John1024 Dec 17 '13 at 03:41
  • @PeterDeGlopper and John1024 thanks for those but the fun was working through the algorithm with these five kids. They learn Roman numerals in 3rd grade and they learn to parse them by something like if next number is lower subtract. The dictionaries in those answers are more complex than the ones they have in their minds. – PyNEwbie Dec 17 '13 at 03:46
  • eval is dangerous in situations where the person giving you values wouldn't be trusted to run their own python code as the same user doing the eval. If the kids are doing this under their own accounts, with their own input, there's no particular risk. Further, as you're not directly trusting the input, and are actually generating a string from `roman_dict[]` lookups that will raise an exception for unexpected input, there's no risk anyway. – Tony Delroy Dec 17 '13 at 03:49
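
For lists that contain only numbers, '+' and '-' (the cases the question limits itself to), one eval-free possibility is to walk the list from left to right and apply each operator as you go. This is only a sketch: the name `evaluate_without_eval` and the `OPERATIONS` table are made up for illustration and are not part of the original code, and it deliberately leaves the '=' case unhandled.

```python
import operator

# Map each operator token in the list to the function that applies it.
OPERATIONS = {'+': operator.add, '-': operator.sub}

def evaluate_without_eval(some_list):
    """Fold a list such as [10, '-', 1] from left to right (hypothetical helper)."""
    total = some_list[0]
    # Walk the remaining items as (operator, number) pairs.
    for symbol, number in zip(some_list[1::2], some_list[2::2]):
        # Raises KeyError for '=' or any other unexpected token.
        total = OPERATIONS[symbol](total, number)
    return total

print(evaluate_without_eval([1, '+', 10]))   # the list for XI
print(evaluate_without_eval([10, '-', 1]))   # the list for IX
```

Because '+' and '-' have the same precedence and are applied left to right, this gives the same result that eval would give for the joined string.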

2 Answers

Is there a way to accomplish what eval does without using eval

Yes, definitely. One option would be to convert the whole thing into an ast tree and parse it yourself (see here for an example).
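
As a rough illustration of the AST idea (not the example linked above), here is a minimal sketch that parses the string with the standard `ast` module and only evaluates integer constants combined with `+` and `-`. The names `ALLOWED_BINARY`, `evaluate_node` and `safe_arithmetic` are hypothetical, and the sketch assumes Python 3.8 or later, where number literals parse to `ast.Constant`.

```python
import ast
import operator

# Only these binary operators are evaluated; anything else is rejected.
ALLOWED_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub}

def evaluate_node(node):
    """Recursively evaluate a whitelisted subset of the syntax tree."""
    if isinstance(node, ast.Expression):
        return evaluate_node(node.body)
    # Accept plain integers only (bool is excluded by the exact type check).
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED_BINARY:
        left = evaluate_node(node.left)
        right = evaluate_node(node.right)
        return ALLOWED_BINARY[type(node.op)](left, right)
    raise ValueError("unsupported expression: " + ast.dump(node))

def safe_arithmetic(text):
    """Parse text as a single expression and evaluate it without eval()."""
    tree = ast.parse(text, mode='eval')
    return evaluate_node(tree)

print(safe_arithmetic("10-1"))
```

Function calls, attribute access, names and every other node type fall through to the `ValueError`, so a string such as `"__import__('os')"` is rejected rather than executed.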

I am a little bit nervous about this code because at some point I read that eval is dangerous to use in most circumstances as it allows someone to introduce mischief into your system.

This is definitely true. Any time you consider using eval, you need to do some thinking about your particular use case. The real question is: how much do you trust the user, and what damage can they do? If you're distributing this as a script and users are only running it on their own computers, then it's really not a problem. After all, they don't need to inject malicious code into your script to remove their home directory. If you're planning on hosting this on your server, that's a different story entirely. Then you need to figure out where the string comes from and whether there is any way for the user to modify the string in a way that could make it unsafe to run. Hackers are pretty clever [1][2], so hosting something like this on your server is generally not a good idea. (I always assume that the hackers know Python WAY better than I do.)

[1] http://blog.delroth.net/2013/03/escaping-a-python-sandbox-ndh-2013-quals-writeup/

[2] http://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html

mgilson
  • +1 - I especially like your last sentence in parenthesis. Coders absolutely must think that way. You (the coder) have to think of _every_ attack where as a hacker needs to find just _one_. –  Dec 17 '13 at 03:34
  • Thanks this helped but I think it is too complicated. After reading your solution though I think I have an idea for an answer. Basically I need to test whether or not there is any value in the list submitted to the eval function to determine if there is something other than ('+','-',1,5,10,50,100,500,1000) I can use a set to determine the unique values in some_list and use a set operation to confirm that there are no values in some_list that are not in the set I typed above. I think that will solve the problem and gives me a chance to work with them on set operation. – PyNEwbie Dec 17 '13 at 03:52
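
The set check described in the comment above might look something like the following sketch. The names `ALLOWED_ITEMS` and `is_safe_list` are invented for illustration; the allowed values are the ones listed in the comment, so a list containing '=' would be rejected as well.

```python
# The only items the conversion function should ever produce (besides '=').
ALLOWED_ITEMS = {'+', '-', 1, 5, 10, 50, 100, 500, 1000}

def is_safe_list(some_list):
    """Return True if every item in some_list is one of the allowed items."""
    # Set difference: whatever is left over was not in the allowed set.
    unexpected = set(some_list) - ALLOWED_ITEMS
    return not unexpected

print(is_safe_list([10, '-', 1]))
print(is_safe_list([10, '-', 'import os']))
```

One caveat with this approach: set membership uses equality, so values that compare equal to an allowed number (for example `True` or `1.0`, which both equal `1`) would also pass the check.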

The only implementation of a safe expression evaluator that I've come across is:

It supports a lot of basic Python-ish expressions and is quite restricted in what it allows you to do (so you don't blow up the interpreter or do something evil). It uses the Python ast module for parsing, and evaluates the result itself.

Example:

from simpleeval import simple_eval

simple_eval("21 + 21")

Then you can extend it and give it access to the parts of your program that you want to:

simple_eval("x + y", names={"x": 22, "y": 48})

or

simple_eval("do_thing(11)", functions={"do_thing": my_callback})

and so on.

Daniel
James Mills
  • None of the `simple_eval` examples explicitly address the security hole that is documented in Ned's blog post that I linked in my answer. `simple_eval` *may* cover that case, but it might not ... I suppose that my point is whenever you use a 3rd party library which claims to be safe, you're putting a lot of trust in the developers. – mgilson Dec 17 '13 at 03:55
  • I agree :) I have studied the code myself fairly thoroughly though. But of course that's no guarantee :) When doing this kind of thing you should always sanitize the input anyway! – James Mills Dec 17 '13 at 03:58