using mathematical functions on a regex group

Question

For a class, I have to write a function that takes times of the form 03:12:19 (in other words, three hours, twelve minutes, and nineteen seconds) and converts them to the corresponding number of seconds. I have started but can't seem to get the math to work. Here is the code I have at the moment:

def secs(timestr):
    import re
    timexp = re.compile('(\d\d):(\d\d):(\d\d)')
    calc = re.sub(timexp,r'int(\1)*3600+int(\2*60)+int(\3)',timestr)
    return print(calc)

str = '03:20:13'
secs(str)

I've messed around with removing the r prefix, but it gives me a weird result. Help?

Is `return print` a typo here for `return exec`? – jscs Aug 11 '13 at 21:04
No, I just want the function to print the substitution. – kflaw Aug 11 '13 at 21:06

Peter DeGlopper · Accepted Answer · 2013-08-11T21:30:30.503

Regexps are probably overkill for parsing the input string, and entirely the wrong tool for calculating the total number of seconds. Here's a simple replacement:

def secs(timestr):
    hours, minutes, seconds = timestr.split(':')
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds)

This doesn't handle error checking (not the right number of ':' dividers, non-digit contents, etc) but then neither does your original regexp approach. If you do need to sanity check the input, I'd do it like this:

def secs(timestr):
    timeparts = timestr.split(':')
    if len(timeparts) == 3 and all((part.isdigit() for part in timeparts)):
        return int(timeparts[0]) * 3600 + int(timeparts[1]) * 60 + int(timeparts[2])
    else:
        # not a matching string - do whatever you like.
        return None

There are other approaches.
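One such alternative, as a minimal sketch: let tuple unpacking and int() raise ValueError instead of checking up front. The name secs_or_none is hypothetical. Note that int() is more lenient than isdigit() (it accepts a leading sign and surrounding whitespace), so the two checks are not equivalent.

```python
def secs_or_none(timestr):
    # Hypothetical helper, not part of the original answer.
    try:
        # Unpacking raises ValueError unless there are exactly three parts;
        # int() raises ValueError on non-numeric text.
        hours, minutes, seconds = (int(part) for part in timestr.split(':'))
    except ValueError:
        return None
    return hours * 3600 + minutes * 60 + seconds
```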

If you want a string rather than integer for the number of seconds, return str(int(hours) * 3600 + int(minutes) * 60 + int(seconds)).

Edit: in response to "i was instructed to do this with a regexp substitution so that is what i must do":

re.sub can take two different kinds of replacement arguments. You can either provide a string pattern or a function to calculate the replacement string. String patterns do not do math, so you must use a function.
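To see why, here is a small sketch of what a string replacement produces. The backreferences are pasted in as text, so the result is an unevaluated expression rather than a number:

```python
import re

# A string replacement only fills in the backreferences as text.
template_result = re.sub(r'(\d\d):(\d\d):(\d\d)',
                         r'int(\1)*3600+int(\2)*60+int(\3)',
                         '03:20:13')
print(template_result)  # int(03)*3600+int(20)*60+int(13)
```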

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.

import re

def _calculate_seconds(timematch):
    # timematch is the match object that re.sub passes in for each occurrence
    return str(int(timematch.group(1)) * 3600 + int(timematch.group(2)) * 60 + int(timematch.group(3)))

def secs(timestr):
    timexp = re.compile(r'(\d{1,2}):(\d{1,2}):(\d{1,2})')
    return re.sub(timexp, _calculate_seconds, timestr)

But this is a bad approach unless you're trying to convert multiple occurrences of these time patterns in a single larger string.
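As an illustration of that multiple-occurrence case, here is a sketch. The sample sentence and the names to_seconds and report are made up for this example:

```python
import re

def to_seconds(match):
    # Called once per match; must return the replacement as a string.
    hours, minutes, seconds = (int(group) for group in match.groups())
    return str(hours * 3600 + minutes * 60 + seconds)

report = 'started at 03:20:13, finished at 04:00:05'
converted = re.sub(r'(\d{1,2}):(\d{1,2}):(\d{1,2})', to_seconds, report)
print(converted)  # started at 12013, finished at 14405
```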

Compiling the regexp isn't really necessary or even helpful here, since you redo it each time you call the function. The usual approach is to compile it outside the function - but as the regexp docs note:

The compiled versions of the most recent patterns passed to re.match(), re.search() or re.compile() are cached, so programs that use only a few regular expressions at a time needn’t worry about compiling regular expressions.

Still, it's a good habit to get into. Just not inside the local function definition like this.
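A sketch of that habit, with the pattern compiled once at module level (TIME_RE is a name chosen here for illustration):

```python
import re

# Compiled once when the module is imported, not on every call.
TIME_RE = re.compile(r'(\d{1,2}):(\d{1,2}):(\d{1,2})')

def _calculate_seconds(timematch):
    hours, minutes, seconds = (int(group) for group in timematch.groups())
    return str(hours * 3600 + minutes * 60 + seconds)

def secs(timestr):
    # Compiled pattern objects have their own sub() method.
    return TIME_RE.sub(_calculate_seconds, timestr)
```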

that is all fine and good but i was instructed to do this with a regexp substitution so that is what i must do — kflaw, Aug 11 '13 at 21:11
@kflaw that would be a pointless and horrible assignment... Are you sure you haven't mis-understood the requirements? — Jon Clements, Aug 11 '13 at 21:15
That's a weird instruction - regexps are not a good tool for this. But I'll add in some information on what options you do have. — Peter DeGlopper, Aug 11 '13 at 21:16
so that I fully understand, when i run a variable called say 'orig_time' through the secs function, it calls the _calculate_seconds function which also uses 'orig_time' as the variable 'timematch'? Is it possible to do — kflaw, Aug 12 '13 at 16:46
It receives a match object: http://docs.python.org/2/library/re.html#match-objects As noted there, the original text that you matched against is available as the `.string` attribute of the match object. — Peter DeGlopper, Aug 12 '13 at 16:52
Sorry, I am new to Python. Why don't we have to use re.match or re.search first? — kflaw, Aug 12 '13 at 16:55
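
On that last question: re.sub does the searching itself and hands each match object to the replacement function, so no separate re.match or re.search call is needed. A small sketch that records what the function receives (show_match and seen are hypothetical names):

```python
import re

seen = []

def show_match(match):
    # Record what re.sub passes in: the matched text and the full input string.
    seen.append((match.group(0), match.string))
    return match.group(0)  # leave the matched text unchanged

re.sub(r'\d\d:\d\d:\d\d', show_match, 'time: 03:20:13')
print(seen)  # [('03:20:13', 'time: 03:20:13')]
```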

score 1 · Answer 2 · answered Aug 11 '13 at 21:06

You're using re.sub, which replaces regex matches with the second argument.

Instead, you should run re.match(timexp, timestr) to get a match object. This object has an API for accessing the groups (the parenthesized parts of the regex): match.group(0) is the entire matched text, match.group(1) is the first two-digit block, match.group(2) is the second, and so on.

You can process the numbers in memory from there.
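
A minimal sketch of that approach, assuming the hh:mm:ss format from the question. Returning None on a non-match is a choice made here, not something the answer specifies:

```python
import re

def secs(timestr):
    # re.match only matches at the start of the string.
    match = re.match(r'(\d\d):(\d\d):(\d\d)', timestr)
    if match is None:
        return None
    hours, minutes, seconds = (int(group) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds
```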

disatisfieddinosaur


ok I think I am following this will see what I can come up with, thanks! – kflaw Aug 11 '13 at 21:13

score 0 · Answer 3 · edited May 23 '17 at 11:49

Another option would be to try slicing. (Here's some info on slicing notation: Explain Python's slice notation)

If the time being passed into the function is always in the same format (i.e. hh:mm:ss) then slicing would allow you to pick apart each component of the time. Slicing the string would still return a string, hence the use of int() with each sliced component of time. The secs function would then look something like this:

def secs(timestr):
    hours = int(timestr[:2])
    minutes = int(timestr[3:5])
    seconds = int(timestr[6:])
    totalsecs = hours * 3600 + minutes * 60 + seconds
    return totalsecs
