Is there a Python equivalent to the endptr parameter to C's strtod?

Question

I am trying to write a function that splits a string containing a floating-point number and some units. The string may or may not have spaces between the number and the units.

In C, the function strtod has a very handy parameter, named endptr, that allows you to parse out the initial part of a string and get a pointer to the remainder. Since this is exactly what I need for this scenario, I was wondering if there is similar functionality buried somewhere in Python.

Since float itself does not currently offer this functionality, I am using a regex solution based on https://stackoverflow.com/a/4703508/2988730:

import re

# Mantissa: digits with an optional fraction, or a bare fraction; the exponent is optional.
float_pattern = re.compile(r'[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[Ee][+-]?\d+)?')

def split_units(string):
    match = float_pattern.match(string)
    if match is None:
        raise ValueError('not a float')
    num = float(match.group())
    units = string[match.end():].strip()
    return num, units

This is not completely adequate for two reasons. The first is that it reinvents the wheel. The second is that it is not properly locale-aware without adding additional complexity (which is why I don't want to reinvent the wheel in the first place).
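To illustrate the kind of extra complexity meant here, the following is a minimal sketch rather than a complete solution. It assumes the only locale-dependent part of the number is the decimal separator reported by locale.localeconv(), ignores grouping separators entirely, and converts with locale.atof. The name split_units_locale is hypothetical.

```python
import locale
import re

def split_units_locale(string):
    # Assumption: only the decimal separator varies with the locale;
    # thousands/grouping separators are not handled by this sketch.
    point = re.escape(locale.localeconv()['decimal_point'])
    pattern = re.compile(
        r'[+-]?(?:\d+(?:{0}\d*)?|{0}\d+)(?:[Ee][+-]?\d+)?'.format(point))
    match = pattern.match(string)
    if match is None:
        raise ValueError('not a float')
    # locale.atof maps the locale's separators back before calling float().
    return locale.atof(match.group()), string[match.end():].strip()
```

The pattern is rebuilt on every call so that it follows whatever LC_NUMERIC setting is active at that moment; caching it would be reasonable if the locale never changes.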

For the record, the tail of the string cannot contain any characters that a number would contain. The only real issue is that I am not requiring units to be separated from numbers by a space, so a simple string.split(maxsplit=1) won't work.

Is there a better way to get a floating point number out of the beginning of the string, so I can process the rest as something else?

`locale.atof`? I've never used it and don't know how comprehensive it is. – tdelaney, May 21 '18 at 19:26
`float` itself isn't locale-aware. If you want locale-awareness, you want something like [`locale.atof`](https://docs.python.org/3/library/locale.html), which is going to reject some things that `float` accepts. — user2357112, May 21 '18 at 19:26
Yup. It's a dupe. Too bad there aren't any decent answers there... — Mad Physicist, May 21 '18 at 19:34
I closed the question, it's an exact duplicate. I'm afraid you're stuck with regexes. The dupe link proposes to reimplement the function from the C source of float parsing... yeah, why not? — Jean-François Fabre, May 21 '18 at 19:34
@Jean-FrançoisFabre. Probably. I'll post an answer to the other question if I find anything better... — Mad Physicist, May 21 '18 at 19:35
Is it true that the last character of the number part has to be one of [0-9]? If so, could you just do a regex search for the first digit in the reversed string? – SpghttCd, May 21 '18 at 19:37
check the duplicate link, I posted something that works, and learned ctypes wrapping in the process :) — Jean-François Fabre, May 21 '18 at 19:58
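
For readers curious about the ctypes route mentioned in the comments, here is a rough sketch of calling the C library's strtod directly and reading back endptr. It is not the code from the linked duplicate. It assumes a POSIX-like system where the C library can be loaded through ctypes, and the name split_units_strtod is hypothetical.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX-like system. find_library may return None, in which case
# CDLL(None) falls back to the symbols already loaded into the process.
_libc = ctypes.CDLL(ctypes.util.find_library('c'))
_libc.strtod.argtypes = [ctypes.c_char_p, ctypes.POINTER(ctypes.c_char_p)]
_libc.strtod.restype = ctypes.c_double

def split_units_strtod(string):
    raw = string.encode('utf-8')
    end = ctypes.c_char_p()
    value = _libc.strtod(raw, ctypes.byref(end))
    # end now points into raw; .value yields the bytes from there to the NUL.
    remainder = end.value
    if len(remainder) == len(raw):
        # endptr == nptr means strtod performed no conversion.
        raise ValueError('not a float')
    return value, remainder.decode('utf-8').strip()
```

Note that strtod is more permissive than float in some respects (it accepts hexadecimal floats, for example) and follows the C library's current LC_NUMERIC setting, so the two will not always agree on what counts as a number.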

score 0 · Answer 1 · answered May 21 '18 at 19:26

I know this is a stupid solution, but how about this:

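def float_and_more(something):
    orig = something
    rest = ''
    # Move one character at a time from the end of the candidate to the
    # remainder until float() accepts what is left.
    while something:
        try:
            # float() ignores surrounding whitespace, so '2.5 ' parses fine.
            return float(something), rest
        except ValueError:
            rest = something[-1] + rest
            something = something[:-1]
    raise ValueError('Invalid value: {}'.format(orig))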

And you could use it like this:

>>> float_and_more('2.5 meters')
(2.5, 'meters')

If you wanted to use this for real, you'd probably use io.StringIO instead of constantly recreating the strings.
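
As one possible way to cut down on the string rebuilding, here is a sketch of the same longest-prefix idea driven by a shrinking index instead of character-by-character concatenation. It still slices the input once per attempt, so it remains quadratic in the worst case. The name float_and_more_indexed is hypothetical.

```python
def float_and_more_indexed(text):
    # Try the longest prefix first and shrink it by one character per attempt.
    for end in range(len(text), 0, -1):
        try:
            return float(text[:end]), text[end:].strip()
        except ValueError:
            continue
    raise ValueError('Invalid value: {!r}'.format(text))
```

Because it relies on float(), it accepts whatever float() accepts, including forms such as 'inf' or '1_000', which may or may not be desirable for unit parsing.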

L3viathan

hey same idea :) – Jean-François Fabre May 21 '18 at 19:28
@Jean-FrançoisFabre. And same comment. Binary search is going to work better for long strings, and honestly, this does not sound like a good way to do it at all. – Mad Physicist May 21 '18 at 19:31
