3

I have this string:

-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)

but actually I have a lot of strings like this:

a*p**(-1.0) + b*p**(c)

where a, b and c are doubles. I would like to extract a, b and c from each string. How can I do this using Python?

Anthon
Guy Davis

6 Answers

3
import re
s = '-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)'
pattern = r'-?\d+\.\d*'

a,_,b,c = re.findall(pattern,s)
print(a, b, c)

Output

-1007.88670550662 67293.8347365694 -0.416543501823503

Here s is the input string and pattern is the regex that matches floats. findall() returns every match in order, and we unpack the matches into a, b and c, discarding the second one (the fixed -1.0 exponent) into _.

Note that this method works only if your strings follow the format you've given. Otherwise, you can adjust the pattern to match what you need.

Edit: as several people pointed out in the comments, if you need to allow a + in front of positive numbers, you can use the pattern r'[-+]?\d+\.\d*' instead.
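Since the question mentions having many such strings, here is a minimal sketch of applying the signed pattern to a whole list and converting the matches to floats. The second input string and the names strings and coefficients are made up for illustration, and the sketch assumes every string has exactly four matches.

```python
import re

# Pattern with the optional sign: sign, digits, a dot, optional digits.
pattern = re.compile(r'[-+]?\d+\.\d*')

# Hypothetical inputs, all of the form a*p**(-1.0) + b*p**(c).
strings = [
    '-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)',
    '+2.5*p**(-1.0) + 3.0*p**(-0.5)',
]

coefficients = []
for s in strings:
    # The second match is always the fixed -1.0 exponent, so it is discarded.
    a, _, b, c = (float(m) for m in pattern.findall(s))
    coefficients.append((a, b, c))

print(coefficients)
```

Unpacking raises a ValueError if a string yields more or fewer than four matches, which at least makes malformed input fail loudly.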

MooingRawr
  • You probably want `pattern = r'-?\d+\.\d+'` to catch the possible minus sign. –  Apr 21 '17 at 15:58
  • and a possible plus sign – quasoft Apr 21 '17 at 15:59
  • Doesn't this matches `-1.0`? – Pedro Lobito Apr 21 '17 at 16:01
  • @PedroLobito it does and i disregard it, like my answer states, if the string is given in that exact format as in `a*p**(-1.0) + b*p**(c)` like op stated, then it has no issues – MooingRawr Apr 21 '17 at 16:02
  • @PedroLobito: The `-1.0` is assigned to `_`, which is used as a throw-away variable by convention. –  Apr 21 '17 at 16:08
  • Also `-1007.` could be a valid floating point number. You can change the pattern to `r'[-+]?\d+\.\d*'` to account for that – quasoft Apr 21 '17 at 16:10
  • @quasoft while you are correct about the `-1007.`, which I added thanks :D, I feel the need for the `+` in front is not really needed, since OP didn't state it, and the little sample given doesn't show a `+` for positive numbers, I will leave a comment about it though. – MooingRawr Apr 21 '17 at 16:13
1

Using the regular expression

(-?\d+\.?\d*)\*p\*\*\(-1\.0\)\s*\+\s*(-?\d+\.?\d*)\*p\*\*\((-?\d+\.?\d*)\)

We can do

import re

pat = r'(-?\d+\.?\d*)\*p\*\*\(-1\.0\)\s*\+\s*(-?\d+\.?\d*)\*p\*\*\((-?\d+\.?\d*)\)'

regex = re.compile(pat)

print(regex.findall('-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)'))

will print [('-1007.88670550662', '67293.8347365694', '-0.416543501823503')]
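Because this pattern describes the whole expression, it can also be used to reject strings that do not fit the format. A possible sketch, where parse_coeffs is a hypothetical helper name and fullmatch() is swapped in for findall() so that the entire string has to match:

```python
import re

# Same pattern as above, compiled once for reuse.
REGEX = re.compile(
    r'(-?\d+\.?\d*)\*p\*\*\(-1\.0\)\s*\+\s*(-?\d+\.?\d*)\*p\*\*\((-?\d+\.?\d*)\)'
)

def parse_coeffs(s):
    """Return (a, b, c) as floats, or None if s does not fit the format."""
    m = REGEX.fullmatch(s.strip())
    if m is None:
        return None
    return tuple(float(g) for g in m.groups())

print(parse_coeffs('-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)'))
print(parse_coeffs('not an expression'))
```

Returning None rather than raising is just one design choice; raising a ValueError would work equally well if bad input should stop the program.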

Patrick Haugh
1

If your formats are consistent, and you don't want to deep dive into regex (check out regex101 for this, btw) you could just split your way through it.

Here's a start:

>>> s= "-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)"
>>> a, buf, c = s.split("*p**")
>>> b = buf.split()[-1]
>>> a,b,c
('-1007.88670550662', '67293.8347365694', '(-0.416543501823503)')
>>> [float(x.strip("()")) for x in (a,b,c)]
[-1007.88670550662, 67293.8347365694, -0.416543501823503]
Russ
  • This is good idea for a solution without regex. It could be simplified like this `s.replace('*p**(-1.0) +', '*p**').split('*p**')` – quasoft Apr 21 '17 at 16:44
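A sketch of how the simplification from the comment above might look end to end, assuming the middle term is always exactly `*p**(-1.0) +`:

```python
s = "-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)"

# Turn the fixed middle term into a second separator, then split once.
parts = s.replace('*p**(-1.0) +', '*p**').split('*p**')

# Strip stray spaces and the parentheses around c before converting.
a, b, c = (float(part.strip(' ()')) for part in parts)
print(a, b, c)
```

If the exponent is written differently (for example `-1` instead of `-1.0`), the replace does nothing and the unpacking fails with a ValueError.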
1

The re module can certainly be made to work for this, although as some of the comments on the other answers have pointed out, the corner cases can be interesting -- decimal points, plus and minus signs, etc. It could be even more interesting; e.g. can one of your numbers be imaginary?

Anyway, if your string is always a valid Python expression, you can use Python's built-in tools to process it. Here is a good generic explanation about the ast module's NodeVisitor class. Using it for your example is quite simple:

import ast

x = "-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)"

def getnums(s):
    result = []
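    class GetNums(ast.NodeVisitor):
        # ast.Constant replaces ast.Num, which is deprecated since Python 3.8.
        def visit_Constant(self, node):
            # Keep numeric literals only (skip strings, None, etc.).
            if isinstance(node.value, (int, float, complex)):
                result.append(node.value)
        def visit_UnaryOp(self, node):
            # A negative literal parses as USub applied to a positive constant.
            if (isinstance(node.op, ast.USub) and
                isinstance(node.operand, ast.Constant)):
                result.append(-node.operand.value)
            else:
                ast.NodeVisitor.generic_visit(self, node)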
    GetNums().visit(ast.parse(s))
    return result

print(getnums(x))

This will return a list with all the numbers in your expression:

[-1007.88670550662, -1.0, 67293.8347365694, -0.416543501823503]

The visit_UnaryOp method is only required for Python 3.x.

Community
Patrick Maupin
0

You can use something like:

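import re
subject = '-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)'
# Match runs of digits, minus signs and dots; the second match is the fixed -1.0
a,_,b,c = re.findall(r"[\d\-.]+", subject)
print(a,b,c)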


Pedro Lobito
0

While I prefer MooingRawr's answer as it is simple, I would extend it a bit to cover more situations.

A floating point number can be written as a string in a surprising variety of formats:

  • Exponential format (e.g. 2.0e+07)
  • Without a leading digit (e.g. .5, which is equal to 0.5)
  • Without a trailing digit (e.g. 5., which is equal to 5)
  • Positive numbers with a plus sign (e.g. +5, which is equal to 5)
  • Numbers without a decimal part (integers) (e.g. 0 or 5)
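As a quick check, Python's built-in float() accepts each of the forms listed above, so whatever the pattern captures can be converted directly. The names samples and values are only for illustration.

```python
# One sample string per format in the list above.
samples = ['2.0e+07', '.5', '5.', '+5', '0']

# float() parses all of them without any extra handling.
values = [float(text) for text in samples]
print(values)  # [20000000.0, 0.5, 5.0, 5.0, 0.0]
```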

Script

import re

test_values = [
    '-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)',
    '-2.000e+07*p**(-1.0) + 1.23e+07*p**(-5e+07)',
    '+2.*p**(-1.0) + -1.*p**(5)',
    '0*p**(-1.0) + .123*p**(7.89)'
]

pattern = r'([-+]?\.?\d+\.?\d*(?:[eE][-+]?\d+)?)'

for value in test_values:
    print("Test with '%s':" % value)
    matches = re.findall(pattern, value)
    del matches[1]
    print(matches, end='\n\n')

Output:

Test with '-1007.88670550662*p**(-1.0) + 67293.8347365694*p**(-0.416543501823503)':
['-1007.88670550662', '67293.8347365694', '-0.416543501823503']

Test with '-2.000e+07*p**(-1.0) + 1.23e+07*p**(-5e+07)':
['-2.000e+07', '1.23e+07', '-5e+07']

Test with '+2.*p**(-1.0) + -1.*p**(5)':
['+2.', '-1.', '5']

Test with '0*p**(-1.0) + .123*p**(7.89)':
['0', '.123', '7.89']
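If you need numbers rather than strings, the same pattern could be wrapped in a small helper along these lines. The name extract_abc and the check for exactly four matches are my own additions, not part of the script above.

```python
import re

# Extended float pattern from the script above.
FLOAT_RE = re.compile(r'[-+]?\.?\d+\.?\d*(?:[eE][-+]?\d+)?')

def extract_abc(expr):
    """Return (a, b, c) as floats from a string like a*p**(-1.0) + b*p**(c)."""
    nums = [float(m) for m in FLOAT_RE.findall(expr)]
    # Expect a, the fixed -1.0 exponent, b and c, in that order.
    if len(nums) != 4:
        raise ValueError('expected 4 numbers, found %d in %r' % (len(nums), expr))
    a, _, b, c = nums
    return a, b, c

print(extract_abc('-2.000e+07*p**(-1.0) + 1.23e+07*p**(-5e+07)'))
print(extract_abc('0*p**(-1.0) + .123*p**(7.89)'))
```

Checking the count up front replaces the bare `del matches[1]` with an explicit error when a string does not contain the expected four numbers.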
quasoft