Finding a price value inside a unicode text

Asked Jan 19 '17 at 21:51

Active Jan 19 '17 at 21:51

I'd like to extract multiple price values from a Unicode text that may contain several currencies, where each value may be directly preceded by a prefix. Possible cases are:

An apple costs: 1,01 €
2€ for an apple
The $1.21 apple
...

So the most likely prefixes are whitespace, a currency symbol (€, $, etc.), or \n, and the value is usually terminated by whitespace.

There are many questions about finding a string between two other strings. Unfortunately, nothing has worked for me so far, for example this attempt:

result = re.findall(r'\s+(.*?)€\s', lowerCaseDescrip, re.DOTALL)

Maybe using re isn't the best solution for this situation?
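
For reference, here is a minimal sketch of the kind of pattern that might cover the three cases above. It assumes Python 3 (where `str` is already Unicode), a small hypothetical set of currency symbols (`€`, `$`, `£`), and at most two decimal digits after a `.` or `,`. The names `PRICE_RE` and `find_prices` are illustrative only and do not come from any library:

```python
import re

# Hypothetical pattern: a currency symbol may sit directly before the
# number or after it (optionally separated by whitespace).
PRICE_RE = re.compile(
    r"(?P<prefix>[€$£])?\s*"             # optional leading symbol
    r"(?P<amount>\d+(?:[.,]\d{1,2})?)"   # digits, optional decimal part
    r"\s*(?P<suffix>[€$£])?"             # optional trailing symbol
)

def find_prices(text):
    """Return (amount, currency) pairs for numbers that carry a currency symbol."""
    results = []
    for m in PRICE_RE.finditer(text):
        currency = m.group("prefix") or m.group("suffix")
        if currency:  # skip bare numbers that have no currency symbol
            results.append((m.group("amount"), currency))
    return results

sample = "An apple costs: 1,01 €\n2€ for an apple\nThe $1.21 apple"
print(find_prices(sample))
# → [('1,01', '€'), ('2', '€'), ('1.21', '$')]
```

This sketch does not handle thousands separators (such as `1.000,50`) or currency codes like `EUR`; the character class and the amount group would need to be extended for those.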

edited May 23 '17 at 12:33

Community

asked Jan 19 '17 at 21:51

user3191334

I suggested a duplicate. Although it doesn't search for € as a suffix or a comma separator, your code above convinces me that you can make those adaptations. – Prune Jan 19 '17 at 22:07
It looks like, in your example, `€` only appears as a **postfix** of the price, whereas `$` is a **prefix**. – Sash Sinha Jan 19 '17 at 22:07
Thanks and sorry! Yes, it's a duplicate. – user3191334 Jan 19 '17 at 22:12
