7

I'm making a program that takes currency amounts from a string and converts them into other currencies. For example, if the string was 'the car cost me $13,250' I would need to get $ and 13250. I already have this regex, (?:\£|\$|\€)(?:.{1,}), that sort of does it. However, there is a reasonably large possibility that the string might contain more than one price, each in a different currency, and I do not know how to handle that effectively.

What I need to know is how to extract all of the prices from a string. I think even if the regex just returns something like ['$12,250,000','£14,500,123','£120.25'] then it is fine because I can use something like this to get the number:

prices = ['$12,250','£14,500','£120']
for value in prices:
    # str.replace returns a new string, so the result has to be kept
    number = value.replace(',', '')

And something like this to get the currency:

for c in prices:
    currency = c[0]

Then there is the problem that the price might not be a whole number, and might be something like $12.54. Any help on how to get that initial list of prices would be great.

John Yuki
  • Possible duplicate of [Parse currency into numbers in Python](https://stackoverflow.com/questions/37580151/parse-currency-into-numbers-in-python) – Software2 Sep 11 '17 at 20:35
  • No, I am fine with converting currency into numbers as I showed, but I need to know how I can extract all of the price values from a string first. – John Yuki Sep 11 '17 at 20:41
  • use `re.findall`? Something like `re.findall(r"(?:\£|\$|\€)(?:[\d\.\,]{1,})",s)` It won't be perfect, but maybe it will be easier to just filter false-positives later – juanpa.arrivillaga Sep 11 '17 at 20:43
  • This is a duplicate. The problem you are having, and the solution to your problem (with a working example) are both in the link I posted. – Software2 Sep 11 '17 at 20:49

4 Answers

12

This regular expression will work better for your purposes:

(?:[\£\$\€]{1}[,\d]+.?\d*)

Try it out here.

Then as sainoba notes, you can use re.findall or re.finditer to get the matches.

Then you can extract the currency from the first character, remove commas, and finally split on a decimal point if needed.
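
The steps above could be sketched roughly as follows. This is only an illustration, not a definitive implementation: `PRICE_RE`, `extract_prices`, and the sample sentence are names and data made up for the example, and the pattern escapes the decimal point (`\.`) so that it matches only a literal dot. It assumes the symbol comes first, commas are grouping separators, and a period is the decimal separator.

```python
import re
from decimal import Decimal

# Assumed pattern: a currency symbol, then digits/commas, then an optional
# decimal part. The dot is escaped so it only matches a literal ".".
PRICE_RE = re.compile(r"[£$€][,\d]+\.?\d*")


def extract_prices(text):
    """Return a list of (symbol, amount) pairs for each price in text."""
    results = []
    for match in PRICE_RE.findall(text):
        symbol = match[0]  # the currency symbol is the first character
        # Drop grouping commas and any trailing "." picked up from
        # sentence punctuation (for example "$12." at the end of a sentence).
        digits = match[1:].replace(",", "").rstrip(".")
        if digits:  # skip false positives such as "$," with no digits
            results.append((symbol, Decimal(digits)))
    return results


sample = "The car cost $13,250, the bike £120.25 and lunch €12.54."
print(extract_prices(sample))
```

`Decimal` is used here rather than `float` so that amounts such as 120.25 are kept exactly; whether that matters depends on what the conversion step does afterwards.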

bphi
  • Thanks, this is exactly what I needed. – John Yuki Sep 11 '17 at 20:52
  • Works great when no other solutions worked and saved me time in having to write a regex, thank you for this! – Edward B. Jun 22 '20 at 03:57
  • The above is almost correct: characters after the number will be captured, too, if the value does not have a decimal point. I added a backslash to match only that: `(?:[\£\$\€]{1}[,\d]+\.?\d*)` – Stefano Frazzetto Nov 26 '20 at 14:58
1

When dealing with currencies, you cannot use simple approaches like replacing commas and periods. There are a multitude of language and regional differences. The Euro may use commas or periods as the decimal separator. Some locales may have two or three digits between grouping separators. The currency symbol may be on the left or right. A symbol may represent any one of a dozen different currencies, depending on the user's locale.

Make use of a library to handle this work for you. This issue has been discussed in detail in other posts, such as this one.
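
To illustrate why stripping commas is not enough, here is a small sketch using only the standard library. `parse_amount` is a hypothetical helper written for this example, not part of any library, and it expects the caller to already know which separators the input uses. In practice a locale-aware library would normally take over that job.

```python
from decimal import Decimal


def parse_amount(text, group_sep, decimal_sep):
    """Hypothetical helper: normalise an amount given explicit separators."""
    # Remove the grouping separator first, then map the decimal separator
    # onto the "." that Decimal expects.
    cleaned = text.replace(group_sep, "").replace(decimal_sep, ".")
    return Decimal(cleaned)


# The same amount written with two different conventions.
us_style = parse_amount("1,234.56", group_sep=",", decimal_sep=".")
eu_style = parse_amount("1.234,56", group_sep=".", decimal_sep=",")

# Naively removing commas from the second form silently gives a wrong value.
naive = Decimal("1.234,56".replace(",", ""))

print(us_style, eu_style, naive)
```

Both explicit calls yield the same amount, while the naive comma removal turns one thousand two hundred thirty-four and a fraction into a number just above one.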

Software2
0
import re
values = 'the car cost me £13,250'  # example input, assumed for illustration
re.findall(r'£[,0-9]{1,10}', values)
Shahad
-2

You could use re.findall or re.finditer:

re.findall(pattern, string) returns a list of matching strings.

re.finditer(pattern, string) returns an iterator.
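
A short sketch of the difference between the two calls. The pattern and the sample text are assumptions chosen for the example. Note that `re.findall` returns the whole match only when the pattern has no capturing groups, which is why the optional decimal part uses a non-capturing group `(?:...)`.

```python
import re

# Assumed pattern and input, for illustration only.
pattern = r"[£$€][\d,]+(?:\.\d+)?"
text = "Paid $12.54 today and £14,500 last year."

# findall: a plain list of the matched strings.
matches = re.findall(pattern, text)
print(matches)  # ['$12.54', '£14,500']

# finditer: an iterator of match objects, which also carry positions.
spans = [(m.group(), m.start(), m.end()) for m in re.finditer(pattern, text)]
for value, start, end in spans:
    print(value, start, end)
```

`re.finditer` is the more convenient choice when the position of each price in the original string is needed, for example to replace it with the converted amount.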

sainoba