104

I have a string that represents a number which uses commas to separate thousands. How can I convert this to a number in python?

>>> int("1,000,000")

Generates a ValueError.

I could replace the commas with empty strings before I try to convert it, but that feels wrong somehow. Is there a better way?


For float values, see How can I convert a string with dot and comma into a float in Python, although the techniques are essentially the same.

Karl Knechtel
dsimard

11 Answers

123
import locale
locale.setlocale( locale.LC_ALL, 'en_US.UTF-8' ) 
locale.atoi('1,000,000')
# 1000000
locale.atof('1,000,000.53')
# 1000000.53
unutbu
  • 1
    I think the guru means something like this: locale.setlocale(locale.LC_ALL, 'en_US.UTF-8') – mbarkhau Nov 22 '09 at 17:40
  • Very nice. This way I can handle european numbers where the commas and points are switched too. Thanks. – dsimard Nov 22 '09 at 18:39
  • 7
    I get locale error: `Traceback (most recent call last): File "F:\test\locale_num.py", line 2, in locale.setlocale( locale.LC_ALL, 'en_US.UTF-8' ) File "F:\Python27\lib\locale.py", line 539, in setlocale return _setlocale(category, locale) locale.Error: unsupported locale setting` – Tony Veijalainen Oct 05 '12 at 12:06
  • @TonyVeijalainen: On linux you can use `locale -a` to find what locales are available on your system. For Windows, try this [SO answer](http://stackoverflow.com/a/956084/190597). – unutbu Oct 05 '12 at 18:51
  • Where did you find out that it was `'en_US.UTF-8'`? (It's correct - I just want to know for future reference.) It doesn't show up in a Google search of the python.org website, nor is there any list on the locale doc page: https://docs.python.org/2/library/locale.html EDIT: I see that you can find the locales with `locale -a`... so the interpreter gets the information about locales from the OS itself? – Chris Middleton May 11 '14 at 20:38
  • @AmadeusDrZaius: On Linux the locales are [provided by glibc](https://sourceware.org/glibc/wiki/Locales). – unutbu May 11 '14 at 21:24
  • Is there any way, similar to this answer, to convert a string with commas to a decimal? – Elias Zamaria Oct 30 '16 at 16:49
  • @EliasZamaria: I don't think there is a builtin function, but you could use replace to remove the commas: `decimal.Decimal('123,456.789'.replace(',',''))`. – unutbu Oct 30 '16 at 18:53
  • @unutbu, thanks. I know I can do that. I was just wondering if there was a simpler way, similar to your answer. – Elias Zamaria Oct 30 '16 at 19:24
  • This did not work for me, but this did - https://stackoverflow.com/questions/48843193/convert-a-number-using-atof/48845430#48845430 – Jan Feb 17 '18 at 20:12
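
The locale.Error reported in the comments above suggests guarding the call. What follows is a minimal sketch, assuming that plain comma stripping is an acceptable fallback when the requested locale is missing or defines no thousands separator. The names temporary_locale and parse_int are hypothetical helpers, not part of the locale module. Only LC_NUMERIC is changed, and the previous value is restored afterwards:

```python
import locale
from contextlib import contextmanager


@contextmanager
def temporary_locale(name, category=locale.LC_NUMERIC):
    """Switch `category` to `name` inside the block, then restore the old value."""
    saved = locale.setlocale(category)  # one argument: query the current setting
    try:
        locale.setlocale(category, name)  # raises locale.Error if unavailable
        yield
    finally:
        locale.setlocale(category, saved)


def parse_int(text, locale_name="en_US.UTF-8"):
    """Parse a grouped integer such as '1,000,000' (hypothetical helper)."""
    try:
        with temporary_locale(locale_name):
            return locale.atoi(text)
    except (locale.Error, ValueError):
        # Assumption: the locale is missing or has no thousands separator,
        # so commas are treated as thousands separators and stripped.
        return int(text.replace(",", ""))


print(parse_int("1,000,000"))
```

Note that setlocale() still changes process-wide state while the block runs, so this should not be called concurrently from several threads.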
48

There are several ways to parse numbers with thousands separators. And I doubt that the way described by @unutbu is the best in all cases. That's why I list other ways too.

  1. The proper place to call setlocale() is the __main__ module. It is a global setting that affects the whole program and even C extensions (although note that the LC_NUMERIC setting is not set at system level, but is emulated by Python). Read the caveats in the documentation and think twice before going this way. It's probably OK in a single application, but never use it in libraries intended for a wide audience. You should probably also avoid requesting a locale with a particular charset encoding, since it might not be available on some systems.

  2. Use a third-party internationalization library. For example, PyICU allows using any available locale without affecting the whole process (and even parsing numbers with particular thousands separators without using locales):

    NumberFormat.createInstance(Locale('en_US')).parse("1,000,000").getLong()

  3. Write your own parsing function if you don't want to install third-party libraries to do it the "right way". It can be as simple as int(data.replace(',', '')) when strict validation is not needed.
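
For the third option, when strict validation is needed, a regular expression can check the grouping before the commas are removed. This is only a sketch under stated assumptions: the comma is the thousands separator, groups are exactly three digits, and ungrouped input such as "1000" is also accepted. The name parse_grouped_int is made up for illustration:

```python
import re

# Optional sign, then either properly grouped digits ("1,000,000")
# or a plain run of digits without any separators ("1000000").
_GROUPED_INT = re.compile(r"[+-]?(?:\d{1,3}(?:,\d{3})+|\d+)")


def parse_grouped_int(text):
    """Convert a string such as '1,000,000' to int, rejecting bad grouping."""
    stripped = text.strip()
    if not _GROUPED_INT.fullmatch(stripped):
        raise ValueError(f"invalid grouped integer: {text!r}")
    return int(stripped.replace(",", ""))


print(parse_grouped_int("1,000,000"))
```

With this check, a string like "1,00,000" raises ValueError instead of being silently read as 100000, which is what the bare replace() approach would do.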

jgritty
Denis Otkidach
  • 2
    +1 for recommending the simple way. That's all I needed when I had this same problem. – Michael Kristofik Jul 11 '11 at 14:21
  • Edited to fix a typo (`setlocate` should be `setlocale`). Also, +1. – Mark Dickinson Apr 28 '14 at 15:31
  • Shameless self-promotion, I did use the third option. So if someone is interested, have a look at [**this question/answer**](https://stackoverflow.com/questions/48843193/convert-a-number-using-atof/48845430#48845430) – Jan Feb 17 '18 at 20:11
20

Replace the commas with empty strings, and turn the resulting string into an int or a float.

>>> a = '1,000,000'
>>> int(a.replace(',', ''))
1000000
>>> float(a.replace(',', ''))
1000000.0
Taran
Cody Piersall
  • 23
    Please, read again the OP question. In particular where he says: "I could replace the commas with empty strings before I try to convert it, but that feels wrong somehow. Is there a better way?" – joaquin Mar 27 '14 at 12:23
3

I got locale error from accepted answer, but the following change works here in Finland (Windows XP):

import locale
locale.setlocale(locale.LC_ALL, 'english_USA')
print(locale.atoi('1,000,000'))
# 1000000
print(locale.atof('1,000,000.53'))
# 1000000.53
Rejected
Tony Veijalainen
3

This works:

(A dirty but quick way)

>>> a='-1,234,567,89.0123'
>>> "".join(a.split(","))
'-123456789.0123'
Wizmann
1

I tried this. It goes a bit beyond the question: You get an input. It will be converted to string first (if it is a list, for example from Beautiful soup); then to int, then to float.

It goes as far as it can. In the worst case, it returns the input unconverted, as a string.

import locale

def to_normal(soupCell):
    '''Converts an HTML cell from Beautiful Soup to text, then to int, then to float: as far as it gets.
    US thousands separators are taken into account.'''

    locale.setlocale(locale.LC_ALL, 'english_USA')

    # str() replaces the Python 2 unicode() call
    output = str(soupCell.findAll(text=True)[0].string)
    try:
        return locale.atoi(output)
    except ValueError:
        try:
            return locale.atof(output)
        except ValueError:
            return output
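
The same "as far as it gets" chain can be sketched without Beautiful Soup or locale, assuming the input is already a plain string and the comma is the thousands separator. The name to_number and its thousands parameter are hypothetical:

```python
def to_number(text, thousands=","):
    """Return an int if possible, else a float, else the original string."""
    cleaned = text.strip().replace(thousands, "")
    for convert in (int, float):
        try:
            return convert(cleaned)
        except ValueError:
            continue  # try the next, more permissive converter
    return text  # nothing worked: hand the input back unchanged


print(to_number("1,000,000"))
print(to_number("1,000,000.53"))
print(to_number("n/a"))
```

Keep in mind that float() also accepts strings such as "nan" and "inf", so those come back as floats rather than as unconverted text.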
U13-Forward
Anderas
0
>>> import locale
>>> locale.setlocale(locale.LC_ALL, "")
'en_US.UTF-8'
>>> print(locale.atoi('1,000,000'))
1000000
>>> print(locale.atof('1,000,000.53'))
1000000.53

This was done on Linux with a US locale.

cs95
Suresh
0

A little late, but the babel library has parse_decimal and parse_number which do exactly what you want:

>>> from babel.numbers import parse_decimal, parse_number
>>> parse_decimal('10,3453', locale='es_ES')
Decimal('10.3453')
>>> parse_number('20.457', locale='es_ES')
20457
>>> parse_decimal('10,3453', locale='es_MX')
Decimal('103453')

You can also pass a Locale object instead of a string:

>>> from babel import Locale
>>> parse_decimal('10,3453', locale=Locale('es_MX'))
Decimal('103453')
jcf
0

If you're using pandas and you're trying to parse a CSV that includes numbers with a comma for thousands separators, you can just pass the keyword argument thousands=',' like so:

df = pd.read_csv('your_file.csv', thousands=',')
Zoltán
0

Not the shortest solution, but for the sake of completeness and maybe interesting if you want to rely on an existing function that has been proven a million times: you can leverage pandas by injecting your number as StringIO to its read_csv() function (it has a C backend, so the conversion functionality cannot be leveraged directly - as far as I know).

>>> from io import StringIO
>>> import pandas as pd
>>> float(pd.read_csv(StringIO("1,000.23"), sep=";", thousands=",", header=None).iloc[0, 0])
1000.23
Robert
-1

Try this:

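def changenum(data):
    # Build a copy of the string without the thousands separators.
    foo = ""
    for i in data:
        if i == ",":
            continue
        foo += i
    # int() accepts whole numbers only; the result is returned as a float.
    return float(int(foo))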
U13-Forward
tintin
  • 2
    Some explanation to go with that code? A bowl of soup is usually served with a soup spoon – cs95 Jan 16 '19 at 03:53