I've got a dataframe column GDP/year from a dataset about suicides over some years. The data type of this column is currently object (string), but I want it as integer.

The values are comma-separated, so I can't convert them to integers directly. I tried stripping the commas, converting to integer, and then reintroducing the commas, but the dtype reverts to object.

The dataset: https://www.kaggle.com/russellyates88/suicide-rates-overview-1985-to-2016

```python
# convert to int...
suicides[' gdp_for_year ($) '] = suicides[' gdp_for_year ($) '].str.replace(',','').astype(int)
# now reformat with commas as thousands separator...
suicides[' gdp_for_year ($) '] = suicides[' gdp_for_year ($) '].astype(int).apply(lambda x: "{:,}".format(x))
# ...wanted to get dtype integer, but it's back to object
```
smci
Andreea Elena
  • I am not sure why you run the second line of the code. The first converts the data to `int` dtype, which is what you want. The second line will convert it back to string, as @rusu_ro1 said – user7440787 Aug 30 '19 at 19:39
  • I think you should also add the pandas tag – kederrac Aug 30 '19 at 19:48
  • Dtype "object" here means the column holds strings. You need to **distinguish between the underlying data (e.g. the integer 1234) and its (string) representation, e.g. `1,234`**. pandas allows you to define custom formatters on a per-column basis, which is what you're asking for here. **You can (and should) store integer data as integer data; just define a custom formatter for it**. For the underlying formatting code, see [How to print number with commas as thousands separators?](https://stackoverflow.com/questions/1823058/how-to-print-number-with-commas-as-thousands-separators) – smci Aug 30 '19 at 19:52
  • Related: [How to display pandas DataFrame of floats using a format string for columns?](https://stackoverflow.com/questions/1823058/how-to-print-number-with-commas-as-thousands-separators) – smci Aug 30 '19 at 20:17
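
The distinction the comments draw between stored data and its string representation can be sketched roughly as follows. This is only an illustration: the two-row frame is a made-up stand-in for the Kaggle file, and `as_text` is a hypothetical name for a throwaway display copy.

```python
import pandas as pd

col = ' gdp_for_year ($) '

# Made-up stand-in for the Kaggle data; only the column name comes from the question.
suicides = pd.DataFrame({col: ['2,156,624,900', '63,067,077,179']})

# Strip the thousands separators and store real integers.
# int64 is requested explicitly because these values do not fit in 32 bits.
suicides[col] = suicides[col].str.replace(',', '', regex=False).astype('int64')

print(suicides[col].dtype)  # int64
print(suicides[col].sum())  # arithmetic works on the stored integers

# Formatting always yields strings, so keep the formatted copy separate
# instead of writing it back into the integer column.
as_text = suicides[col].map('{:,}'.format)
print(as_text.tolist())
```

Writing `as_text` back into the column is what turns the dtype into object again, so the integer column is left untouched here.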

1 Answer

You are converting each element back to a string with `"{:,}".format(x)`, which is why the dtype is object again.

I guess what you really want is for your pandas DataFrame to display numbers with comma separators by default. pandas has a display option for this, but it only applies to float columns:

```python
import pandas as pd

pd.options.display.float_format = '{:,}'.format
```

If you also want this for integer columns, you would have to monkey-patch `pandas.io.formats.format.IntArrayFormatter`.
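
To see what the float option does and does not touch, here is a small sketch with made-up values. It uses `pd.option_context` so the setting is only active inside the `with` block, which is a choice made for this example and not something the answer requires.

```python
import pandas as pd

# Made-up values, for illustration only.
df = pd.DataFrame({'as_float': [1234567.5], 'as_int': [1234567]})

# option_context applies the float format temporarily instead of globally.
with pd.option_context('display.float_format', '{:,}'.format):
    rendered = df.to_string()

# The float column is rendered with commas; the integer column is not.
print(rendered)
```

The underlying values and dtypes are unchanged; only the rendered text differs.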

kederrac
  • Note this causes **all** columns to use `,` as thousands separator. OP said they only wanted the formatting on this one specific column. So you need a custom formatter for this column. – smci Aug 30 '19 at 20:15
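
One way to get a formatter for just this one column, as the comment above suggests, is the `formatters` argument of `DataFrame.to_string`. The sketch below assumes plain-text output is acceptable and uses made-up values; only the column names mirror the dataset.

```python
import pandas as pd

gdp_col = ' gdp_for_year ($) '

# Made-up values; the column names mirror the Kaggle dataset in the question.
suicides = pd.DataFrame({
    'year': [1987, 1988],
    gdp_col: [2156624900, 2126000000],
})

# formatters maps a column name to a function that turns one value into a string.
# Only the listed column is affected, and only in the rendered text.
rendered = suicides.to_string(formatters={gdp_col: '{:,}'.format})

print(rendered)
print(suicides.dtypes)  # both columns keep their integer dtype
```

In a notebook, `DataFrame.style.format` accepts a similar per-column mapping, at the cost of an extra dependency on jinja2.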