How can I convert a price column to integers?

code:

car_sales["Total Sales"] = car_sales["Price"].astype(int).cumsum()
car_sales

error:

ValueError                                Traceback (most recent call last)
<ipython-input-124-b84f0a711067> in <module>
----> 1 car_sales["Total Sales"] = car_sales["Price"].astype(int).cumsum()
      2 car_sales

~\anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors)
   5696         else:
   5697             # else, only a single dtype is given
-> 5698             new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors)
   5699             return self._constructor(new_data).__finalize__(self)
   5700 

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in astype(self, dtype, copy, errors)
    580 
    581     def astype(self, dtype, copy: bool = False, errors: str = "raise"):
--> 582         return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
    583 
    584     def convert(self, **kwargs):

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in apply(self, f, filter, **kwargs)
    440                 applied = b.apply(f, **kwargs)
    441             else:
--> 442                 applied = getattr(b, f)(**kwargs)
    443             result_blocks = _extend_blocks(applied, result_blocks)
    444 

~\anaconda3\lib\site-packages\pandas\core\internals\blocks.py in astype(self, dtype, copy, errors)
    623             vals1d = values.ravel()
    624             try:
--> 625                 values = astype_nansafe(vals1d, dtype, copy=True)
    626             except (ValueError, TypeError):
    627                 # e.g. astype_nansafe can fail on object-dtype of strings

~\anaconda3\lib\site-packages\pandas\core\dtypes\cast.py in astype_nansafe(arr, dtype, copy, skipna)
    872         # work around NumPy brokenness, #1987
    873         if np.issubdtype(dtype.type, np.integer):
--> 874             return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
    875 
    876         # if we have a datetime/timedelta array of objects

pandas\_libs\lib.pyx in pandas._libs.lib.astype_intsafe()

ValueError: invalid literal for int() with base 10: ' 4 00'
  • Welcome to StackOverflow! Please read up on this. People won't be able to help you as effectively otherwise. https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – David Erickson Jun 18 '20 at 22:23
  • Also, read the error message! It is screaming a hint at you as to what the problem is! HINT: `ValueError: invalid literal for int() with base 10: ' 4 00'` So, based off this information... there is a value in your dataset... that Pandas does not know how to convert to an int. That value is ' 4 00', which is a string. Depending on what ' 4 00' should equal (perhaps `4.00` or `400`), you are going to have to have some additional logic to clean up your data before you can apply `int` format. – David Erickson Jun 18 '20 at 22:25
  • thanks appreciate it – Debasis Paul Jun 18 '20 at 22:31

1 Answer

pandas has a to_numeric function for this:

car_sales["Total Sales"] = pd.to_numeric(car_sales["Price"], errors='coerce').cumsum()

Note that with errors='coerce' this returns NaN for ' 4 00', so be careful: clean the data first, as David Erickson suggests in the comments.
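To see which values would become NaN before deciding how to clean them, one option is to coerce and then filter on the missing entries. This is only a minimal sketch: the car_sales frame below is made up for illustration, and only ' 4 00' comes from the error message.

```python
import pandas as pd

# Hypothetical sample data; only ' 4 00' is taken from the error above.
car_sales = pd.DataFrame({"Price": ["4000", "5000", " 4 00", "7500"]})

# errors="coerce" turns anything unparseable into NaN instead of raising.
converted = pd.to_numeric(car_sales["Price"], errors="coerce")

# Original strings for the rows that could not be parsed.
bad_rows = car_sales.loc[converted.isna(), "Price"]
print(bad_rows.tolist())
```

Checking this first matters because cumsum skips NaN by default, so a coerced row would be left out of the running total without any error.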

For example, if the spaces between digits are all supposed to be decimal points, then

car_sales["Price"] = car_sales["Price"].str.strip().str.replace(' ', '.')

should work, as long as it runs before the column is converted from object dtype.
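Putting the cleanup and the conversion together might look like the sketch below. It assumes an inner space stands for a decimal point (so ' 4 00' means 4.00), which is a guess about the data; the sample frame is hypothetical apart from that one value.

```python
import pandas as pd

# Hypothetical sample data; only ' 4 00' is taken from the error above.
car_sales = pd.DataFrame({"Price": ["4000", "5000", " 4 00", "7500"]})

# Assumption: an inner space stands for a decimal point, so ' 4 00' means 4.00.
cleaned = (
    car_sales["Price"]
    .str.strip()                         # drop leading/trailing spaces first
    .str.replace(" ", ".", regex=False)  # literal replacement, not a regex
)

# Convert to numbers, then to int (astype(int) truncates any fractional part).
car_sales["Price"] = pd.to_numeric(cleaned).astype(int)
car_sales["Total Sales"] = car_sales["Price"].cumsum()
print(car_sales)
```

If the spaces are stray separators instead (so ' 4 00' means 400), replacing " " with "" rather than "." would be the equivalent cleanup.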

bbd108