I selected a range of columns from my DataFrame and converted it to an array:

cf = df.iloc[:,1:12]
cf = cf.values
print(cf)

which gives me

[['$0.00 ' '$771.98 ' '$0.00 ' ..., '$771.98 ' '$0.00 ' '$1,543.96 ']
 ['$1,320.83 ' '$4,782.33 ' '$1,320.83 ' ..., '$1,954.45 ' '$0.00 '
  '$1,954.45 ']
 ['$2,043.61 ' '$0.00 ' '$4,087.22 ' ..., '$4,662.30 ' '$2,907.82 '
  '$1,549.53 ']
 ..., 
 ['$427.60 ' '$0.00 ' '$427.60 ' ..., '$427.60 ' '$0.00 ' '$427.60 ']
 ['$868.58 ' '$1,737.16 ' '$0.00 ' ..., '$868.58 ' '$868.58 ' '$868.58 ']
 ['$0.00 ' '$1,590.07 ' '$0.00 ' ..., '$787.75 ' '$0.00 ' '$0.00 ']]

I need these values to be floats. This is not a duplicate of the linked questions, since the `cf` variable is a NumPy ndarray, not a DataFrame.

I tried doing this:

cf = df.iloc[:,1:12].replace('[\$,]', '', regex=True).astype(float)
cf = cf.values
print(cf)

But I get this error:

ValueError                                Traceback (most recent call last)
<ipython-input-152-f5009cb31652> in <module>()
      1 # Place as_of_date and cash flows into an unordered_map or dictionary
----> 2 cf = df.iloc[:,1:12].replace('[\$,]', '', regex=True).astype(float)
      3 cf = cf.values
      4 print(cf)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
     89                 else:
     90                     kwargs[new_arg_name] = new_arg_value
---> 91             return func(*args, **kwargs)
     92         return wrapper
     93     return _deprecate_kwarg

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors, **kwargs)
   3408         # else, only a single dtype is given
   3409         new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors,
-> 3410                                      **kwargs)
   3411         return self._constructor(new_data).__finalize__(self)
   3412 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py in astype(self, dtype, **kwargs)
   3222 
   3223     def astype(self, dtype, **kwargs):
-> 3224         return self.apply('astype', dtype=dtype, **kwargs)
   3225 
   3226     def convert(self, **kwargs):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
   3089 
   3090             kwargs['mgr'] = self
-> 3091             applied = getattr(b, f)(**kwargs)
   3092             result_blocks = _extend_blocks(applied, result_blocks)
   3093 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py in astype(self, dtype, copy, errors, values, **kwargs)
    469     def astype(self, dtype, copy=False, errors='raise', values=None, **kwargs):
    470         return self._astype(dtype, copy=copy, errors=errors, values=values,
--> 471                             **kwargs)
    472 
    473     def _astype(self, dtype, copy=False, errors='raise', values=None,

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py in _astype(self, dtype, copy, errors, values, klass, mgr, raise_on_error, **kwargs)
    519 
    520                 # _astype_nansafe works fine with 1-d only
--> 521                 values = astype_nansafe(values.ravel(), dtype, copy=True)
    522                 values = values.reshape(self.shape)
    523 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\dtypes\cast.py in astype_nansafe(arr, dtype, copy)
    634 
    635     if copy:
--> 636         return arr.astype(dtype)
    637     return arr.view(dtype)
    638 

ValueError: could not convert string to float: '(641.99)'

I am not sure how to fix this. Please revise the answers so I can resolve this and move on.

Following the suggested answer, I tried this:

cf = df.iloc[:,1:12].replace('[^0-9]', '', regex=True).astype(float)
cf = cf.values
print(cf)

which gives me this:

[[      0.   77198.       0. ...,   77198.       0.  154396.]
 [ 132083.  478233.  132083. ...,  195445.       0.  195445.]
 [ 204361.       0.  408722. ...,  466230.  290782.  154953.]
 ..., 
 [  42760.       0.   42760. ...,   42760.       0.   42760.]
 [  86858.  173716.       0. ...,   86858.   86858.   86858.]
 [      0.  159007.       0. ...,   78775.       0.       0.]]

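These values are incorrect: the decimal point has been stripped along with the other characters, so for example '$771.98 ' became 77198. instead of 771.98.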

  • Possible duplicate of [converting currency with $ to numbers in Python pandas](https://stackoverflow.com/questions/32464280/converting-currency-with-to-numbers-in-python-pandas) – Alex Nov 16 '18 at 00:04
  • Possible duplicate of [Get a subset of a data frame into a matrix](https://stackoverflow.com/questions/53329187/get-a-subset-of-a-data-frame-into-a-matrix) – Evan Nov 16 '18 at 00:13
  • I answered this in your other question. https://stackoverflow.com/questions/53329187/get-a-subset-of-a-data-frame-into-a-matrix/53329291#53329291 – Evan Nov 16 '18 at 00:13
  • `cf` is a DataFrame before you make it not one. Use `cf = cf.replace('[\$,]', '', regex=True).astype(float)` before `cf = cf.values` – Alex Nov 16 '18 at 00:21

1 Answer

You could do:

print(df.replace(r'[\$,]', '', regex=True).astype(float))

That gives the desired floats as long as the only extra characters in the values are `$` and `,`.

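Update:

Since some values contain other characters, such as the parentheses in '(641.99)', keep only the digits and the decimal point instead:

df = df.replace(r'[^0-9.]', '', regex=True).astype(float)
print(df)

`replace` returns a new DataFrame rather than modifying `df` in place, so the result has to be assigned back before `print(df)` shows the converted values. Note that this pattern also discards the parentheses, so '(641.99)' comes out as the positive value 641.99.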
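If the parenthesized values are meant as negative amounts (an assumption here, based on the common accounting convention; the question does not say), a minimal sketch along these lines may be closer to what is needed. The sample `df` and the helper name `currency_to_float` are made up for illustration; in the question the frame would be `df.iloc[:, 1:12]`.

```python
import pandas as pd

# Hypothetical sample standing in for df.iloc[:, 1:12] from the question.
df = pd.DataFrame({
    "a": ["$0.00 ", "$1,320.83 ", "(641.99)"],
    "b": ["$771.98 ", "$4,782.33 ", "$0.00 "],
})

def currency_to_float(col):
    """Convert one column of currency strings to floats."""
    s = col.astype(str).str.strip()
    # Assumption: '(641.99)' denotes a negative amount, so rewrite it as '-641.99'.
    s = s.str.replace(r"^\((.*)\)$", r"-\1", regex=True)
    # Drop the dollar sign, thousands separators and any leftover whitespace.
    s = s.str.replace(r"[$,\s]", "", regex=True)
    return s.astype(float)

# Apply column by column, then take the underlying NumPy array.
cf = df.apply(currency_to_float).values
print(cf)
```

Because the decimal point and the minus sign are never removed, '$771.98 ' stays 771.98 rather than turning into 77198, and '(641.99)' keeps its sign.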

U13-Forward