I'm trying to convert the .values array I have into a new array by applying a condition to each element, but I keep getting the wrong result. I would appreciate the help!

Here is the .values:

Y = df['GDP_growth'].values
array(['3.299991384', '-1.760010328', '5.155440545', '4.019541839',
       '0.801760179', '7.200000003', '3.727818428', '0.883846197'], dtype=object)

Here is the command that builds the new array and gives the wrong result:

Y = np.array([1 if y>= 3 else 0 for y in Y])

In my case, the problem is that every element comes out as 1.

DilithiumMatrix
Jonathan S
  • And what does the error say? `for` should come before `if` in list comprehensions. Also your `np.array` is filled with strings and not numbers. You have a number of things to fix here... – Julien Nov 30 '15 at 05:09
  • nevermind for the `for` / `if` order, read it too quickly. – Julien Nov 30 '15 at 05:21

2 Answers

You could use NumPy boolean indexing, but first you need to convert the array's dtype from object (strings) to float:

import numpy as np
Y = np.array(['3.299991384', '-1.760010328', '5.155440545', '4.019541839',
   '0.801760179', '7.200000003', '3.727818428', '0.883846197'], dtype=object)
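Y = Y.astype(float)  # parse the strings as floats

# Assign 0 first, then 1: the array is overwritten in place, so the order matters
Y[Y<=3] = 0
Y[Y>3] = 1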

In [67]: Y
Out[67]: array([ 1.,  0.,  1.,  1.,  0.,  1.,  1.,  0.])
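
As an alternative sketch (not the code above), `np.where` can build the 0/1 array in one step without overwriting `Y` in place, and it can use the `>= 3` threshold from the question. The names `Y_float` and `labels` are only illustrative.

```python
import numpy as np

Y = np.array(['3.299991384', '-1.760010328', '5.155440545', '4.019541839',
              '0.801760179', '7.200000003', '3.727818428', '0.883846197'],
             dtype=object)

# Convert the strings to floats first; comparing strings to 3 is not a numeric comparison.
Y_float = Y.astype(float)

# 1 where growth is at least 3, otherwise 0 (integer result).
labels = np.where(Y_float >= 3, 1, 0)
print(labels)  # [1 0 1 1 0 1 1 0]
```

Note that this keeps the original float values in `Y_float`, which may be handy if they are needed later.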

EDIT

If your data needs preprocessing before it can be converted to numeric values, you can use pd.to_numeric with errors='coerce' and then dropna, either on the series of interest or on the whole dataframe. For example, for a series:

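import pandas as pd
z = pd.Series(Y)  # Y here is the original array of strings (dtype=object)
z[0] = 'a'        # inject a non-numeric value for illustration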

In [293]: z
Out[293]:
0               a
1    -1.760010328
2     5.155440545
3     4.019541839
4     0.801760179
5     7.200000003
6     3.727818428
7     0.883846197
dtype: object

pd.to_numeric(z, errors='coerce').dropna() 

In [296]: pd.to_numeric(z, errors='coerce').dropna()
Out[296]:
1   -1.760010
2    5.155441
3    4.019542
4    0.801760
5    7.200000
6    3.727818
7    0.883846
dtype: float64  
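
Putting the cleaning step and the threshold together might look like the sketch below. It assumes the bad entries are placeholder strings such as '..'; the sample values and the names `raw`, `clean` and `flags` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample with one '..' placeholder among numeric strings.
raw = pd.Series(['3.299991384', '..', '5.155440545', '0.801760179'])

# Entries that cannot be parsed become NaN, and dropna removes them.
clean = pd.to_numeric(raw, errors='coerce').dropna()

# Boolean comparison on the cleaned floats, cast to 0/1 integers.
flags = (clean >= 3).astype(int)
print(flags.tolist())  # [1, 1, 0]
```

The result keeps the original index labels (0, 2, 3 here), so it can still be aligned with the rows it came from.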
Anton Protopopov
  • Unfortunately it still fails with this error: `ValueError: could not convert string to float: ..`, raised at `Y = Y.astype(float)`. – Jonathan S Nov 30 '15 at 08:38

Figured it out! Apparently I had some missing values denoted as '..', so I had to clean the data first by dropping those rows. After that, .astype(float) works.
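
A rough sketch of that fix, assuming the placeholder is exactly the string '..'. The `country` column and the sample rows are invented for illustration; only `GDP_growth` comes from my data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame standing in for the real data.
df = pd.DataFrame({
    'country': ['A', 'B', 'C', 'D'],
    'GDP_growth': ['3.299991384', '..', '-1.760010328', '7.200000003'],
})

# Drop the rows whose value is the '..' placeholder.
df = df[df['GDP_growth'] != '..']

# Every remaining string now parses as a float.
Y = df['GDP_growth'].astype(float).values

# The original list comprehension works once the values are numeric.
Y = np.array([1 if y >= 3 else 0 for y in Y])
print(Y)  # [1 0 1]
```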

Jonathan S