My DataFrame is called reviews.

I get the following error when I try to convert the Rating column to integer:

"Cannot convert non-finite values (NA or inf) to integer"

How can I fix it? Here is my code:

reviews.replace([np.inf, -np.inf], np.nan)
reviews.dropna() 

reviews['Rating'].astype('int')
Maor Refaeli
cekik
  • It's hard for us to tell what the issue is if we don't know what the Dataframe looks like. – dustintheglass Dec 23 '18 at 12:11
  • Determine what the non-numeric values are, and where they come from. Determine what integer representation would be appropriate. Code that! – Paddy3118 Dec 23 '18 at 12:11
  • @Gokce, you should accept the answer; that marks the question as answered and removes it from the unanswered queue. You can also upvote. – Karn Kumar Dec 23 '18 at 13:24
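
Following the suggestion in the comments to first find out what the non-finite values are, a minimal sketch could look like this. The sample data is made up for illustration; only the `Rating` column name comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data; only the column name is from the question.
reviews = pd.DataFrame({"Rating": [4.0, np.nan, 5.0, np.inf, 3.0]})

# np.isfinite is False for NaN, inf and -inf, so its negation flags
# every value that cannot be cast to int.
bad_mask = ~np.isfinite(reviews["Rating"])

print(reviews[bad_mask])    # the offending rows
print(int(bad_mask.sum()))  # how many there are
```

Looking at the flagged rows first helps decide whether dropping them is acceptable or whether they should be filled with some default instead.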

2 Answers

The simplest way is to first replace the infs with NaN and then use dropna:

Example DataFrame:

>>> df = pd.DataFrame({'col1':[1, 2, 3, 4, 5, np.inf, -np.inf], 'col2':[6, 7, 8, 9, 10, np.inf, -np.inf]})

>>> df
       col1       col2
0  1.000000   6.000000
1  2.000000   7.000000
2  3.000000   8.000000
3  4.000000   9.000000
4  5.000000  10.000000
5       inf        inf
6      -inf       -inf

Solution 1:

Create df_new so that you don't lose the original DataFrame; the cleaned result lives separately in df_new.

>>> df_new = df.replace([np.inf, -np.inf], np.nan).dropna(subset=["col1", "col2"], how="all").astype(int)
>>> df_new
   col1  col2
0     1     6
1     2     7
2     3     8
3     4     9
4     5    10
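
One caveat with Solution 1 is that how="all" only drops rows in which every listed column is NaN. The sketch below, on a made-up frame where only one column per row is non-finite, illustrates the difference from the default how="any"; the variable names are placeholders.

```python
import numpy as np
import pandas as pd

# Made-up frame: rows 1 and 2 each have a non-finite value in only one column.
df = pd.DataFrame({"col1": [1.0, np.inf, 3.0], "col2": [6.0, 7.0, np.nan]})
cleaned = df.replace([np.inf, -np.inf], np.nan)

# how="all" keeps every row here, because no row is NaN in both columns.
kept_all = cleaned.dropna(how="all")

# how="any" (the default) drops a row as soon as one column is NaN.
kept_any = cleaned.dropna(how="any")

# Only the fully finite rows can be cast to int safely.
result = kept_any.astype(int)
print(result)
```

If the real data can have a missing value in just one of the columns, how="any" is probably the safer choice before calling astype(int).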

Solution 2:

Using isin and ~ (note that .all(axis='columns') only flags rows in which every column is non-finite):

>>> ff = df.isin([np.inf, -np.inf, np.nan]).all(axis='columns')
>>> df[~ff].astype(int)
   col1  col2
0     1     6
1     2     7
2     3     8
3     4     9
4     5    10

Or, working directly on the original DataFrame: use pd.DataFrame.isin to flag the non-finite values and pd.DataFrame.any to find the rows that contain at least one of them. Finally, use the negated boolean mask to filter the DataFrame.

>>> df = df[~df.isin([np.nan, np.inf, -np.inf]).any(axis=1)].astype(int)
>>> df
   col1  col2
0     1     6
1     2     7
2     3     8
3     4     9
4     5    10

The above is taken from another answer, courtesy of @piRSquared.

Solution 3:

You can also use DataFrame.mask with numpy.isinf, followed by dropna():

>>> df = df.mask(np.isinf(df)).dropna().astype(int)
>>> df
   col1  col2
0     1     6
1     2     7
2     3     8
3     4     9
4     5    10
Karn Kumar

Neither .replace() nor .dropna() works in place (i.e. modifies the existing DataFrame) unless you tell it to. If you do ask for in-place operation, your code works:

reviews.replace([np.inf, -np.inf], np.nan, inplace=True)
reviews.dropna(inplace=True) 

reviews['Rating'] = reviews['Rating'].astype('int')

Or:

reviews = reviews.replace([np.inf, -np.inf], np.nan)
reviews = reviews.dropna() 

reviews['Rating'] = reviews['Rating'].astype('int')
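
As a self-contained illustration of the point above, the following sketch shows that calling replace without assigning the result leaves the frame unchanged, and then cleans only the column being converted so that missing values in unrelated columns do not cost any rows. The sample data and the Text column are hypothetical; only the Rating column name comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the 'Rating' column name comes from the question.
reviews = pd.DataFrame({
    "Rating": [4.0, np.inf, np.nan, 5.0],
    "Text": ["good", "odd", None, "great"],
})

# Without assignment (or inplace=True) the original frame is unchanged.
reviews.replace([np.inf, -np.inf], np.nan)
still_has_inf = bool(np.isinf(reviews["Rating"]).any())

# Clean only the Rating column, drop rows where it is missing,
# then cast just that column to int.
reviews = (
    reviews.assign(Rating=reviews["Rating"].replace([np.inf, -np.inf], np.nan))
    .dropna(subset=["Rating"])
    .astype({"Rating": int})
)

print(still_has_inf)
print(reviews)
```

Restricting dropna to subset=["Rating"] is a design choice for this sketch; a plain dropna() would also discard rows that are missing values in other columns.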
gosuto