import pandas as pd
import numpy as np
data_A=pd.read_csv('D:/data_A.csv')

`data_A` has a column named `time`.

The dtype of `time` is int64. But after I run the code below, `time` has changed to float type.

data_A.loc[data_A['scan'] == 1., data_A.columns.difference(
    ['scan', 'label', 'level','index'])] = np.nan

data_A.loc[data_A['NH4'] < 0., 'NH4'] = np.nan
data_A.loc[data_A['NH4'] > 10., 'NH4'] = np.nan
data_A.loc[data_A['NH4_Y']<0, 'NH4_Y'] = np.nan
data_A.loc[data_A['NH4_Y']>100, 'NH4_Y'] = np.nan

data_A.loc[data_A['TOC_Y']<0, 'TOC_Y'] = np.nan
data_A.loc[data_A['TOC_Y']>20000, 'TOC_Y'] = np.nan

data_A.loc[data_A['SS_Y']<0, 'SS_Y'] = np.nan
data_A.loc[data_A['SS_Y']>20000, 'SS_Y'] = np.nan

data_A.loc[data_A['TEMP_Y']<0, 'TEMP_Y'] = np.nan

data_A['level'] = data_A['level'].astype(int)

data_A['NH4'] = data_A['NH4'].interpolate(method='slinear')

I didn't do anything to the column `time`, but it changed to float type. I want `time` to be int type. Is there any way to convert it back to int?

Lee
  • Have you tried casting it to int? Use int() to do so. – Joel Nov 26 '21 at 12:05
  • I tried `data_A.time.astype(int)`, but it gave me an error message: `ValueError: Cannot convert non-finite values (NA or inf) to integer` – Lee Nov 26 '21 at 12:10
  • I think there might be some NA or inf values. I want to ignore NA and inf and change the other values to int. – Lee Nov 26 '21 at 12:12

2 Answers

Try casting it to int using:

int(<variable>)

Note: Casting a float to int may lead to a loss of data. Casting a float like 3.0 to int is safe, but casting a float like 3.2 (which has a fractional part) results in 3, so the fractional part is lost.

Data loss is nothing to worry about if you aren't working with decimals.
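
For a whole pandas column, the built-in `int()` does not apply directly (it raises a `TypeError` on a multi-element Series), and `astype(int)` fails once the column contains NaN, as the comments above report. One possible alternative, sketched here with a made-up stand-in for the `time` column, is pandas' nullable integer dtype `"Int64"` (capital I), which keeps missing entries as `pd.NA` while the other values become integers:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the `time` column after some rows were set to NaN.
time = pd.Series([1637928000.0, np.nan, 1637928120.0], name="time")

# The nullable extension dtype "Int64" stores missing values as pd.NA,
# so the cast succeeds even though one entry is NaN.
# It requires every non-missing float to be a whole number.
time_int = time.astype("Int64")

print(time_int.dtype)
print(time_int.isna().sum())
```

Applied to the question's frame, this would presumably read `data_A['time'] = data_A['time'].astype('Int64')`, assuming every non-missing value in `time` is a whole number.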

Joel

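Your very first statement, which takes the column difference and sets those columns to np.nan, is the culprit: `time` is not in the excluded list, so it receives NaN as well.

np.nan is actually a float (integer dtypes have no NaN representation), so a column that receives it is converted to float. If you need a placeholder value, try an integer constant such as -99999999.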
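
A minimal sketch of one way to act on this, assuming the goal is simply to leave `time` untouched: add `time` to the list of columns excluded from the blanking step. The small frame below is hypothetical and its values are made up; only the column names `time`, `scan`, and `NH4` come from the question.

```python
import numpy as np
import pandas as pd

# np.nan is a plain Python float, which is why an int64 column cannot hold it.
print(type(np.nan))

# Hypothetical miniature of data_A.
df = pd.DataFrame({
    "time": [100, 200, 300],
    "scan": [0.0, 1.0, 0.0],
    "NH4": [1.5, 2.5, 3.5],
})

# Assumption: `time` should survive the blanking, so it joins the exclusion list.
cols = df.columns.difference(["scan", "time"])
df.loc[df["scan"] == 1.0, cols] = np.nan

# `time` was never assigned NaN, so it keeps its integer dtype.
print(df["time"].dtype)
```

If the rows where `scan == 1` really must have a blank `time`, the sentinel constant suggested above or a nullable integer dtype would be the alternatives.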