For some reason, when I import my csv file with pd.read_csv, one of my integer columns (number of followers) is read in scientific notation, even though my values are whole numbers and clearly not in scientific notation.
See below what I see when I call df["num_followers"].describe()
I've looked at all the answers for "suppress scientific notation" on here but haven't found any solution that works.
df['num_followers'].apply(lambda x: '{:.2f}'.format(x)) simply turned my values to str. I tried converting with astype("float"), with no success: the values are still in scientific notation, which is messing up my calculations. Any ideas how I can change it to int?
count 1.200000e+02
mean 4.959472e+04
std 3.816126e+05
min 0.000000e+00
25% 6.725000e+01
50% 2.165000e+02
75% 5.932500e+02
max 4.021842e+06
Name: num_followers, dtype: float64
EDIT
I tried one of the answers below, also to no success:
IN: df_train = pd.read_csv("social_media_train.csv", index_col = [0])
df_train["num_followers"].describe()
OUT: count 5.760000e+02
mean 8.530724e+04
std 9.101485e+05
min 0.000000e+00
25% 3.900000e+01
50% 1.505000e+02
75% 7.160000e+02
max 1.533854e+07
Name: num_followers, dtype: float64
IN: df_train['num_followers'] = df_train['num_followers'].apply(np.int64)
df_train["num_followers"].describe()
OUT: count 5.760000e+02
mean 8.530724e+04
std 9.101485e+05
min 0.000000e+00
25% 3.900000e+01
50% 1.505000e+02
75% 7.160000e+02
max 1.533854e+07
Name: num_followers, dtype: float64
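A minimal sketch of what may be going on, assuming the scientific notation comes from describe() rather than from the column itself: describe() computes a mean and standard deviation, so the summary it returns is float64 even when the column holds integers. The small DataFrame below is hypothetical stand-in data, not the contents of social_media_train.csv.

```python
import pandas as pd

# Hypothetical stand-in data; the real CSV is not reproduced here.
df = pd.DataFrame({"num_followers": [0.0, 39.0, 150.0, 716.0, 15338538.0]})

# Convert the column itself to integers (only safe if there are no NaN values).
df["num_followers"] = df["num_followers"].astype("int64")
col_dtype = df["num_followers"].dtype  # dtype of the stored data

# describe() returns a float64 Series regardless of the column's dtype,
# which is why its printed output can still use scientific notation.
summary = df["num_followers"].describe()
summary_dtype = summary.dtype

# Option 1: format the summary as strings, for display only.
formatted = summary.apply(lambda x: f"{x:.2f}")

# Option 2: change how floats are printed inside this block only,
# leaving the underlying values untouched.
with pd.option_context("display.float_format", "{:.2f}".format):
    print(summary)

print(col_dtype, summary_dtype)
print(formatted)
```

Checking df["num_followers"].dtype directly, rather than the describe() output, shows whether the conversion to int took effect. Calculations use the stored values either way, so the notation in the printed summary should not affect them.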