7

I am a beginner in Python and am trying to get the rows from a dataset that have the highest IMDb rating and the highest gross total. I have managed to get them, but the value of gross_total isn't an integer. How can I convert it to an integer, and how do I get that specific value so I can perform statistical functions on it?

import pandas as pd

dataset = pd.read_excel('movies.xls')

name = dataset['Title']
idmb = dataset['IMDB Score']
networth = dataset['Gross Earnings']

test_df = pd.DataFrame({'movie': name,
                        'rating': idmb,
                        'gross_total': networth})

nds = test_df.dropna(axis=0, how='any')

a = nds['gross_total'].astype(int)

highest_rating = nds.loc[nds['rating'].idxmax()]

highiest_networth = nds.loc[nds['gross_total'].idxmax()]

print(highest_rating)

print(highiest_networth)

I get this output:

  gross_total                  2.83415e+07
  movie          The Shawshank Redemption 
  rating                               9.3
  Name: 742, dtype: object

I have searched and found the "pd.to_numeric" and "astype" functions, but I couldn't understand how to use them in this situation.

Muhammad Ahmed

4 Answers

9

This worked for me, worth giving a try:

df['col_name'] = df['col_name'].astype('int64') 
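
Applied to the question's situation, a minimal sketch might look like the following. The three-row DataFrame is made-up stand-in data, not the real movies.xls, and the variable names top_label and top_gross are hypothetical. The converted column is assigned back, and .loc with a row label plus a column name pulls out a single value:

```python
import pandas as pd

# Made-up stand-in for the question's data (hypothetical values).
test_df = pd.DataFrame({
    "movie": ["Movie A", "Movie B", "Movie C"],
    "rating": [9.3, 8.1, 7.9],
    "gross_total": [28341469.0, 760505847.0, None],
})

# Drop rows with missing values first: NaN cannot be cast to int64.
nds = test_df.dropna(axis=0, how="any").copy()

# astype returns a new Series, so assign it back to the column.
nds["gross_total"] = nds["gross_total"].astype("int64")

# Label of the top-rated row, then the gross_total value in that row.
top_label = nds["rating"].idxmax()
top_gross = nds.loc[top_label, "gross_total"]

print(top_gross)                  # a single integer value
print(nds["gross_total"].mean())  # statistics work on the whole column
```

Note that the scientific notation in the question's output comes from printing a whole row as a mixed-type Series; selecting the single cell avoids it.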
Shalini Baranwal
5

You format your output accordingly:

n =  2.83415e+07

print(f'{n:f}')
print(f'{n:e}')

Output:

28341500.000000
2.834150e+07

See string format mini language
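
A few more format specs from that mini-language, as a small sketch using the value from the question. The variable names are only illustrative:

```python
n = 2.83415e+07

as_int = int(n)            # real conversion: truncates and returns a Python int
no_decimals = f"{n:.0f}"   # fixed-point notation with zero decimal places
grouped = f"{n:,.0f}"      # the same, with comma thousands separators

print(as_int)
print(no_decimals)
print(grouped)
```

Only int(n) changes the type; the two f-strings produce text and leave n a float.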

Pandas works the same:

import pandas as pd

df = pd.DataFrame ( [{"tata": 2.325568e9}])

# print with default float settings
print (df) 

pd.options.display.float_format = '{:,.4f}'.format  # set other global format
# print with changed float settings
print(df)

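# really convert the type; use an explicit 64-bit integer, because plain int
# can map to a 32-bit integer on some platforms (e.g. Windows with NumPy < 2),
# where 2325568000 overflows to -2147483648
df["tata"] = df["tata"].astype("int64")
# print with default int settings
print(df)

Credit to: unutbu's answer here

Output:

           tata
0  2.325568e+09          # before format change

                tata
0 2,325,568,000.0000     # after format change

         tata            # after int conversion
0  2325568000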

There are other ways to do formatting - see How to display pandas DataFrame of floats using a format string for columns?
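
One such alternative, sketched here under the assumption that you only want to change a single rendering rather than a global option, is the formatters argument of DataFrame.to_string:

```python
import pandas as pd

df = pd.DataFrame([{"tata": 2.325568e9}])

# Format only the "tata" column for this one rendering; the stored values
# stay float64 and no global display option is changed.
text = df.to_string(formatters={"tata": "{:,.0f}".format})
print(text)
```

The DataFrame itself is untouched, so later calculations still see the original floats.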

Patrick Artner
  • 1
    Useful! My issue: I had a decimal in percentage form (call it n), and formatted it as `n.2`. Changing this to `n.2f` rounded it to 2 decimal places, while removing the scientific notation. Thank you! – Gustavo Louis G. Montańo Jan 21 '23 at 23:44
5

I had the same problem. Use

df['Tata'] = df['Tata'].map(int)
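
One caveat: map(int) calls Python's int() on every element, so it raises a ValueError if the column still contains NaN. A minimal sketch with made-up values (the column name Tata comes from this answer; the data is hypothetical):

```python
import pandas as pd

s = pd.Series([28341469.0, None, 760505847.0], name="Tata")

# int(float("nan")) raises ValueError, so mapping over NaN fails.
try:
    s.map(int)
    failed = False
except ValueError:
    failed = True

# Dropping the missing values first makes the conversion safe.
converted = s.dropna().map(int)

print(failed)
print(converted.tolist())
```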
Zoe
Li Wang
1
import pandas as pd

pd.set_option('display.float_format', '{:.2f}'.format)
df = pd.DataFrame({'Traded Value': [67867869890077.96, 78973434444543.44],
                   'Deals': [789797, 789878]})
print(df)

Output:

        Traded Value   Deals
0  67867869890077.96  789797
1  78973434444543.44  789878
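
If changing the option globally is not wanted, pd.option_context can limit it to a block. This is a minimal sketch with made-up values, and the names inside and outside are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({"gross_total": [28341469.0, 760505847.0]})  # made-up values

# The float format applies only inside the with-block.
with pd.option_context("display.float_format", "{:.2f}".format):
    inside = df.to_string()

# Rendered after the block, with the previous float settings restored.
outside = df.to_string()

print(inside)
print(outside)
```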
elena.kim
ALPHA Q