I got the Titanic data from Kaggle, uploaded it to a Google spreadsheet, and read it from Colab. I found that the Age column's Dtype is object, presumably because of missing values (or some other reason). How can I change the Age Dtype to float64?

from google.colab import auth
import pandas as pd
auth.authenticate_user()

import gspread
from oauth2client.client import GoogleCredentials

gc = gspread.authorize(GoogleCredentials.get_application_default())

worksheet = gc.open('titanic_train').sheet1

# get_all_records returns a list of dicts, one per row, keyed by the header row.
datas = worksheet.get_all_records()
print(datas)

pd.DataFrame(datas).info()

I got the following output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  891 non-null    int64  
 1   Survived     891 non-null    int64  
 2   Pclass       891 non-null    int64  
 3   Name         891 non-null    object 
 4   Sex          891 non-null    object 
 5   Age          891 non-null    object 
 6   SibSp        891 non-null    int64  
 7   Parch        891 non-null    int64  
 8   Ticket       891 non-null    object 
 9   Fare         891 non-null    float64
 10  Cabin        891 non-null    object 
 11  Embarked     891 non-null    object 
dtypes: float64(1), int64(5), object(6)
memory usage: 83.7+ KB
bluesky
  • Does this answer your question? [Pandas: convert dtype 'object' to int](https://stackoverflow.com/questions/39173813/pandas-convert-dtype-object-to-int) – Jimmar Sep 24 '20 at 08:00

1 Answer

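You'll want to convert the Age column to a numeric data type with pd.to_numeric. Because Age contains fractional and missing values, the result is float64 rather than an integer type. This can be done as follows: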

df = pd.DataFrame(datas)

df['Age'] = pd.to_numeric(df['Age'])
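
If the column holds text that cannot be parsed as a number, pd.to_numeric raises a ValueError by default. Passing errors='coerce' turns such cells into NaN instead. Here is a minimal sketch of that variant. The small `datas` list is a made-up stand-in for the spreadsheet rows, assuming blank cells come back as empty strings:

```python
import pandas as pd

# Hypothetical stand-in for the rows read from the spreadsheet:
# numeric cells arrive as numbers, blank cells as empty strings (assumption).
datas = [
    {"PassengerId": 1, "Age": 22},
    {"PassengerId": 2, "Age": ""},
    {"PassengerId": 3, "Age": 0.42},
]

df = pd.DataFrame(datas)
print(df["Age"].dtype)  # object, because numbers and strings are mixed

# errors="coerce" replaces anything that is not a number with NaN.
df["Age"] = pd.to_numeric(df["Age"], errors="coerce")
print(df["Age"].dtype)         # float64
print(df["Age"].isna().sum())  # number of missing ages, 1 in this sample
```

After the conversion, df.info() should report Age as float64 with a non-null count lower than 891, which makes the missing values visible.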
mullinscr