I have an Excel file along the lines of:
gdp gdp (2009)
1929 104.6 1056.7
1930 173.6 962.0
1931 72.3 846.6
I want to read in the file and specify that the first column (which has no header information) is an integer. I don't need column B.
I am reading in the file using the following
import pandas as pd
from pandas import ExcelFile
gdp = pd.read_excel('gdpfile.xls', skiprows = 2, parse_cols = "A,C")
This reads in fine, except the years all get turned into floats, e.g. 1929.0, 1930.0, 1931.0. The first two rows are NaN.
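A minimal sketch of the symptom without the Excel file, assuming the two NaN rows are what push the column to float (a plain integer column in pandas cannot hold NaN). The values below are a hypothetical stand-in for what comes back from the read:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the first column as read:
# two empty leading rows followed by the years.
years = pd.Series([np.nan, np.nan, 1929, 1930, 1931])

# The NaN entries force the whole column to a float dtype.
print(years.dtype)
print(years.tolist())
```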
I want to specify that it should be an integer. I have tried adding converters = {"A":int,"C":float} to the read_excel call, as suggested by "Python pandas: how to specify data types when reading an Excel file?", but this did not fix things.
I have also tried converting after the fact, which has worked for me before when converting strings to float, but this did not work either:
gdp.columns = ['Year','GDP 2009']
gdp['Year'] = gdp['Year'].astype(int)
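A possible reason the conversion fails, sketched under the assumption that the leading NaN rows are still in the frame when astype(int) runs: NaN cannot be cast to a plain integer, so pandas raises an error. The DataFrame below is a hypothetical stand-in for the data as read, and dropping the NaN rows first is only one option, not necessarily the right one for the real file:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame after renaming the columns.
gdp = pd.DataFrame({
    "Year": [np.nan, np.nan, 1929.0, 1930.0, 1931.0],
    "GDP 2009": [np.nan, np.nan, 1056.7, 962.0, 846.6],
})

# Casting a column that still contains NaN to int raises a ValueError.
try:
    gdp["Year"].astype(int)
except ValueError as err:
    print("astype(int) failed:", err)

# Drop the rows with a missing year, then cast.
clean = gdp.dropna(subset=["Year"]).copy()
clean["Year"] = clean["Year"].astype(int)
print(clean.dtypes)
```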
I also tried using dtypes = int, as suggested in one of the comments at the above link, but this did not work either.
Note that skiprows is necessary, as my actual Excel file has a few rows at the top that I do not want.