Suppose I have a CSV file named a.csv that looks like this:
foo,bar
aaa,"1234"
bbb,"5678"
I'd like to read in the bar column as strings, as indicated by the double quotes.
However, when I run the following:
import pandas as pd
d = pd.read_csv("a.csv")
print(d.dtypes)
It returns:
foo object
bar int64
dtype: object
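For anyone who wants to reproduce this without creating the file, here is a minimal sketch that feeds the same text through io.StringIO instead of reading a.csv from disk. The csv_text name is only for illustration, and the dtype check at the end is my own addition to make the inferred type explicit.

```python
import io

import pandas as pd

# Same contents as a.csv, held in memory so the snippet runs without a file.
# (csv_text is an illustrative name, not part of the original code.)
csv_text = 'foo,bar\naaa,"1234"\nbbb,"5678"\n'

d = pd.read_csv(io.StringIO(csv_text))
print(d.dtypes)

# With default settings the quotes are consumed during parsing,
# and bar is then inferred as an integer column.
print(pd.api.types.is_integer_dtype(d["bar"]))
```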
I've tried various combinations of parameters (quoting, quotechar), but can't seem to get bar recognized as strings (i.e. having object rather than int64 as its dtype).
Is it possible to achieve this without explicitly specifying the column type via the dtype parameter?
EDIT: I don't believe the referenced question answers my question. That reference describes how strings are stored as objects, but it doesn't explicitly answer why a quoted column, i.e. bar, is NOT cast to strings. Nor does it explain how to get a string column out of bar.
EDIT 2: I should clarify that I don't necessarily need a str type for the bar column; rather, I do NOT want an int64 type for bar. It seems odd to me that, despite the explicit quotes around the bar column entries, Pandas ignores the quotes and selects int64 as bar's type.