I am struggling to understand why I am getting \n1 and \n2 in the output after converting int64 to str.
The Year column goes from 2014 to 0 2014\n1 2015\n2 2015\n3 2016.
This is what I did: df.Year = str(df.Year)
[Screenshots: Before, After, Close-up]
When you use str(df.Year), you get a single string representation of all the rows, not a Series with each row as a string.
To convert each row to a string, you need to use df.Year = df.Year.astype(str)
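To make the difference concrete, here is a minimal sketch. It assumes a small hypothetical DataFrame built from the years shown in the question rather than the asker's actual data, and the variable names (whole, broken, fixed) are only for illustration.

```python
import pandas as pd

# Hypothetical frame mirroring the values shown in the question.
df = pd.DataFrame({"Year": [2014, 2015, 2015, 2016]})

# str() on a Series returns ONE string: the printed form of the whole column,
# including the index and the newlines between rows.
whole = str(df["Year"])
is_single_string = isinstance(whole, str)

# Assigning that single string to the column broadcasts it to every row,
# which is where the "\n1", "\n2" fragments in each cell come from.
broken = df.copy()
broken["Year"] = whole
all_rows_identical = broken["Year"].nunique() == 1

# astype(str) converts element by element instead.
fixed = df.copy()
fixed["Year"] = fixed["Year"].astype(str)
print(fixed["Year"].tolist())
```

In the broken frame every cell holds the same multi-line string, while in the fixed frame each cell holds only its own year as a string.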
This might help with what was intended (converting a column to a specific type): Change column type in pandas. As for why this is happening, let's take an example.
table.csv
Year
2010
2011
2012
Let's consider this snippet:
import pandas as pd
df = pd.read_csv("table.csv")
print(str(df.Year))
The output produced is
0 2010
1 2011
2 2012
Name: Year, dtype: int64
As this link clearly mentions, attribute access against a DataFrame returns a Series object. When you call str() on a Series, it calls Series.__str__ (see: Does python `str()` function call `__str__()` function of a class?), which returns the printed representation of the whole Series as a single string. Therefore, after the assignment shown in the question, that one string is stored in every row of the Year column. As mentioned at the beginning of this post, you can refer to the SO link, which is pretty self-explanatory. However, if you want a code sample, please comment.
Code Snippet
Here is one way to convert a column to string type (inspired by the SO link shared above):
import pandas as pd
df = pd.read_csv("table.csv")
df.Year = df.Year.astype(str)
print(type(df.Year[0])) #Prints <class 'str'>
print(df)