From a couple of other posts, I learned that a simple way to concatenate columns in a dataframe is to use the map method, as in the example below. Since map returns a Series, why can't a regular Series be used directly instead of going through map?
import pandas as pd
df = pd.DataFrame({'a':[1,2,3],'b':[4,5,6]},index=['m','n','o'])
df['x'] = df.a.map(str) + "_x"
   a  b    x
m  1  4  1_x
n  2  5  2_x
o  3  6  3_x
This also works, even though I'm explicitly wrapping the result in a Series:
df['y'] = pd.Series(df.a.map(str)) + "_y"
   a  b    x    y
m  1  4  1_x  1_y
n  2  5  2_x  2_y
o  3  6  3_x  3_y
This doesn't work; it raises a TypeError:
df['z'] = df['a'] + "_z"
TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'str'
This doesn't work either:
df['z'] = pd.Series(df['a']) + "_z"
TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'str'
I checked to see if map returns a different type of object under the hood, but it doesn't seem to:
type(pd.Series(df.a.map(str)))
pandas.core.series.Series
type(pd.Series(df['a']))
pandas.core.series.Series
I'm confused about what map is doing that makes this work and how whatever map does carries over into the subsequent string arithmetic.
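One thing I could still check is the element type rather than the container type. Here is a minimal sketch, assuming the difference lies in what the Series holds (integers versus strings) and not in the Series class itself. The variable names as_int and as_str are just for illustration, and astype(str) is included only as another conversion I believe should behave like map(str) here:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]}, index=['m', 'n', 'o'])

as_int = df['a']            # the original integer column
as_str = df['a'].map(str)   # every element passed through str()

# Both are Series, so type() on the container cannot tell them apart
same_container = type(as_int) is type(as_str)

# The elements inside are what differ: integers versus Python strings
int_is_numeric = pd.api.types.is_integer_dtype(as_int)
first_elem_is_str = isinstance(as_str.iloc[0], str)

print(as_int.dtype, type(as_int.iloc[0]))
print(as_str.dtype, type(as_str.iloc[0]))

# astype(str) also yields string elements, so "+" concatenates
df['z'] = df['a'].astype(str) + "_z"
print(df)
```

If that assumption holds, wrapping df['a'] in pd.Series changes nothing because the elements are still integers, while map(str) changes the elements themselves.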