I am trying to concatenate multiple pandas DataFrame columns, with a different token between each pair of columns.
For example, my dataset looks like this:
import pandas as pd

dataframe = pd.DataFrame({'col_1': ['aaa', 'bbb', 'ccc', 'ddd'],
                          'col_2': ['name_aaa', 'name_bbb', 'name_ccc', 'name_ddd'],
                          'col_3': ['job_aaa', 'job_bbb', 'job_ccc', 'job_ddd']})
I want to output something like this:
                       features
0  aaa <0> name_aaa <1> job_aaa
1  bbb <0> name_bbb <1> job_bbb
2  ccc <0> name_ccc <1> job_ccc
3  ddd <0> name_ddd <1> job_ddd
Explanation:
Join the columns with a "<{}>" token between each pair of adjacent columns, where {} is a counter that starts at 0 and increases by 1 for each separator.
What I've tried so far:
I don't want to modify the original DataFrame, so I created two new DataFrames:
features_df = pd.DataFrame()
final_df = pd.DataFrame()
for iters in range(len(dataframe.columns)):
    features_df[dataframe.columns[iters]] = dataframe[dataframe.columns[iters]] + ' ' + "<{}>".format(iters)
final_df['features'] = features_df[features_df.columns].agg(' '.join, axis=1)
The problem is that this also appends <2> after the last column, whereas I want the output shown above. It also does not feel like the idiomatic pandas way to do this task. How can I make it more efficient?
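For reference, here is a minimal sketch of one possible way to get the desired output without the trailing token. It builds the result column by column, so a token is only ever placed between two columns and never after the last one. The helper name `join_with_numbered_tokens` is hypothetical, and the sketch assumes the DataFrame has at least one column and that the token counter starts at 0, as in the example above. It is not necessarily the fastest option.

```python
import pandas as pd

dataframe = pd.DataFrame({
    'col_1': ['aaa', 'bbb', 'ccc', 'ddd'],
    'col_2': ['name_aaa', 'name_bbb', 'name_ccc', 'name_ddd'],
    'col_3': ['job_aaa', 'job_bbb', 'job_ccc', 'job_ddd'],
})


def join_with_numbered_tokens(df):
    """Hypothetical helper: join all columns of df left to right,
    placing ' <i> ' between column i and column i + 1.

    Assumes df has at least one column.
    """
    cols = list(df.columns)
    # Start from the first column; astype(str) returns a new Series,
    # so the original DataFrame is left untouched.
    result = df[cols[0]].astype(str)
    # Each remaining column is prefixed with its numbered token, so no
    # token is ever added after the last column.
    for i, col in enumerate(cols[1:]):
        result = result + f" <{i}> " + df[col].astype(str)
    return result.rename('features').to_frame()


final_df = join_with_numbered_tokens(dataframe)
print(final_df)
```

The loop runs once per column rather than once per row, and each step is a vectorized string concatenation on whole Series, so the cost of the Python loop itself stays small even for many rows.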