Merging data in csv-file using Pandas

Asked May 15 '19 at 19:28

Active May 15 '19 at 19:28

I have a CSV file containing names and incomes. Some names appear multiple times. I want to merge these rows using pandas so that each unique name appears only once, with its income values next to it.

I thought pivot would be the solution for my problem. I tried the following:

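import pandas as pd

df = pd.read_csv("properties.csv")
df = df.iloc[1:]  # drop the first data row
df = pd.DataFrame(df, columns=['income', 'names'])  # keep only these two columns
df['income'] = df['income'].astype(int)  # convert the income column to integers

# With no aggfunc given, pivot_table averages the values per name
test = pd.pivot_table(df, index='names', values='income')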

The problem is that I would like the numbers themselves rather than the average.

For example:

name1: 2,3,2,3

name2: 1,2,4,1

Instead of:

name1: 2.5

name2: 2

Hiach

Looks like a duplicate of https://stackoverflow.com/questions/22219004/grouping-rows-in-list-in-pandas-groupby – 9dogs May 15 '19 at 19:32
The default `aggfunc` of `pivot_table` is `numpy.mean`, which is why you're getting the average. – Snehaa Ganesan May 15 '19 at 19:34
@9dogs In my case I have over 500 names. So applying the solutions described in that post does not seem to be possible. But I could be missing something? – Hiach May 15 '19 at 20:03
Did you try? You would pass it to the `aggfunc` parameter: `pd.pivot_table(df, index='names', values='income', aggfunc=list)` – dan_g May 15 '19 at 20:13
@dan_g Thank you for your solution. I wasn't aware of the aggfunc possibilities besides average and sum. – Hiach May 15 '19 at 20:19
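
For reference, the suggestion from the comments can be sketched as follows. This is a minimal illustration, not the asker's actual data: the small DataFrame is a hypothetical stand-in for `properties.csv`, built to match the example values in the question. It shows both `pivot_table` with `aggfunc=list` and the `groupby` form from the linked question, which should give the same lists per name regardless of how many distinct names there are.

```python
import pandas as pd

# Hypothetical sample data mirroring the example in the question
# (the real data would come from pd.read_csv("properties.csv")).
df = pd.DataFrame({
    "names": ["name1", "name2", "name1", "name2",
              "name1", "name2", "name1", "name2"],
    "income": [2, 1, 3, 2, 2, 4, 3, 1],
})

# Option 1: pivot_table, replacing the default mean with list
as_lists = pd.pivot_table(df, index="names", values="income", aggfunc=list)

# Option 2: the groupby form from the linked question
grouped = df.groupby("names")["income"].agg(list)

print(as_lists)
print(grouped)
```

Both forms collect the values for every name in one pass, so nothing has to be written out per name.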
