I am new to Python, and I have a very large DataFrame:

Person  OD
A       BS1
A       BS2
B       BS4
B       BS8
C       BS5
C       BS1
D       BS9
D       BS7
E       BS2
E       BS7
F       BS2
F       BS1
G       BS1
G       BS2

Is it possible to transform this into an origin-destination (OD) matrix with pandas? For example, two people (A and G) travel from BS1 to BS2, so the BS1-BS2 cell of the OD matrix should be 2.

My expected result:

O/D  BS1  BS2  BS3  BS4  BS5  BS6  BS7  BS8  BS9
BS1         2
BS2    1                             1
BS3
BS4                                       1
BS5    1
BS6
BS7
BS8
BS9                                  1

How can I do this? Thanks a lot.

Arief Hidayat
  • Does the input data always have two consecutive rows for each person, with the origin first? – GZ0 Jun 10 '19 at 04:49
  • Yes, it does. The rows always come in pairs: the first row is the origin and the second row is the destination. – Arief Hidayat Jun 10 '19 at 04:51
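
Since everything downstream depends on this pairing, it may be worth checking it before reshaping. The following is only a minimal sketch: `check_pairs` is a hypothetical helper name (not part of pandas), and it assumes the columns are called `Person` and `OD` as in the question.

```python
import pandas as pd

def check_pairs(df: pd.DataFrame) -> bool:
    """Return True if the rows form consecutive two-row blocks per person."""
    # An odd number of rows cannot be split into (origin, destination) pairs
    if len(df) % 2 != 0:
        return False
    persons = df["Person"].to_numpy().reshape(-1, 2)
    # Both rows of each pair must belong to the same person
    return bool((persons[:, 0] == persons[:, 1]).all())

# Small hypothetical sample in the same layout as the question's data
sample = pd.DataFrame({
    "Person": ["A", "A", "B", "B"],
    "OD": ["BS1", "BS2", "BS4", "BS8"],
})
print(check_pairs(sample))
```

If the check fails, the reshape-based approach would silently mix up origins and destinations, so the data would need to be sorted or cleaned first.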

1 Answer

The following solution relies on the rows coming in consecutive origin-destination pairs, as confirmed in the comments.

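import pandas as pd

# Sorted list of every place that appears as an origin or a destination
places = sorted(df["OD"].unique())
# Rows come in consecutive (origin, destination) pairs, so reshape into two columns
od_df = pd.DataFrame(df["OD"].to_numpy().reshape(-1, 2), columns=["O", "D"])
# Count trips per (O, D) pair, then spread the destinations across the columns
od_matrix = od_df.groupby(["O", "D"]).size().unstack().reindex(index=places, columns=places)
# Pairs with no trips come out as NaN; replace them with 0 and restore integer counts
od_matrix = od_matrix.fillna(0).astype(int)

You can also use pd.pivot_table and replace the groupby line that builds od_matrix with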

od_matrix = pd.pivot_table(od_df, index="O", columns="D", aggfunc="size").reindex(index=places, columns=places)
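
For reference, here is a self-contained sketch of the groupby approach run on the sample data from the question. One assumption is mine rather than the answer's: the station list is written out explicitly as BS1 to BS9, because BS3 and BS6 never occur in the sample data and so would not be picked up by `df["OD"].unique()`.

```python
import pandas as pd

# Sample data copied from the question: two consecutive rows per person
df = pd.DataFrame({
    "Person": list("AABBCCDDEEFFGG"),
    "OD": ["BS1", "BS2", "BS4", "BS8", "BS5", "BS1", "BS9", "BS7",
           "BS2", "BS7", "BS2", "BS1", "BS1", "BS2"],
})

# Assumption: the full station list is BS1..BS9, as in the expected result
places = [f"BS{i}" for i in range(1, 10)]

# First row of each pair is the origin, second row is the destination
od_df = pd.DataFrame(df["OD"].to_numpy().reshape(-1, 2), columns=["O", "D"])

# Count trips per (origin, destination) and lay them out as a square matrix;
# fill_value=0 keeps the counts as integers instead of introducing NaN
od_matrix = (
    od_df.groupby(["O", "D"]).size()
    .unstack(fill_value=0)
    .reindex(index=places, columns=places, fill_value=0)
)
print(od_matrix)
```

Passing `fill_value=0` to both `unstack` and `reindex` avoids the separate `fillna` step. Note that sorting labels such as BS10 as strings would place them before BS2, which is another reason an explicit station list can be preferable.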
GZ0