Iterate through a pandas dataframe, check each column value with an if statement and pass the column values to the prefered column of an empty df

Question

I have pandas a dataframe like this

    Event_Id       Investigation_Type Accident_Number   Event_Date      
0   20180922X71035  ACCIDENT          DCA18CA289        09/10/2018  
1   20180507X00658  ACCIDENT          DCA18CA169        05/07/2018  
4   20171212X50255  ACCIDENT          DCA18CA043B       12/03/2017

and I try to iterate through it like this...

n1col = 0
n2col = 1

for i in df.index:
    Node1=df.Event_Id
    for j in df.index:
        Node2=df.Event_Id
        if (Node1 != Node2):
            new_df.loc[j,n1col] = Node1
            new_df.loc[j,n2col] = Node2

I don't know if my approach is right (as I see it's not) I want some help so I can get a result like the one below...

I am new to that kind of stuff so I need your help.

    Node_1          Node_2 
0   20180922X71035  20180507X00658
1   20180922X71035  20171212X50255
2   20180507X00658  20180922X71035
3   20180507X00658  20171212X50255
4   20171212X50255  20180922X71035
6   20171212X50255  20180507X00658

Thanks in Advance.

Orenshi · Answer 1 · 2019-01-08T15:48:06.623

I know you've already accepted an answer. But in case you are not looking for a combination and instead want the cartesian product and then filter it down such that the 2 columns are not equal...

>>> df
         Event_Id Accident_Number  Event_Date Investigation_Type
0  20180922X71035      DCA18CA289  09/10/2018           ACCIDENT
1  20180507X00658      DCA18CA169  05/07/2018           ACCIDENT
2  20171212X50255     DCA18CA043B  12/03/2017           ACCIDENT

Get the Cartesian Product Answer from this other StackOverflow post

>>> df['key'] = 0
>>> df
         Event_Id Accident_Number  Event_Date Investigation_Type  key
0  20180922X71035      DCA18CA289  09/10/2018           ACCIDENT    0
1  20180507X00658      DCA18CA169  05/07/2018           ACCIDENT    0
2  20171212X50255     DCA18CA043B  12/03/2017           ACCIDENT    0
>>> df2 = df.merge(df, on='key').filter(items=['Event_Id_x', 'Event_Id_y'])
>>> df2
       Event_Id_x      Event_Id_y
0  20180922X71035  20180922X71035
1  20180922X71035  20180507X00658
2  20180922X71035  20171212X50255
3  20180507X00658  20180922X71035
4  20180507X00658  20180507X00658
5  20180507X00658  20171212X50255
6  20171212X50255  20180922X71035
7  20171212X50255  20180507X00658
8  20171212X50255  20171212X50255

Use .loc/boolean indexing to filter down your DataFrame

>>> df2.loc[df2['Event_Id_x'] != df2['Event_Id_y']]
       Event_Id_x      Event_Id_y
1  20180922X71035  20180507X00658
2  20180922X71035  20171212X50255
3  20180507X00658  20180922X71035
5  20180507X00658  20171212X50255
6  20171212X50255  20180922X71035
7  20171212X50255  20180507X00658

Similar to Josh's answer using itertools. But this time using product instead of combination:

>>> df = df.set_index('Event_Id')
>>> df3 = pd.DataFrame(list(product(df.index.tolist(), df.index.tolist())), columns=['Node1', 'Node2'])
>>> df3.loc[df3['Node1'] != df3['Node2']]
            Node1           Node2
1  20180922X71035  20180507X00658
2  20180922X71035  20171212X50255
3  20180507X00658  20180922X71035
5  20180507X00658  20171212X50255
6  20171212X50255  20180922X71035
7  20171212X50255  20180507X00658

@Orenshi, this great show, however second solution doesn't yield the shown results. — Karn Kumar, Jan 08 '19 at 15:45
Good point. I forgot to add setting the index to the second solution. Or alternatively instead of df.index.tolist() you can use df.Event_Id.tolist() — Orenshi, Jan 08 '19 at 15:47
@Orenshi I will keep that solution in my mind for future problems. — runnerhigh, Jan 08 '19 at 16:28

score 1 · Accepted Answer · answered Jan 08 '19 at 14:17

1

You can do it in one line with

from itertools import combinations

pd.DataFrame(list(combinations(df.index.tolist(), 2)), columns=['Node1', 'Node2'])

answered Jan 08 '19 at 14:17

Josh Friedlander

10,870
5
35
75

Thanks a lot, you saved my day!! – runnerhigh Jan 08 '19 at 14:29
You're very welcome. If you're new to Pandas you'll discover that you can do an enormous amount without using loops. – Josh Friedlander Jan 08 '19 at 14:37
In your original post, aren't you asking for a cartesian product and excluding the records where they're equal? I didn't think you were looking for combination – Orenshi Jan 08 '19 at 14:40
@Orenshi Is there any difference if it is a cartesian product or a combination? My goal is to create a network with these ids and visualize it(if this helps). – runnerhigh Jan 08 '19 at 16:25

Iterate through a pandas dataframe, check each column value with an if statement and pass the column values to the prefered column of an empty df

2 Answers2