How to only show the rows where data of one variable matches column data of another variable

Question

I want to show only the rows in which the x and y values of the other IDs match the x and y of ID 0. For example, out of the first 20 rows, that would be ID 0 and ID 250017920 (row 9), since their x and y match. This process would need to be repeated for all rows, so that all that remains are the rows where x and y match those of ID 0 as its x and y change.

import pandas as pd

d = {'ID': [0, 2398794, 3987694, 987957, 9875987, 76438739, 2474654, 1983209, 2874050, 250017920, 38764902],
     'x': [-46, 8769, 432, 426, 132, 93, 124, 475, 857, -46, 67],
     'y': [2562, 987, 987, 252, 234, 123, 765, 1452, 542, 2562, 5876],
     'z': [5, 7, 6, 2, 7, 7, 4, 5, 1, 9, 3]}
data = pd.DataFrame(data=d)

           ID     x     y  z
0           0   -46  2562  5
1     2398794  8769   987  7
2     3987694   432   987  6
3      987957   426   252  2
4     9875987   132   234  7
5    76438739    93   123  7
6     2474654   124   765  4
7     1983209   475  1452  5
8     2874050   857   542  1
9   250017920   -46  2562  9
10   38764902    67  5876  3

https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples Please create a small reproducible dataset. — Scott Boston, Jun 23 '20 at 15:46

score 0 · Answer 1 · answered Jun 24 '20 at 04:28

For the following dataframe

import pandas as pd

d = {'ID': [999, 2398794, 3987694, 987957, 9875987],
     'x': [132, 8769, 432, 132, 132],
     'y': [563, 987, 987, 563, 234],
     'z': [5, 7, 6, 2, 7]}
data = pd.DataFrame(data=d)

print(data)
        ID     x    y  z
0      999   132  563  5
1  2398794  8769  987  7
2  3987694   432  987  6
3   987957   132  563  2
4  9875987   132  234  7

Get the index of the row whose ID you want to use as the reference for matching column x and column y. Here, let's say the ID value is 999.

# [0] picks the first occurrence, in case the same ID appears more than once in the dataframe
ind = data.index[data.ID == 999][0]
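One thing to watch with the `[0]` lookup: if the ID is not present at all, the selection is empty and indexing it raises an `IndexError`. A minimal sketch of a guarded lookup, using a small made-up frame in which ID 999 appears twice (the frame and the variable names `matches`, `first` and `missing_first` are only for illustration):

```python
import pandas as pd

# Hypothetical frame: ID 999 appears twice, ID -1 never appears.
data = pd.DataFrame({"ID": [999, 2398794, 999],
                     "x": [132, 8769, 5],
                     "y": [563, 987, 6]})

# All index labels whose ID equals 999.
matches = data.index[data["ID"] == 999]
print(list(matches))  # [0, 2]

# Take the first occurrence only if there is one.
first = matches[0] if len(matches) > 0 else None

# The same lookup for an ID that is absent yields an empty Index.
missing = data.index[data["ID"] == -1]
missing_first = missing[0] if len(missing) > 0 else None
print(first, missing_first)
```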

Now get the rows where the x and y values match the x and y values for ID = 999.

data[(data['x'] == data.loc[ind, 'x']) & (data['y'] == data.loc[ind, 'y'])]

       ID    x    y  z
0     999  132  563  5
3  987957  132  563  2
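The same two steps can be wrapped in a small helper and applied to the data from the question, with ID 0 as the reference. This is only a sketch: the function name `rows_matching_reference` and its `cols` parameter are made up for illustration, and it assumes the first row carrying the reference ID is the one to compare against.

```python
import pandas as pd

def rows_matching_reference(df, ref_id, cols=("x", "y")):
    """Return rows whose values in `cols` equal those of the first row with ID == ref_id."""
    cols = list(cols)
    ref_rows = df.loc[df["ID"] == ref_id, cols]
    if ref_rows.empty:
        # The reference ID is not in the frame, so nothing can match.
        return df.iloc[0:0]
    ref = ref_rows.iloc[0]  # Series indexed by the column names in `cols`
    # Compare every row to the reference values; keep rows where all columns agree.
    mask = (df[cols] == ref).all(axis=1)
    return df[mask]

# Data from the question.
data = pd.DataFrame({
    "ID": [0, 2398794, 3987694, 987957, 9875987, 76438739,
           2474654, 1983209, 2874050, 250017920, 38764902],
    "x": [-46, 8769, 432, 426, 132, 93, 124, 475, 857, -46, 67],
    "y": [2562, 987, 987, 252, 234, 123, 765, 1452, 542, 2562, 5876],
    "z": [5, 7, 6, 2, 7, 7, 4, 5, 1, 9, 3],
})

result = rows_matching_reference(data, 0)
print(result)
```

For this data the result keeps the rows at index 0 and 9 (ID 0 and ID 250017920), which is the pair the question asks for.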
