
I have a dataframe as follows:

import pandas as pd
d = {'location1': [1, 2, 3, 8, 6], 'location2': [2, 1, 4, 6, 8]}
df = pd.DataFrame(data=d)

Each row of the dataframe df means there is a road between two locations. It looks like this:

   location1    location2
0   1               2 
1   2               1
2   3               4 
3   8               6 
4   6               8

The first row means there is a road between location ID 1 and location ID 2; however, the second row encodes the same information. The fourth and fifth rows also repeat each other. I am trying to remove these repeats by keeping only one row from each pair. Either row is fine.

For example, my expected output is

   location1    location2
0   1               2 
2   3               4 
4   6               8

Is there an efficient way to do this? I have a large dataframe with lots of repeated rows.

Thanks a lot,

jason
  • Possible duplicate of [Is there an efficient way to select multiple rows in a large pandas data frame?](https://stackoverflow.com/questions/55439420/is-there-an-efficient-way-to-select-multiple-rows-in-a-large-pandas-data-frame) – Jondiedoop Apr 23 '19 at 15:03

1 Answer


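For your sample data, the expected output is every other row of the dataframe, so slicing with a step of 2 should work. Note that this relies on the row positions in the sample; it does not itself detect reversed pairs.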

import pandas as pd
d = {'location1': [1, 2, 3, 8, 6], 'location2': [2, 1, 4, 6, 8]}
df = pd.DataFrame(data=d)

print(df)

   location1  location2
0          1          2
1          2          1
2          3          4
3          8          6
4          6          8

def every_other_row(a):
    # Slice with a step of 2: keeps the rows at positions 0, 2, 4, ...
    return a[::2]

every_other_row(df)

   location1  location2
0          1          2
2          3          4
4          6          8
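
If the repeated rows are not guaranteed to fall at alternating positions, a more general option is to treat each pair as unordered: sort the two values within every row and drop rows whose sorted pair has already been seen. The following is only a sketch under the assumption that the columns are named as in the question; `key` and `result` are illustrative names, not part of the original code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'location1': [1, 2, 3, 8, 6],
                   'location2': [2, 1, 4, 6, 8]})

# Sort the two values within each row so that (2, 1) and (1, 2)
# produce the same key.
key = pd.DataFrame(
    np.sort(df[['location1', 'location2']].to_numpy(), axis=1),
    index=df.index,
)

# Keep only the first row seen for each unordered pair.
result = df[~key.duplicated()]
print(result)
```

With the sample data this keeps the rows at index 0, 2 and 3, since the first occurrence of each pair is retained. Passing `keep='last'` to `duplicated` would keep the later row of each pair instead. Because the sort is done on the underlying NumPy array rather than row by row in Python, it should scale reasonably to large dataframes.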
Angel Roman