
I am new to Python programming and struggling with pandas. I have a pandas DataFrame like the one below:

         col1   col2   col3   col4           
12:00     1      1       3      2
12:05     1      1       3      2
12:10     1      2       4      2
12:15     1      2       4      2
12:20     1      2       4      2
12:25     2      3       7      8 
12:30     2      3       7      8
12:35     2      3       7      8
12:40     2      3       7      8
12:45     4      5       4      3 

What I want to do is extract (or keep) only the rows where the column values change.

Rows (1, 2), (3, 4, 5), (6, 7, 8, 9), and (10) each have the same values at different times, so the result would look like the table below. The time data cannot be ignored.

         col1   col2   col3   col4           
12:00     1      1       3      2
12:10     1      2       4      2
12:25     2      3       7      8 
12:45     4      5       4      3 

If there is any feature or function for this, it would be a great help. Thank you.

  • You can refer: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop_duplicates.html – mpx Mar 05 '21 at 07:02
  • 2
    Does this answer your question? [Drop all duplicate rows across multiple columns in Python Pandas](https://stackoverflow.com/questions/23667369/drop-all-duplicate-rows-across-multiple-columns-in-python-pandas) – mpx Mar 05 '21 at 07:04

1 Answer


Here is code that will work on your query. Note that `drop_duplicates(subset=['Col2'])` happens to give the right rows for this particular data, but it only compares `Col2`. To keep the first row of each block of identical values, compare each row with the previous one using `shift` and keep a row whenever any column changes:

import pandas as pd

df = pd.DataFrame({
    'Time': ['12:00', '12:05', '12:10', '12:15', '12:20', '12:25', '12:30', '12:35', '12:40', '12:45'],
    'Col1': [1, 1, 1, 1, 1, 2, 2, 2, 2, 4],
    'Col2': [1, 1, 2, 2, 2, 3, 3, 3, 3, 5],
    'Col3': [3, 3, 4, 4, 4, 7, 7, 7, 7, 4],
    'Col4': [2, 2, 2, 2, 2, 8, 8, 8, 8, 3],
})
print(df)

# Keep a row whenever any value column differs from the previous row
cols = ['Col1', 'Col2', 'Col3', 'Col4']
print(df.loc[(df[cols] != df[cols].shift()).any(axis=1)])
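One caveat worth knowing: `drop_duplicates` removes repeats anywhere in the frame, while a shift-based comparison keeps a row whenever its values differ from the immediately preceding row, so a value block that recurs later is kept again. A minimal sketch of the difference (using a made-up one-column frame, not the asker's data):

```python
import pandas as pd

# A value (1) that recurs after an intervening change
df = pd.DataFrame({
    'Time': ['12:00', '12:05', '12:10', '12:15'],
    'Col1': [1, 1, 2, 1],
})
cols = ['Col1']

# Shift-based: keeps every row where the value changes from the row before
changed = df.loc[(df[cols] != df[cols].shift()).any(axis=1)]
print(changed['Time'].tolist())  # ['12:00', '12:10', '12:15']

# drop_duplicates: keeps only the first occurrence of each value overall
dedup = df.drop_duplicates(subset=cols)
print(dedup['Time'].tolist())  # ['12:00', '12:10']
```

For the data in the question every value block is unique, so both approaches happen to agree; if a block of values can reappear later, the shift-based version is the one that matches "extract rows where the data changes".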