
I have Excel sheets in a workbook, each with multiple columns. The columns have different headings, but some of them may hold the same data. The reports are generated using pandas. How can I compare all the columns on each sheet and delete any column whose data duplicates another column's? The headings are:

2014  2015  2016  2017  2018
12.   14.   12.   15.   20
11.   11.   11.   12.   21

You can see that 2014 and 2016 hold the same data. How do I delete 2016 if it matches 2014? I have multiple sheets covering multiple years.

Mr. T
Mazin
  • Thanks, this is working. However, I forgot to say that each column has the year (2014) as its heading, then the indicator name under that, then the data. The transpose-and-drop approach works only if the column name is just 2014. Any help would be appreciated. – Mazin Feb 24 '18 at 02:19
  • @Mazin Please edit the question itself, rather than make comments about what you should or should not have written. – chb Feb 24 '18 at 02:47

1 Answer


Here you go:

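    import pandas as pd

    # Sample data: the 2012 and 2014 columns hold identical values
    data = {
        '2012': ['1', '2', '3', '4', '5'],
        '2013': ['2', '2', '2', '2', '2'],
        '2014': ['1', '2', '3', '4', '5'],
    }
    df = pd.DataFrame(data, columns=['2012', '2013', '2014'])

    # drop_duplicates() removes duplicate rows, so transpose to turn the
    # columns into rows, drop the repeats, then transpose back
    results = df.T.drop_duplicates().T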
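
The same idea can be applied to every sheet in a workbook. What follows is a minimal sketch, assuming the sheets have already been loaded into a dict of DataFrames (which is what `pd.read_excel` returns when called with `sheet_name=None`). The helper name `drop_duplicate_columns` and the in-memory sample sheets are illustrative and not from the question. It builds a boolean mask from `df.T.duplicated()` instead of transposing back, which leaves the original column dtypes untouched.

```python
import pandas as pd


def drop_duplicate_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only the first of any columns whose values are identical."""
    # duplicated() compares rows, so transpose first; the resulting mask
    # has one entry per original column.
    is_repeat = df.T.duplicated().to_numpy()
    return df.loc[:, ~is_repeat]


# Stand-in for pd.read_excel(path, sheet_name=None), which returns a
# dict mapping sheet name -> DataFrame. Sample values are made up.
sheets = {
    "Sheet1": pd.DataFrame({
        "2014": [12, 11], "2015": [14, 11], "2016": [12, 11],
        "2017": [15, 12], "2018": [20, 21],
    }),
    "Sheet2": pd.DataFrame({"2014": [1, 2], "2015": [1, 2], "2016": [3, 4]}),
}

cleaned = {name: drop_duplicate_columns(df) for name, df in sheets.items()}

for name, df in cleaned.items():
    print(name, list(df.columns))
```

With the question's sample data on the first sheet, 2016 is dropped because it repeats 2014, and the first occurrence is the one kept.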
Stephen Strosko
  • The data looks like this:
           Ind1  Ind1  Ind1  Ind1  Ind2  Ind2  Ind2
           2014  2015  2016  2017  2014  2015  2016
    dhb1      2     3     2     3     2     3     2
    dhb2      2     3     2     3     2     3     2
    I need to delete the data under Ind1 for 2016 and 2017, and under Ind2 for 2016, because they are equal to another column under the same indicator. Thank you. – Mazin Feb 24 '18 at 02:29
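
For the two-level header described in the last comment, one possible approach is to compare columns only within the same indicator. This is a sketch under assumptions: the columns are taken to be a pandas `MultiIndex` with levels named `indicator` and `year` (names chosen here for illustration), as you might get from `pd.read_excel` with `header=[0, 1]`. The frame below is built in memory from the values in the comment.

```python
import pandas as pd

# Two-level column header: indicator on top, year underneath.
# The level names "indicator" and "year" are assumptions for this sketch.
columns = pd.MultiIndex.from_tuples(
    [("Ind1", "2014"), ("Ind1", "2015"), ("Ind1", "2016"), ("Ind1", "2017"),
     ("Ind2", "2014"), ("Ind2", "2015"), ("Ind2", "2016")],
    names=["indicator", "year"],
)
df = pd.DataFrame(
    [[2, 3, 2, 3, 2, 3, 2],
     [2, 3, 2, 3, 2, 3, 2]],
    index=["dhb1", "dhb2"],
    columns=columns,
)

# Transpose so each original column becomes a row, then move the indicator
# label into a regular column. duplicated() then flags a row only when an
# earlier row has the same indicator AND the same data.
keyed = df.T.reset_index(level="indicator")
is_repeat = keyed.duplicated().to_numpy()

result = df.loc[:, ~is_repeat]
print(result)
```

Because the indicator label is part of the comparison, Ind2 2014 is kept even though its values match Ind1 2014; only repeats under the same indicator are removed.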