Suppose I have the following data frame:

      F0  F1  F2  F3  F4  F5  F6  F7  F8  F9  ...    F1025  F1026  F1027  \
5005   7   7   7   7   7   7   7   7   7   7  ...        7      7      7   
5006   7   7   7   7   7   7   7   7   7   7  ...        7      7      7   
5010   7   7   7   7   7   7   7   7   7   7  ...        7      7      7   
5013   7   7   7   7   7   7   7   7   7   7  ...        7      7      7   
5016   6   6   6   6   6   6   6   6   6   6  ...        0      0      0   
5017   7   7   7   7   7   7   7   7   7   7  ...        7      7      7   
5019   7   7   7   7   7   7   7   7   7   7  ...        7      7      7   
5021   5   5   5   5   5   5   5   5   5   5  ...        0      0      0   
5102   7   7   7   7   7   7   7   7   7   7  ...        1      1      1   
5103   7   7   7   7   6   7   7   7   7   7  ...        7      7      7   
5104   7   7   7   7   7   7   7   7   7   7  ...        0      0      0   
5302   6   6   6   6   6   6   6   6   6   6  ...        0      0      0   
5409   6   6   6   6   6   6   6   6   6   6  ...        2      2      2   
5422   0   0   0   0   0   0   0   0   0   0  ...        0      0      0   
5601   0   0   0   0   0   0   0   0   0   0  ...        0      0      0   
5603   7   7   7   7   7   7   7   7   7   7  ...        7      7      7  

Is there a way in Python that I can easily find the largest subset of features and indices that have 7's everywhere in the middle?

I realize this might call for a greedy algorithm in which I first pick all features or first pick all indices, but I'm not sure of the best way to tackle it.
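To make the greedy idea concrete, here is a minimal sketch of one possible heuristic. It repeatedly drops whichever row or column has the largest fraction of non-7 cells until only 7's remain. The function name `greedy_all_sevens` and the small `toy` frame are hypothetical and are not taken from the data above. This is a heuristic only: it is not guaranteed to find the largest possible subtable.

```python
import pandas as pd


def greedy_all_sevens(df, target=7):
    """Greedily drop rows/columns until every remaining cell equals `target`.

    Heuristic sketch only: the result is all-`target`, but not necessarily
    the largest such subtable.
    """
    bad = df.ne(target)  # True where a cell is NOT the target value
    while bad.values.any():
        row_bad = bad.sum(axis=1)  # non-target count per row
        col_bad = bad.sum(axis=0)  # non-target count per column
        # Compare the worst row and the worst column by fraction of bad cells.
        row_frac = row_bad.max() / bad.shape[1]
        col_frac = col_bad.max() / bad.shape[0]
        if row_frac >= col_frac:
            bad = bad.drop(index=row_bad.idxmax())
        else:
            bad = bad.drop(columns=col_bad.idxmax())
    return df.loc[bad.index, bad.columns]


# Hypothetical toy data, much smaller than the real frame.
toy = pd.DataFrame(
    [[7, 7, 7, 7],
     [7, 7, 7, 7],
     [6, 6, 0, 0],
     [7, 6, 7, 7],
     [0, 0, 0, 0]],
    index=[5005, 5006, 5016, 5103, 5422],
    columns=["F0", "F1", "F2", "F3"],
)
result = greedy_all_sevens(toy)
print(result)
```

On this toy frame the sketch drops rows 5016 and 5422 and then column F1, leaving a 3 x 3 table of 7's. Ties go to rows because of the `>=` comparison, which is an arbitrary choice.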

user1357015
  • Can you please be more specific? What is your expected output for this given data? – cs95 Feb 07 '18 at 06:05
  • @COLDSPEED: I'm not sure what the solution should be, and I'm not sure it's unique. I want a subtable of that dataframe (so cutting both rows and columns) such that the values in the table are all 7's. – user1357015 Feb 07 '18 at 06:07
  • You might want to dumb down the posted data to something that can easily reproduce what you have in mind :) – cs95 Feb 07 '18 at 06:08
  • "7's everywhere in the middle" is not specific enough. Are non-7's on the edges OK? Do you want a rectangle or a square of 7's or entire rows? Can the rows be rearranged to make a larger group? The answer is certainly not unique - consider a field with just two 7s well separated. – verisimilidude Feb 07 '18 at 06:17
  • @verisimilidude: By 7's in the middle, I mean that every entry in whatever table is left becomes a 7. So put another way, what combination of rows and columns are dropped so all that is left is a table of 7's. – user1357015 Feb 07 '18 at 06:29
  • This appears to be an NP-hard problem. – cs95 Feb 07 '18 at 06:33
  • This can easily be reduced to [this](https://stackoverflow.com/questions/3806520/finding-maximum-size-sub-matrix-of-all-1s-in-a-matrix-having-1s-and-0s). – ayhan Feb 07 '18 at 06:42
  • Also see https://www.geeksforgeeks.org/maximum-size-sub-matrix-with-all-1s-in-a-binary-matrix/ – ayhan Feb 07 '18 at 06:45
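The two links in the comments above deal with contiguous sub-matrices of 1's. If a contiguous block of rows and columns is acceptable, one standard approach is the row-by-row histogram technique sketched below, applied to a boolean mask that is True wherever the value is 7 (for a data frame, something like `df.eq(7).values.tolist()` would produce such a mask). The function name `largest_true_rectangle` and the toy mask are hypothetical. Note that this is a narrower problem than dropping arbitrary rows and columns, so it may return a smaller table than the one the question asks for.

```python
def largest_true_rectangle(mask):
    """Find the largest contiguous all-True rectangle in a 2-D boolean grid.

    Returns (area, top, left, height, width). Uses the classic
    "largest rectangle in a histogram" stack scan once per row.
    """
    best = (0, 0, 0, 0, 0)
    if not mask:
        return best
    n_cols = len(mask[0])
    heights = [0] * n_cols
    for r, row in enumerate(mask):
        # heights[c] = number of consecutive True cells ending at row r.
        heights = [h + 1 if cell else 0 for h, cell in zip(heights, row)]
        stack = []  # column indices whose heights are strictly increasing
        for c in range(n_cols + 1):
            cur = heights[c] if c < n_cols else 0  # sentinel flushes the stack
            while stack and heights[stack[-1]] >= cur:
                h = heights[stack.pop()]
                left = stack[-1] + 1 if stack else 0
                width = c - left
                if h * width > best[0]:
                    best = (h * width, r - h + 1, left, h, width)
            stack.append(c)
    return best


# Hypothetical toy mask: True marks a cell equal to 7.
T, F = True, False
toy_mask = [
    [T, T, T, T],
    [T, T, T, T],
    [F, F, F, F],
    [T, F, T, T],
    [F, F, F, F],
]
best = largest_true_rectangle(toy_mask)
print(best)  # (8, 0, 0, 2, 4)
```

Here the best contiguous block is the top 2 x 4 rectangle with 8 cells. Dropping the second column together with the two all-False rows would instead leave a 3 x 3 table with 9 cells, which a contiguous search cannot find. That gap is why the arbitrary row and column version is the harder problem.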

0 Answers