Isolating Rows Of A Dataframe in a loop based on multiple conditions

Question

So I asked a question related to this recently, and while the answer was simple then (I failed to utilize a specific column), this time I don't have that column. Here is the OP. None of the extra answers provided there actually work either :/

The problem is with a multilabel data frame when you want to isolate rows that contain 1 for a given class and 0 for all others. So far here is the code I have, but it loops forever and crashes Colab.

In this case I want just that Action row, but I'm also trying to loop it so that I append all rows where Action is 1 and the columns in column_list are 0, then History 1 and all others 0, etc...

Again, the options provided on the link give me a "The truth of the answer is ambiguous" error.

Index |  Drama | Western | Action | History |
   0        1        1         0         0
   1        0        0         0         1
   2        0        0         1         0


# Column list to be popped
column_list = list(balanced_df.columns)[1:]

single_labels = []
i=0

# 28 columns total
while i < 27:
  # defining/resetting the full column list at the start of each loop
  column_list = list(balanced_df.iloc[:,1:])
  # Pop column name at index i
  x = column_list.pop(i)

  # storing the results in a list of lists
  # Filters for the popped column where the column is 1 & the remaining columns are set to 0
  single_labels.append(balanced_df[(balanced_df[x] == 1) & (balanced_df[column_list]==0)])

  # increment the column index number for the next run
  i+=1

The output here would be something like

single_labels[0]

    Index |  Drama | Western | Action | History |
       2        0        0         1         0


single_labels[1]
    Index |  Drama | Western | Action | History |
       1        0        0         0         1

From the comments in the other question, `df.loc[df['Western'].eq(1) & df.sum(axis='columns').eq(1)]` should do it – Paul H, Apr 02 '21 at 19:59
Sorry, it's not clear. The result would be a list of lists containing rows of the df, where the rows at list index 0 would have all 1's in the Action column and 0's in all other columns, then list index 1 would have History with all 1's and all other columns 0, etc... – Digital Moniker, Apr 02 '21 at 20:00
Type out the dataframe you want to see and put it in the question – Paul H, Apr 02 '21 at 20:01
Okay, that solution worked too. Do you want to post it so I can accept it? Thanks – Digital Moniker, Apr 02 '21 at 20:02

Paul H · Accepted Answer · 2021-04-02T20:15:41.983


You don't need a loop. You rarely need loops with pandas. If you're selecting rows based on conditions, you should use boolean indexing.

In your case, that's:

df.loc[df.sum(axis='columns').eq(1)]

As an example:

import pandas

pandas.DataFrame({
    'A': [1, 0, 0, 0, 0, 1, 1, 0, 0],
    'B': [0, 1, 0, 0, 1, 0, 1, 0, 0],
    'C': [0, 0, 1, 0, 1, 0, 0, 1, 0],
    'D': [0, 0, 0, 1, 0, 1, 0, 1, 0],
}).loc[lambda df: df.sum(axis='columns').eq(1)].values.tolist()

Which outputs:

[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
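
To isolate one class at a time, as the question asks, the row-sum mask can be combined with a condition on a single column. The following is only a sketch under assumptions: the frame, the `label_cols` list, and the `target` variable are hypothetical names built from the table in the question, and "Index" stands in for any non-label column that should be kept out of the row sum. The second mask is a row-wise version of the comparison attempted in the question's loop.

```python
import pandas as pd

# Hypothetical data mirroring the table in the question; "Index" stands in
# for any non-label column that must be left out of the row sum.
df = pd.DataFrame({
    "Index": [0, 1, 2],
    "Drama": [1, 0, 0],
    "Western": [1, 0, 0],
    "Action": [0, 0, 1],
    "History": [0, 1, 0],
})

label_cols = ["Drama", "Western", "Action", "History"]  # assumed label columns
target = "Action"
others = [c for c in label_cols if c != target]

# Form 1: target is 1 and exactly one label is set in the row.
mask_sum = df[target].eq(1) & df[label_cols].sum(axis="columns").eq(1)

# Form 2: target is 1 and every other label is 0. The .all(axis="columns")
# call reduces the per-cell comparison to one boolean per row.
mask_all = df[target].eq(1) & df[others].eq(0).all(axis="columns")

action_only = df.loc[mask_sum]
print(action_only)
```

Both masks pick the same rows here. Without the `.all(axis="columns")` reduction, comparing several columns to 0 yields a DataFrame of booleans rather than one boolean per row, which is probably why the filter in the question does not behave as a row selector.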

edited Apr 02 '21 at 20:15

answered Apr 02 '21 at 20:03

Paul H


I am trying to build a list of lists that contains only the rows that are not multilabel. Don't I need to loop through each column for this? Your original answer is correct, but I need to add a variable where I see Western so that the next time it loops it captures History, and so on... – Digital Moniker Apr 02 '21 at 20:08
@DigitalMoniker nope. No loops. You can add `.values.tolist()` to the end of the command above if you want. – Paul H Apr 02 '21 at 20:11
Wow, I just plotted that with seaborn and I see there are no multilabels... that's amazing. Is it because of .eq? I'm not familiar with it, but I'll check it out. Thanks! – Digital Moniker Apr 02 '21 at 20:17
@DigitalMoniker if you're using seaborn, you don't need a list of lists. – Paul H Apr 02 '21 at 20:18
I just used seaborn to investigate the dataframe that was returned after your original answer. I was mentioning a list of lists because I had used that technique before to capture filtered df results based on multiple different conditions is all – Digital Moniker Apr 02 '21 at 20:22
My guess is that you're better off leaving this as a dataframe, but it's hard to know – Paul H Apr 02 '21 at 20:31
100% and your answer is perfect. The list of lists was just a result of my amateur logical thought process for data manipulation. Thanks a lot for the answer. – Digital Moniker Apr 02 '21 at 20:35
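
If a per-class grouping is still wanted after filtering, one possible sketch is to keep everything as DataFrames and key them by label name. This reuses the example data from the answer; `single` and `per_label` are hypothetical names, and the dict comprehension does iterate over column names, although each individual filter stays vectorized.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 0, 0, 0, 0, 1, 1, 0, 0],
    "B": [0, 1, 0, 0, 1, 0, 1, 0, 0],
    "C": [0, 0, 1, 0, 1, 0, 0, 1, 0],
    "D": [0, 0, 0, 1, 0, 1, 0, 1, 0],
})

# Keep only rows with exactly one label set.
single = df.loc[df.sum(axis="columns").eq(1)]

# One sub-frame per label, keyed by column name (hypothetical structure).
per_label = {col: single.loc[single[col].eq(1)] for col in single.columns}

print(per_label["C"])
```

Looking up `per_label["C"]` then gives the rows where only C is set, and `.values.tolist()` can still be applied to any sub-frame if plain lists are needed.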
