filtered = Series([True, False, True], index=df.index)

condition_loc = df.loc[df.LoanAmount.head() < 500]

boolean_i = df.iloc[[True, False, True]]

boolean = df.loc[['True', 'False', 'True']].values

Each of these generates an error:

IndexError: Boolean index has wrong length: 3 instead of 614

KeyError: "None of [Index(['True', 'False', 'True'], dtype='object', name='Loan_ID')] are in the [index]"

raise IndexingError(
pandas.core.indexing.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

raise ValueError(
ValueError: Length of values (3) does not match length of index (614)

Snapshot of data

    Loan_ID Gender Married  Dependents     Education Self_Employed  ApplicantIncome  CoapplicantIncome  LoanAmount  Loan_Amount_Term  Credit_History Property_Area Loan_Status
0  LP001002   Male      No           0      Graduate            No             5849                  0         100               360               1         Urban           Y
1  LP001003   Male     Yes           1      Graduate            No             4583               1508         128               360               1         Rural           N
2  LP001005   Male     Yes           0      Graduate           Yes             3000                  0          66               360               1         Urban           Y
3  LP001006   Male     Yes           0  Not Graduate            No             2583               2358         120               360               1         Urban           Y

The data is [614 rows x 12 columns]. My intention is, given a list of boolean values, to select the rows where the value is True. I have followed every link that turns up for each of the errors above, but it seems no one else has failed to select rows with this syntax. Please direct me to a link where this is resolved. I have tried to explain as much as possible. I am new to pandas. Thanks for your time!

Edit:

filtered = Series([True, False, True] )

Removing the index argument solved the first issue.

Edit 2:

df.loc[Series([True, False, True])]

gives

raise IndexingError(
pandas.core.indexing.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

Suggested link only talks about series and not how to use it in conjunction with loc or iloc.
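A minimal sketch of how a boolean Series can be combined with loc, assuming a small hypothetical frame indexed by Loan_ID in place of the real 614-row data. The point it illustrates is that loc aligns a boolean Series on its index labels, so the Series needs the same index as the frame, while a plain list of the right length is applied by position only.

```python
import pandas as pd

# Hypothetical stand-in for the loan data, indexed by Loan_ID.
df = pd.DataFrame(
    {"LoanAmount": [100, 128, 66]},
    index=pd.Index(["LP001002", "LP001003", "LP001005"], name="Loan_ID"),
)

flags = [True, False, True]  # one flag per row of df

# Give the Series the same index as the frame so .loc can align it.
mask = pd.Series(flags, index=df.index)
selected = df.loc[mask]

# A plain list skips alignment; only its length has to match len(df).
selected_from_list = df.loc[flags]

print(selected)
```

Series(flags) without an index gets the default labels 0, 1, 2, which do not match the Loan_ID labels, and that mismatch is what the "Unalignable boolean Series" message refers to.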

Edit 3:


import pandas as pd 
mydict = [

{"a": 1, "b": 2, "c": 3, "d": 4},

{"a": 100, "b": 200, "c": 300, "d": 400},

{"a": 1000, "b": 2000, "c": 3000, "d": 4000},
]

df = pd.DataFrame(mydict)

print(df)

print(df.iloc[[True, False, True]])

gives

a     b     c     d
0     1     2     3     4
1   100   200   300   400
2  1000  2000  3000  4000
      a     b     c     d
0     1     2     3     4
2  1000  2000  3000  4000

This works on the code above, where the number of rows equals the number of booleans, but it generates an error with

print(df.iloc[[True, True]])
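One possible way around the length mismatch, sketched under the assumption that flags for the missing rows should count as False: build a full-length mask first and copy the known flags into it. The names short_flags and mask are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 100, 1000], "b": [2, 200, 2000]})

short_flags = [True, True]  # fewer flags than rows

# Start from an all-False mask as long as the frame,
# then copy the known flags into its first positions.
mask = np.zeros(len(df), dtype=bool)
mask[: len(short_flags)] = short_flags

subset = df.iloc[mask]
print(subset)
```

The same full-length mask can be passed to df.loc, since a boolean NumPy array is applied by position in both cases.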

Edit 4:

condition_loc = list(filter(lambda x:x.head()>500,df.loc))

gives

KeyError: 0

The above exception was the direct cause of the following exception:

    raise KeyError(key) from err
KeyError: 0

Edit 5:

boolean = list(compress(loan_df, list1)) 
print(boolean )

prints column names!
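A short sketch of why column names come back, using a hypothetical three-row frame: iterating over a DataFrame yields its column labels, so compress pairs the flags with columns rather than rows. Passing df.index instead pairs the flags with row labels, which loc can then look up.

```python
from itertools import compress

import pandas as pd

# Hypothetical stand-in for the loan data, indexed by Loan_ID.
df = pd.DataFrame(
    {"Gender": ["Male", "Male", "Male"], "LoanAmount": [100, 128, 66]},
    index=pd.Index(["LP001002", "LP001003", "LP001005"], name="Loan_ID"),
)
list1 = [True, False, True]

# Iterating a DataFrame walks over column labels, not rows.
columns_picked = list(compress(df, list1))

# Iterating the index walks over row labels instead.
labels_picked = list(compress(df.index, list1))
rows = df.loc[labels_picked]

print(columns_picked)
print(rows)
```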

Edit 6:

list1 = [True, False, True]

boolean = list(compress(df, list1))
for i in boolean:
    print(df.loc[boolean])

gives

raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index(['Gender', 'Dependents'], dtype='object', name='Loan_ID')] are in the [index]"

Edit 7: iloc issue resolved

all_rows_df = list(range(0, len(df)))  # gives integer values
boolean = list(compress(all_rows_df, list1))  # selects values by comparison
print(boolean)
for i in boolean:
    print(i)
    print(df.iloc[i])  # index position of rows as an integer or list of integers

gives

[0, 2]
Gender                   Male
Married                    No
Dependents                  0
Education            Graduate
Self_Employed              No
ApplicantIncome          5849
CoapplicantIncome         0.0
LoanAmount                NaN
Loan_Amount_Term        360.0
Credit_History            1.0
Property_Area           Urban
Loan_Status                 Y
Name: LP001002, dtype: object
Gender                   Male
Married                   Yes
Dependents                  0
Education            Graduate
Self_Employed             Yes
ApplicantIncome          3000
CoapplicantIncome         0.0
LoanAmount               66.0
Loan_Amount_Term        360.0
Credit_History            1.0
Property_Area           Urban
Loan_Status                 Y
Name: LP001005, dtype: object


But the same method gives an error with loc:

[0, 2]
0
KeyError: 0

The above exception was the direct cause of the following exception:

    return self._getitem_axis(maybe_callable, axis=axis)
  
    return self._get_label(key, axis=axis)
  
    return self.obj.xs(label, axis=axis)
  
    loc = index.get_loc(key)
  
    raise KeyError(key) from err
KeyError: 0

Currently I am stuck on this
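For what it is worth, the KeyError: 0 is consistent with loc looking up labels while iloc looks up positions: the frame is indexed by Loan_ID, so there is no row labelled 0. A minimal sketch, assuming a small hypothetical frame, of turning positions into labels before calling loc:

```python
import pandas as pd

# Hypothetical stand-in for the loan data, indexed by Loan_ID.
df = pd.DataFrame(
    {"LoanAmount": [100, 128, 66]},
    index=pd.Index(["LP001002", "LP001003", "LP001005"], name="Loan_ID"),
)
positions = [0, 2]

by_position = df.iloc[positions]  # iloc takes integer positions

labels = df.index[positions]      # look up the labels at those positions
by_label = df.loc[labels]         # loc takes index labels

print(by_label)
```

Both selections return the same rows here; df.loc[0] fails only because 0 is not one of the Loan_ID labels.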

Subham
  • This: `filtered = Series([True, False, True], index=df.index)`. Your `df` has 614 rows. How can it map to the 3 booleans in the `Series` you are creating? – Code Different Aug 07 '22 at 15:51
  • Does this answer your question? [ValueError: Length of values does not match length of index | Pandas DataFrame.unique()](https://stackoverflow.com/questions/42382263/valueerror-length-of-values-does-not-match-length-of-index-pandas-dataframe-u) – Ynjxsjmh Aug 07 '22 at 15:53
  • How do I use loc and iloc with a Series? – Subham Aug 07 '22 at 16:13
  • Why are you trying to compare just the first 5 rows to the whole dataframe? `df.loc[df. LoanAmount.head() < 500]`? – BeRT2me Aug 07 '22 at 19:18
  • `Intention is to generate given a list of boolean values select rows where value is true` Have you tried using a list of booleans which is the same length as the number of rows in your dataset? – Nick ODell Aug 07 '22 at 19:33
  • I needed reduced output, hence head(). Generating a list of 614 booleans is cumbersome! It works on very small data. – Subham Aug 07 '22 at 22:37
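A sketch of two ways to get reduced output without writing out a full-length list of booleans, assuming a hypothetical LoanAmount column with made-up values. In both, the comparison itself produces a mask whose index matches the frame it is applied to.

```python
import pandas as pd

# Made-up values for illustration only.
df = pd.DataFrame({"LoanAmount": [100, 128, 66, 120, 600, 700, 90]})

# Option 1: take the first rows, then filter within them.
first_rows = df.head()
small_in_head = first_rows.loc[first_rows["LoanAmount"] < 500]

# Option 2: filter the whole frame, then show only the first matches.
first_small = df.loc[df["LoanAmount"] < 500].head()

print(small_in_head)
print(first_small)
```

The two options answer different questions: the first keeps matches among the first five rows, the second keeps the first five matches overall.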

1 Answer

You need to write your own function that first converts the values to a string, then splits that string and prints the result back to the screen.

loan_amt = str(loan_df.LoanAmount.head())