I have the following DataFrame:

import pandas as pd

numbers = {'set_of_numbers': [1,2,3,4,5,6,7,8,9,10]}
df = pd.DataFrame(numbers,columns=['set_of_numbers'])

df['equal_or_lower_than_4?'] = df['set_of_numbers'].apply(lambda x: 'True' if x <= 4 else 'False')

print (df)

   set_of_numbers equal_or_lower_than_4?
0               1                   True
1               2                   True
2               3                   True
3               4                   True
4               5                  False
5               6                  False
6               7                  False
7               8                  False
8               9                  False
9              10                  False

When I apply the built-in all() function to the last column, it returns True even though some of the values are False:

all(df['equal_or_lower_than_4?'])

#Out[29]:

#True
Andreas
Asser

1 Answer

  1. You can simplify the code by using df['set_of_numbers'] <= 4, which stores real booleans instead of the strings 'True' and 'False':

import pandas as pd

numbers = {'set_of_numbers': [1,2,3,4,5,6,7,8,9,10]}
df = pd.DataFrame(numbers,columns=['set_of_numbers'])

df['equal_or_lower_than_4?'] = df['set_of_numbers'] <= 4


df['equal_or_lower_than_4?']

Out[69]: 
0     True
1     True
2     True
3     True
4    False
5    False
6    False
7    False
8    False
9    False
Name: equal_or_lower_than_4?, dtype: bool
  2. Why does all(df['equal_or_lower_than_4?']) return True? The built-in all() is documented as:

The all() function returns True if all items in an iterable are true, otherwise it returns False.

If the iterable object is empty, the all() function also returns True.
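The rules quoted above can be checked with plain lists. This is only a small sketch; the variable names are made up for illustration and do not come from the question.

```python
# One falsy item is enough to make all() return False.
mixed_bools = all([True, True, False])

# An empty iterable has no falsy item, so all() returns True.
empty = all([])

# Non-empty strings are truthy, whatever text they contain.
strings = all(['True', 'False'])

print(mixed_bools, empty, strings)
```

The last case is the one that matters here: the text 'False' is still a non-empty string.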

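all() iterates over the Series element by element and tests the truthiness of each value. In the question, the lambda stores the strings 'True' and 'False' rather than booleans, and every non-empty string is truthy (bool('False') is True), so all() returns True. Calling .tolist() first does not change this. With the boolean column from point 1, all(df['equal_or_lower_than_4?']), df['equal_or_lower_than_4?'].all() and np.all() all return False as expected: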

import numpy as np
print(np.all(df['equal_or_lower_than_4?']))
#False
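To see the difference side by side, here is a minimal sketch that builds both versions of the column from the question's data. The names as_strings, as_bools and the *_result variables are hypothetical, chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({'set_of_numbers': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})

# String labels, as in the question: every element is a non-empty string.
as_strings = df['set_of_numbers'].apply(lambda x: 'True' if x <= 4 else 'False')

# Real booleans, as in point 1.
as_bools = df['set_of_numbers'] <= 4

strings_result = all(as_strings)   # truthy strings only
bools_result = all(as_bools)       # contains real False values
method_result = as_bools.all()     # pandas' own reduction on the boolean Series

# If the string column already exists, comparing against the label
# recovers a boolean Series that can then be reduced.
recovered_result = (as_strings == 'True').all()

print(strings_result, bools_result, method_result, recovered_result)
```

Only the string-based column reports True; the three boolean-based checks agree on False. Series.all() is usually the most direct choice when the data already lives in pandas.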
Andreas