You can use the isna and notna Series methods, which are concise and readable.
import pandas as pd
import numpy as np
df = pd.DataFrame({'value': [3, 4, 9, 10, 11, np.nan, 12]})
available = df.query("value.notna()")
print(available)
#    value
# 0    3.0
# 1    4.0
# 2    9.0
# 3   10.0
# 4   11.0
# 6   12.0
not_available = df.query("value.isna()")
print(not_available)
#    value
# 5    NaN
If you have numexpr installed, you need to pass engine="python" for these method calls to work with .query. pandas recommends numexpr to speed up .query on larger datasets.
available = df.query("value.notna()", engine="python")
print(available)
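If the same code has to run on machines with and without numexpr, one option is to pass engine="python" unconditionally. The sketch below assumes that approach; the helper name query_python is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value': [3, 4, 9, 10, 11, np.nan, 12]})


def query_python(frame, expr):
    """Hypothetical helper: always evaluate `expr` with the Python engine."""
    return frame.query(expr, engine="python")


available = query_python(df, "value.notna()")
not_available = query_python(df, "value.isna()")

print(available.index.tolist())      # [0, 1, 2, 3, 4, 6]
print(not_available.index.tolist())  # [5]
```

This gives up any numexpr speed-up for these particular queries, which may matter on large frames.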
Alternatively, you can use the top-level pd.isna function by referencing it as a local variable with the @ prefix. Again, passing engine="python" is required when numexpr is present.
import pandas as pd
import numpy as np
df = pd.DataFrame({'value': [3, 4, 9, 10, 11, np.nan, 12]})
df.query("@pd.isna(value)")
#    value
# 5    NaN
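The same @ reference should work for the complementary top-level function pd.notna. A minimal sketch, assuming engine="python" is passed explicitly so the result does not depend on whether numexpr is installed:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value': [3, 4, 9, 10, 11, np.nan, 12]})

# "@pd" looks up the name "pd" in the calling scope, so the
# top-level functions can be called on the column directly.
missing = df.query("@pd.isna(value)", engine="python")
present = df.query("@pd.notna(value)", engine="python")

print(missing.index.tolist())  # [5]
print(present.index.tolist())  # [0, 1, 2, 3, 4, 6]
```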