I have a series s whose entries are lists, for example [1, 2, 3, NaN, NaN] or [4, 5]. These lists may contain NaNs as the last few elements, and I want to drop all entries in this series that contain NaN. So far I have used s.transform(lambda x: np.nan if np.isnan(x).any() else x).dropna(), but this takes over a minute on just 21 million rows, and I eventually plan to do this with tens of billions of rows, so I need something fast. Thank you!
To emphasize, each entry in the series is a list, so I cannot just use s.dropna(): no entry is itself NaN, since the entries are lists themselves. I want to delete the lists (entries) that CONTAIN NaN. This is what the series s might look like: pd.Series([[1, 2, 3, NaN, NaN], [4, 5], ...]).
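For reference, here is a minimal reproducible sketch of the setup and the approach described above. The four sample lists are made up for illustration (only the first two shapes come from my description), and NaN is written as np.nan so the snippet runs as is.

```python
import numpy as np
import pandas as pd

# Small stand-in for the real series (illustrative data, not the real rows):
# each entry is a list, and NaNs, when present, sit at the end of the list.
s = pd.Series([[1, 2, 3, np.nan, np.nan], [4, 5], [6, np.nan], [7, 8, 9]])

# Current approach: replace every list that contains a NaN with a scalar NaN,
# then let dropna() remove those entries.
result = s.transform(lambda x: np.nan if np.isnan(x).any() else x).dropna()

print(result.tolist())  # [[4, 5], [7, 8, 9]]
```

The entries at positions 0 and 2 are removed because they contain NaN; the surviving entries keep their original index labels (1 and 3).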