I have a DataFrame with a column whose values are either np.nan or a variable-length list of strings.
Simply put, what I want is exactly the same as the accepted answer (from @Emre) here:
https://datascience.stackexchange.com/questions/11797/split-a-list-of-values-into-columns-of-a-dataframe
The problem I have is the np.nan values in my column, which are absent in the accepted answer above.
When I run the code I get this error:
Traceback (most recent call last):
  File "C:/Users/Mark/PycharmProjects/main/main.py", line 76, in <module>
    for i in frozenset.union(*fcc['JobRoleInterest']):
TypeError: descriptor 'union' for 'frozenset' objects doesn't apply to a 'float' object
So I changed all of the np.nan values to None, but now I get this:
Traceback (most recent call last):
  File "C:/Users/Mark/PycharmProjects/main/main.py", line 76, in <module>
    for i in frozenset.union(*fcc['JobRoleInterest']):
TypeError: descriptor 'union' for 'frozenset' objects doesn't apply to a 'NoneType' object
Here is the code section I am working on:
# https://stackoverflow.com/questions/14162723/replacing-pandas-or-numpy-nan-with-a-none-to-use-with-mysqldb/54403705
# fcc = fcc.where(pd.notnull(fcc), None) # Entire df of np.nan replaced with None
fcc['JobRoleInterest'] = fcc['JobRoleInterest'].where(pd.notnull(fcc['JobRoleInterest']), None)
# fcc['JobRoleInterest'] = None if fcc['JobRoleInterest'] == np.nan else fcc['JobRoleInterest']
for i in frozenset.union(*fcc['JobRoleInterest']):
    fcc[i] = fcc.apply(lambda _: int(i in _.i), axis=1)
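For reference, here is a minimal sketch of the result I am after, written against made-up data. The sample frame, the role names, and the helper `to_role_set` are all hypothetical and only stand in for my real `fcc` data. As far as I can tell, the unbound `frozenset.union(...)` call needs a frozenset as its first argument and iterables after it, so a NaN or None anywhere in the column breaks it. One possible workaround might be to turn every cell into a frozenset first, with missing values becoming an empty set:

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the real fcc DataFrame.
fcc = pd.DataFrame({
    "JobRoleInterest": [
        ["Web Developer", "Data Scientist"],
        np.nan,
        ["Data Scientist"],
    ]
})


def to_role_set(value):
    """Return the cell as a frozenset; anything that is not a collection
    (NaN, None) becomes an empty frozenset. Hypothetical helper."""
    if isinstance(value, (list, tuple, set, frozenset)):
        return frozenset(value)
    return frozenset()


# One frozenset per row, with no NaN/None left in the Series.
role_sets = fcc["JobRoleInterest"].apply(to_role_set)

# Calling union on an empty frozenset instance also copes with an empty column.
all_roles = sorted(frozenset().union(*role_sets))

# One 0/1 indicator column per distinct role.
for role in all_roles:
    fcc[role] = role_sets.apply(lambda roles, role=role: int(role in roles))

print(fcc)
```

With this sketch, rows that held np.nan simply get 0 in every indicator column. I have not checked whether this is the best approach for a large frame.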