This question pertains to the fine solution to my previous question, "Create Multiple New Columns Based on Pipe-Delimited Column in Pandas".
I have a pipe-delimited column that I want to convert into multiple new columns, each counting how often an element occurs in that row's pipe-string. The solution I was given works except for rows where the pertinent column is empty: there it leaves NaN/blanks instead of 0s. Other than converting NaN to 0 after the fact, is there a way to augment the current solution?
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.array([
[1202, 2007, 99.34,None],
[9321, 2009, 61.21,'12|34'],
[3832, 2012, 12.32,'12|12|34'],
[1723, 2017, 873.74,'28|13|51']]),
columns=['ID', 'YEAR', 'AMT','PARTS'])
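# First attempt: str.get_dummies() yields 0/1 indicators, not occurrence counts
part_dummies = df1.PARTS.str.get_dummies().add_prefix('Part_')
# join_axes was removed in pandas 1.0; part_dummies shares df1's index, so a plain concat is equivalent here
print(pd.concat([df1, part_dummies], axis=1))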
# Expected Output:
#   ID    YEAR  AMT     PART_12  PART_34  PART_28  PART_13  PART_51
#   1202  2007  99.34   0        0        0        0        0
#   9321  2009  61.21   1        1        0        0        0
#   3832  2012  12.32   2        1        0        0        0
#   1723  2017  873.74  0        0        1        1        1
# Actual Output:
#   ID    YEAR  AMT     PART_12  PART_34  PART_28  PART_13  PART_51
#   1202  2007  99.34   0        0        0        0        0
#   9321  2009  61.21   1        1        0        0        0
#   3832  2012  12.32   1        1        0        0        0
#   1723  2017  873.74  0        0        1        1        1
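# Solution from the previous question: counts correctly, but the row with no PARTS comes out as NaN
# (.sum(level=0) was removed in pandas 2.0; .groupby(level=0).sum() is the equivalent)
part_dummies = pd.get_dummies(df1.PARTS.str.split('|', expand=True).stack()).groupby(level=0).sum().add_prefix('Part_')
print(pd.concat([df1, part_dummies], axis=1))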
#   ID    YEAR  AMT     PART_12  PART_13  PART_28  PART_34  PART_51
#   1202  2007  99.34   NaN      NaN      NaN      NaN      NaN
#   9321  2009  61.21   1        0        0        1        0
#   3832  2012  12.32   2        0        0        1        0
#   1723  2017  873.74  0        1        1        0        1
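One direction that might avoid patching NaN after the concat is sketched below. It is only a sketch, assuming a reasonably recent pandas (0.25 or later for `Series.explode`): the count frame is reindexed to `df1.index` with `fill_value=0` before it is joined, so rows with no PARTS get zeros from the start. The names `parts`, `counts`, `part_counts`, and `result` are illustrative, and the frame is rebuilt from a dict rather than `np.array` purely to keep the example self-contained.

```python
import pandas as pd

# Same sample data as above, built from a dict (an assumption made for brevity)
df1 = pd.DataFrame({
    'ID': [1202, 9321, 3832, 1723],
    'YEAR': [2007, 2009, 2012, 2017],
    'AMT': [99.34, 61.21, 12.32, 873.74],
    'PARTS': [None, '12|34', '12|12|34', '28|13|51'],
})

# One row per part, keeping the original row label as the index;
# rows whose PARTS value is missing are dropped at this stage.
parts = df1['PARTS'].str.split('|').explode().dropna()

# Sum the indicator columns per original row to get occurrence counts.
counts = pd.get_dummies(parts).groupby(level=0).sum()

# Reindex against the full index so missing rows are filled with 0, not NaN.
part_counts = (
    counts.reindex(df1.index, fill_value=0)
          .astype(int)
          .add_prefix('Part_')
)

result = pd.concat([df1, part_counts], axis=1)
print(result)
```

Because the zeros are supplied during the reindex, the count columns stay integer-typed instead of being upcast to float by the NaN values.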