I have the following pandas DataFrame:
import pandas as pd

data = pd.DataFrame(
    {
        "client": ["first", "second", "third", "fourth", "fifth", "sixth", "seventh", "eighth", "ninth", "tenth", "eleventh"],
        "Lifetime": [24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24],
        "Tokens": [30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30],
        "path": ["kyc", "co", "5dimes", "la", "la", "ku", "pv", "ipv", "lv", "7d", "222"],
        # Each row holds a list of field names; the lists differ in length and content.
        "requiredFields": [
            ['address', 'city', 'country', 'dobDay', 'dobMonth', 'dobYear', 'firstName', 'lastName', 'ssn', 'state', 'zip'],
            ['address', 'country', 'dobDay', 'dobMonth', 'dobYear', 'firstName', 'lastName', 'ssn', 'state', 'zip'],
            ['address', 'country', 'dobDay', 'dobMonth', 'dobYear', 'firstName', 'lastName', 'state', 'zip'],
            ['city', 'country', 'dobDay', 'dobMonth', 'dobYear', 'firstName', 'lastName', 'ssn', 'state', 'zip'],
            ['city', 'country', 'dobDay', 'dobMonth', 'dobYear', 'firstName', 'lastName', 'ssn', 'zip'],
            ['city', 'country', 'dobDay', 'dobMonth', 'dobYear', 'firstName', 'lastName', 'ssn'],
            ['city', 'country', 'dobDay', 'dobMonth', 'dobYear', 'firstName', 'lastName', 'state', 'zip'],
            ['country', 'dobDay', 'dobMonth', 'dobYear', 'firstName', 'lastName', 'ssn', 'state', 'zip'],
            ['country', 'dobDay', 'dobMonth', 'dobYear', 'firstName', 'lastName', 'ssn', 'zip'],
            ['country', 'dobDay', 'dobMonth', 'dobYear', 'firstName', 'lastName', 'state', 'zip'],
            ['dobDay', 'dobMonth', 'dobYear', 'firstName', 'lastName']
        ],
        "userIdRequired": [True, True, True, True, True, True, True, True, True, True, True],
    }
)

What I want is to turn each item in the requiredFields lists into its own column. The column name should be the list item, and the value should be "y" when the row's list contains that item and None otherwise. Something like this:
client | Lifetime | Tokens | path | requiredFields | userIdRequired | address | city | country | dobDay | dobMonth | dobYear | firstName | lastName | ssn | state | zip |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
first | 24 | 30 | kyc | [address, city, country, dobDay, dobMonth, dobYear, firstName, lastName, ssn, state, zip] | True | y | y | y | y | y | y | y | y | y | y | y |
second | 24 | 30 | co | [address, country, dobDay, dobMonth, dobYear, firstName, lastName, ssn, state, zip] | True | y | None | y | y | y | y | y | y | y | y | y |
third | 24 | 30 | 5dimes | [address, country, dobDay, dobMonth, dobYear, firstName, lastName, state, zip] | True | y | None | y | y | y | y | y | y | None | y | y |
fourth | 24 | 30 | la | [city, country, dobDay, dobMonth, dobYear, firstName, lastName, ssn, state, zip] | True | None | y | y | y | y | y | y | y | y | y | y |
fifth | 24 | 30 | la | [city, country, dobDay, dobMonth, dobYear, firstName, lastName, ssn, zip] | True | None | y | y | y | y | y | y | y | y | None | y |
sixth | 24 | 30 | ku | [city, country, dobDay, dobMonth, dobYear, firstName, lastName, ssn] | True | None | y | y | y | y | y | y | y | y | None | None |
seventh | 24 | 30 | pv | [city, country, dobDay, dobMonth, dobYear, firstName, lastName, state, zip] | True | None | y | y | y | y | y | y | y | None | y | y |
eighth | 24 | 30 | ipv | [country, dobDay, dobMonth, dobYear, firstName, lastName, ssn, state, zip] | True | None | None | y | y | y | y | y | y | y | y | y |
ninth | 24 | 30 | lv | [country, dobDay, dobMonth, dobYear, firstName, lastName, ssn, zip] | True | None | None | y | y | y | y | y | y | y | None | y |
tenth | 24 | 30 | 7d | [country, dobDay, dobMonth, dobYear, firstName, lastName, state, zip] | True | None | None | y | y | y | y | y | y | None | y | y |
eleventh | 24 | 30 | 222 | [dobDay, dobMonth, dobYear, firstName, lastName] | True | None | None | None | y | y | y | y | y | None | None | None |
I can't use apply(pd.Series), explode, or anything similar, because the values then end up in different columns depending on their position in each list. I also tried the solution from "Pandas split a column of unequal length lists into multiple boolean columns", but it generates duplicated columns.
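To make the ordering problem concrete, here is a minimal sketch on a hypothetical two-row frame cut down from the data above. The names small and expanded are made up for this illustration. Expanding the lists with apply(pd.Series) produces positional columns (0, 1, 2), so the same field lands in a different column from row to row:

```python
import pandas as pd

# Hypothetical two-row frame, reduced from the data above for illustration.
small = pd.DataFrame(
    {
        "client": ["first", "fourth"],
        "requiredFields": [
            ["address", "city", "country"],
            ["city", "country"],
        ],
    }
)

# Each list becomes a Series indexed by position, so the new columns
# are labelled 0, 1, 2 rather than by field name.
expanded = small["requiredFields"].apply(pd.Series)
print(expanded)
```

In the first row column 0 holds "address", while in the second row column 0 holds "city", which is the misalignment described above.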