I am working in Numpy and Pandas.
I have a table of loans with the features country and sector.
# LOAN | SECTOR | COUNTRY |
---|---|---|
Loan 1 | food | germany |
Loan 2 | telecom | italy |
Loan 3 | auto | japan |
Loan 4 | food | japan |
Loan 5 | telecom | germany |
Loan 6 | auto | italy |
I need to drop the duplicates by the sector and the country, ie. select the unique values of these 2 features, and use them as columns creating a table with boolean 1/0 if the loan is active in that country or sector, as follows:
# LOAN | food | telecom | auto | germany | italy | japan |
---|---|---|---|---|---|---|
Loan 1 | 1 | 0 | 0 | 1 | 0 | 0 |
Loan 2 | 0 | 1 | 0 | 0 | 1 | 0 |
Loan 3 | 0 | 0 | 1 | 0 | 0 | 1 |
Loan 4 | 1 | 0 | 0 | 0 | 0 | 1 |
Loan 5 | 0 | 1 | 0 | 1 | 0 | 0 |
Loan 6 | 0 | 0 | 1 | 0 | 1 | 0 |
So, Loan1 in the first table had food as sector and germany as country; then, in the second table it has 1 on columns food and germany and 0 on all the other columns.
It seems a pivot_table but I don't understand how I could put the 1/0 as values? Btw, what's the easiest way?
Thanks