I have a dataframe with one hot encoded categorical labels that looks like this:
id label1: val1 label1: val2 label2: val1 label2: val2
12 0 1 0 1
13 1 0 0 1
14 0 0 1 0
I want to transform this into something like this:
id label1 label2
12 val2 val2
13 val1 val2
14 Na val1
I am totally confused about how this can be achieved. Can anyone please guide me? The naive process I can think of is iterating over all columns and assign values accordingly:
for i in issues.columns[30:]:
for x in issues[str(i)]:
if x==0: continue
issues[i.split(':')[0]][x]=i.split(':')[1]
But this does not seem to work (maybe I am getting the logic wrong?)
edit ----------------
I need to replicate the solution for all similar hot-encoded columns in the same df.
df.columns[30:]
['Aspect: Documentation', 'Aspect: Packaging', 'Aspect: Performance',
'Aspect: Security', 'Aspect: Testing', 'Aspect: UX', 'Backport: 14',
'Backport: 15', 'Design Proposal: Accepted. PRs Welcome', 'Epic',
'Expeditor: Bump Version Minor', 'Expeditor: Skip All',
'Expeditor: Skip Changelog', 'Expeditor: Skip Habitat',
'Expeditor: Skip Omnibus', 'Expeditor: Skip Version Bump',
'Focus: Chef Server', 'Focus: Config', 'Focus: Data collector',
'Focus: SCM Resources', 'Focus: knife', 'Focus: knife bootstrap',
'Focus: knife ssh', 'Platform: Debian-like', 'Platform: Linux',
'Platform: RHEL-like', 'Platform: Unix-like', 'Platform: Windows',
'Platform: macOS', 'Priority: Critical', 'Priority: Low',
'Priority: Medium', 'Status: Adopted', 'Status: Good First Issue',
'Status: Help Wanted', 'Status: Incomplete', 'Status: Needs RFC',
'Status: Needs Replication', 'Status: Sustaining Backlog',
'Status: Untriaged', 'Status: Waiting on Contributor',
'Triage: Confirmed', 'Triage: Declined', 'Triage: Duplicate',
'Triage: Feature Request', 'Triage: Needs Information',
'Triage: Support', 'Type: Breaking Change', 'Type: Bug',
'Type: Build Breakage', 'Type: Chore', 'Type: Deprecation',
'Type: Design Proposal', 'Type: Enhancement', 'Type: Regression',
'Type: Tech Debt', 'Aspect: Integration', 'Aspect: Stability',
'Focus: knife search']
If one row has multiple labels associated with it (say Type:Tech Debt and Type: Enhancement
), we can add the labels one after in the corresponding row (ie Tech Debt,Enhancement
)