I have a large dataframe with 2 columns. Column 1 contains a small set of values that repeat many times, while column 2 contains only unique values. In other words, multiple values in column 2 correspond to one value in column 1.
As the data was acquired, each unique value in column 2 is listed on its own row, which means the values in column 1 are repeated.
I want to transform (essentially flip) the data so that I can see which column 2 values fall under each unique value in column 1.
For example, the dataframe currently looks like this:
Contig | Gene |
---|---|
C20 | G1 |
C10 | G2 |
C40 | G3 |
C20 | G4 |
C40 | G5 |
C30 | G6 |
And I want:
Contig | Gene |
---|---|
C10 | G2 |
C20 | G1, G4 |
C30 | G6 |
C40 | G3, G5 |
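
One possible approach is sketched below. It assumes the data is a pandas DataFrame with columns named `Contig` and `Gene`, which is an assumption on my part: it groups the rows by `Contig` and joins the gene names in each group into one comma-separated string.

```python
import pandas as pd

# Example data from the table above (pandas and these column names are assumptions).
df = pd.DataFrame({
    "Contig": ["C20", "C10", "C40", "C20", "C40", "C30"],
    "Gene": ["G1", "G2", "G3", "G4", "G5", "G6"],
})

# Group rows by Contig and join the Gene values of each group.
# groupby sorts the group keys by default, so the contigs come out in order.
grouped = (
    df.groupby("Contig")["Gene"]
      .agg(", ".join)
      .reset_index()
)

print(grouped)
```

If a real Python list per contig is more useful than a joined string, `.agg(list)` can be used in place of `.agg(", ".join)`.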
If I can only get the number of unique column 2 values for each column 1 value, that would also be okay:
Contig | Gene(s) |
---|---|
C10 | 1 |
C20 | 2 |
C30 | 1 |
C40 | 2 |
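
For the count-only version, a minimal sketch under the same assumption (a pandas DataFrame with columns `Contig` and `Gene`) could count the distinct genes per contig with `nunique`. The output column name `Gene(s)` is simply taken from the table above.

```python
import pandas as pd

# Example data from the question (pandas and these column names are assumptions).
df = pd.DataFrame({
    "Contig": ["C20", "C10", "C40", "C20", "C40", "C30"],
    "Gene": ["G1", "G2", "G3", "G4", "G5", "G6"],
})

# Count the distinct Gene values within each Contig group.
counts = (
    df.groupby("Contig")["Gene"]
      .nunique()
      .reset_index(name="Gene(s)")
)

print(counts)
```

Since every gene is unique here, `size()` would give the same numbers; `nunique()` only differs if a gene could appear more than once under the same contig.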
I hope this makes sense. I've been struggling to find the right keywords to describe this problem and really don't know where to begin, although I get the feeling I should maybe turn the data into a list.