Here is a problem I am finding difficult.

I have a DataFrame in Python. It looks like this:

   chromosome_id  start_site  stop_site strand          gene_id
0              1       12228      12612      +  ENST00000456328
1              1       12722      13220      +  ENST00000456328
2              1       12058      12178      +  ENST00000450305
3              1       12228      12612      +  ENST00000450305
4              1       12698      12974      +  ENST00000450305

The last column contains duplicates. I want a new column that tells me, for each row, how many times its gene_id has appeared so far. For the data above, that would be: 1 2 1 2 3
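One possible way to get such a column, assuming the DataFrame is a pandas DataFrame, is to group by gene_id and number the rows within each group. This is only a sketch: the column name `occurrence` is made up for illustration, and the frame is rebuilt by hand from the sample above.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "chromosome_id": [1, 1, 1, 1, 1],
    "start_site": [12228, 12722, 12058, 12228, 12698],
    "stop_site": [12612, 13220, 12178, 12612, 12974],
    "strand": ["+"] * 5,
    "gene_id": [
        "ENST00000456328", "ENST00000456328",
        "ENST00000450305", "ENST00000450305", "ENST00000450305",
    ],
})

# cumcount() numbers the rows inside each gene_id group starting at 0,
# so adding 1 gives a 1-based running count.
# "occurrence" is a hypothetical column name, not part of the original data.
df["occurrence"] = df.groupby("gene_id").cumcount() + 1

print(df["occurrence"].tolist())
```

Note that this counts every earlier appearance of a gene_id, whether or not the duplicate rows are adjacent.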

I am not sure I have expressed this clearly, so here is another example. Suppose I have a list such as "apple apple apple banana banana orange orange orange orange". I want to get a new list like this: "1,2,3,1,2,1,2,3,4"

How can I do this in Python?

Thank you in advance!
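For the plain-list version, a minimal sketch in standard Python could keep a dictionary of how many times each item has been seen so far. The function name `running_counts` is hypothetical, chosen only for this example.

```python
from collections import defaultdict


def running_counts(items):
    """Return, for each item, how many times it has appeared so far (1-based)."""
    seen = defaultdict(int)  # item -> number of occurrences seen so far
    result = []
    for item in items:
        seen[item] += 1
        result.append(seen[item])
    return result


fruits = "apple apple apple banana banana orange orange orange orange".split()
counts = running_counts(fruits)
print(",".join(map(str, counts)))  # prints 1,2,3,1,2,1,2,3,4
```

This version counts all earlier occurrences of an item, so it gives the same result even when equal items are not next to each other.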

yatu
ruiyan hou

0 Answers