I am working with some data where one of the columns looks like this:
21070808(136)|19995886(87)|21280165(66)
20226255(57)|21440646(54)
...
Just to be clear, this is a single column. Each number outside the parentheses is a publication id (e.g., 21070808), and the number in parentheses is the number of citations that publication received (e.g., publication 21070808 received 136 citations).
For each observation, I would like to count the number of publications as well as the total number of citations. For instance, taking the two observations above, I would like to get two columns (column1 = Number of publications and column2 = Citations):
Number of publications - Citations
3 - 289
2 - 111
I have tried to look for solutions in R/Stata but could not get anything to work. For the number of publications, I think I could just count the "|" characters and add 1. But for the total number of citations, I am a bit more confused...
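For what it is worth, here is a minimal Python sketch of one possible approach, assuming every entry has exactly the form id(citations) and entries are separated by "|". The function name summarize and the rows list are just illustrative names, not part of my actual data. Instead of counting "|" characters, it pulls out every number inside parentheses with a regular expression, so the number of matches gives the publication count and their sum gives the citations:

```python
import re

def summarize(cell):
    """Return (number of publications, total citations) for one cell.

    Assumes each entry looks like '<publication id>(<citations>)'.
    """
    # Capture only the digits that sit inside parentheses.
    citations = [int(n) for n in re.findall(r"\((\d+)\)", cell)]
    return len(citations), sum(citations)

# The two example observations from above.
rows = [
    "21070808(136)|19995886(87)|21280165(66)",
    "20226255(57)|21440646(54)",
]

for row in rows:
    n_pubs, total = summarize(row)
    print(n_pubs, "-", total)
# prints:
# 3 - 289
# 2 - 111
```

One difference from the "count the | and add 1" idea: an empty cell gives 0 publications here, whereas the separator count would give 1.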
Any help would be really appreciated. I am indifferent between R/Stata (and even Python) :)