I have the following problem: my R data set consists of two columns (anonymized usage data from a piece of software).
`data.frame(cmid=c(925390,925390,935392,935393,935392), userid=c(14686,14686,14686,96350,44451))`
From this data set I would like to create a new data set that has one row per userid and one column per cmid, where each cell holds the number of times that userid occurs with that cmid, so that each userid and each cmid appears only once. Accordingly, the data set should look like this:
userid | 925390 | 935392 | 935393 |
---|---|---|---|
14686 | 2 | 1 | 0 |
44451 | 0 | 1 | 0 |
96350 | 0 | 0 | 1 |
Since the data set consists of 40717 rows and the number of distinct userid and cmid values is therefore very large, I am looking for an automated solution. I do not have any approach for this at the moment. I have already tried to get further with the `summarise` or `count` functions, but unfortunately without any success.
Does anyone have a tip for this?