I have a dataset that looks like this:
UserID  Query   Asthma  Stroke
142     abc dr  0       0
142     asthma  1       0
142     stroke  0       1
145     stroke  0       1
145     pizza   0       0
There are hundreds of thousands of UserIDs, and each user submitted a variable number of queries. For further analysis, I need to sum the "Asthma" and "Stroke" columns for each UserID. Any advice? Can you recommend resources for dealing with this type of dataset?
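To make the goal concrete, here is a minimal sketch of the per-user totals I am after, written in plain Python against the sample rows only. The names `rows` and `totals` are just placeholders I made up, and I am assuming each column should be summed separately per UserID; the real data would of course be read from a file rather than typed in.

```python
from collections import defaultdict

# Sample rows from the table above: (UserID, Query, Asthma, Stroke).
rows = [
    (142, "abc dr", 0, 0),
    (142, "asthma", 1, 0),
    (142, "stroke", 0, 1),
    (145, "stroke", 0, 1),
    (145, "pizza", 0, 0),
]

# Running totals per user: UserID -> [asthma_total, stroke_total].
totals = defaultdict(lambda: [0, 0])
for user_id, _query, asthma, stroke in rows:
    totals[user_id][0] += asthma
    totals[user_id][1] += stroke

# One output line per user, in UserID order.
for user_id, (asthma_total, stroke_total) in sorted(totals.items()):
    print(user_id, asthma_total, stroke_total)
# prints:
# 142 1 1
# 145 0 1
```

If the data were loaded into a pandas DataFrame `df` with these column names (an assumption on my part), I believe the equivalent would be `df.groupby("UserID")[["Asthma", "Stroke"]].sum()`.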
Thank you in advance... I'm very new to this.