Right now I have a dataset of 1206 participants who have each endorsed a certain number of traumatic experiences and a number of symptoms associated with the trauma.
This is part of my dataframe (full dataframe is 1206 rows long):
SubjectID | PTSD_Symptom_Sum | PTSD_Trauma_Sum |
---|---|---|
1223 | 3 | 5 |
1224 | 4 | 2 |
1225 | 2 | 6 |
1226 | 0 | 3 |
I have two issues that I am trying to figure out:
- I was able to create a scatter plot, but I can't tell from this plot how many participants are in each data point. Is there any easy way to see the number of subjects in each data point?
I used this code to create the scatterplot:
plt.scatter(PTSD['PTSD_Symptom_SUM'], PTSD['PTSD_Trauma_SUM'])
plt.title('Trauma Sum vs. Symptoms')
plt.xlabel('Symptoms')
plt.ylabel('Trauma Sum')
- I haven't been able to successfully produce a list of the number of people endorsing each pair of items (symptoms and trauma number). I am able to run this code to create the counts for the number of people in each category: :
count_sum= PTSD['PTSD_SUM'].value_counts()
count_symptom_sum= PTSD['PTSD_symptom_SUM'].value_counts()
print(count_sum)
print(count_symptom_sum)
Which produces this output:
0 379
1 371
2 248
3 130
4 47
5 17
6 11
8 2
7 1
Name: PTSD_SUM, dtype: int64
0 437
1 418
2 247
3 74
4 23
5 4
6 3
Name: PTSD_symptom_SUM, dtype: int64
Is it possible to alter the code to count the number of people endorsing each pair of items (symptom number and trauma number)? If not, are there any functions that would allow me to do this?