I have a dictionary in which each key is the length of a column value and each value is the number of times that length occurred. How can I draw a boxplot from this data? Do I have to convert it to a list first and then call the plot?
from collections import defaultdict
sample_data = [1, 2, 3, 1, 1, 2, 4, 5]
sample_dict = defaultdict(int)
for i in sample_data:
    sample_dict[i] += 1
print(sample_dict)
defaultdict(<class 'int'>, {1: 3, 2: 2, 3: 1, 4: 1, 5: 1})
The dictionary above is how my data is currently stored. My dataset is huge, so I chose this representation. Is converting the dictionary into a list the way to plot a boxplot, i.e., do I need to build a list that contains each length repeated as many times as it occurs? TIA!
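For reference, this is a minimal sketch of the list conversion I am asking about, using only the standard library. The names `counts`, `expanded` and `draw_boxplot` are made up for illustration, and the plotting helper assumes matplotlib is installed.

```python
from itertools import chain, repeat

# Same shape as sample_dict above: {length: number_of_occurrences}
counts = {1: 3, 2: 2, 3: 1, 4: 1, 5: 1}

# Repeat each length as many times as it occurred, in ascending order.
expanded = list(
    chain.from_iterable(repeat(length, n) for length, n in sorted(counts.items()))
)
print(expanded)  # [1, 1, 1, 2, 2, 3, 4, 5]


def draw_boxplot(values):
    """Hypothetical helper: draw a boxplot of the expanded values."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, ax = plt.subplots()
    ax.boxplot(values)
    return ax
```

Calling `draw_boxplot(expanded)` would then plot it. The catch is that the expanded list has one element per occurrence, so its memory use grows with the sum of all counts, which is exactly what I was trying to avoid with the dictionary.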
My dataframe looks like this (only the head is shown):
|    | len | num_occurrence | dt. |
|---:|-----------:|-------------:|:--------------------|
| 0 | 183 | 599 | 2022-11-24 00:00:00 |
| 1 | 176 | 1029 | 2022-12-15 00:00:00 |
| 2 | 2 | 24 | 2022-12-02 00:00:00 |
| 3 | 18 | 449343 | 2022-12-09 00:00:00 |
| 4 | 45 | 640937 | 2022-12-09 00:00:00 |
Currently I plot it like this:
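import matplotlib.pyplot as plt
import seaborn as sns

# One box per date; each row counts once, regardless of its occurrence count
sns.boxplot(x='dt_formatted', y='subd_len', data=subd_pdf)
plt.title('Distribution of length')
plt.xticks(rotation=90)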
But this does not take the frequency of occurrence into account.
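To make clear what I mean by taking the frequency into account, here is a rough sketch of the kind of result I am after: the box statistics are computed directly from (length, count) pairs without expanding them, and the precomputed statistics are handed to matplotlib's `Axes.bxp`. The helper names (`weighted_quantile`, `box_stats`, `plot_boxes`) are hypothetical, the rows are just the head of the dataframe shown above, and the quantile rule (smallest value whose cumulative count reaches the target) is an assumption of mine; it is not the interpolation that seaborn or matplotlib use by default, so the numbers can differ slightly from an expanded-list boxplot.

```python
from collections import defaultdict


def weighted_quantile(pairs, q):
    """Smallest value whose cumulative count reaches q * total (assumed rule)."""
    pairs = sorted(pairs)
    threshold = q * sum(n for _, n in pairs)
    running = 0
    for value, n in pairs:
        running += n
        if running >= threshold:
            return value
    return pairs[-1][0]


def box_stats(pairs, label):
    """Build the dict that Axes.bxp expects from (value, count) pairs."""
    q1 = weighted_quantile(pairs, 0.25)
    med = weighted_quantile(pairs, 0.50)
    q3 = weighted_quantile(pairs, 0.75)
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    values = sorted(v for v, n in pairs if n > 0)
    inside = [v for v in values if lo_fence <= v <= hi_fence]
    fliers = [v for v in values if v < lo_fence or v > hi_fence]
    return {"label": label, "q1": q1, "med": med, "q3": q3,
            "whislo": min(inside), "whishi": max(inside), "fliers": fliers}


def plot_boxes(stats_list):
    """Draw one box per entry from precomputed statistics (needs matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, ax = plt.subplots()
    ax.bxp(stats_list)
    ax.set_title("Distribution of length")
    ax.tick_params(axis="x", labelrotation=90)
    return ax


# (len, num_occurrence, date) rows copied from the dataframe head above
rows = [
    (183, 599, "2022-11-24"),
    (176, 1029, "2022-12-15"),
    (2, 24, "2022-12-02"),
    (18, 449343, "2022-12-09"),
    (45, 640937, "2022-12-09"),
]

# Group the (length, count) pairs by date, then compute one box per date.
by_date = defaultdict(list)
for length, count, date in rows:
    by_date[date].append((length, count))

stats = [box_stats(pairs, date) for date, pairs in sorted(by_date.items())]
print(stats[2]["q1"], stats[2]["med"], stats[2]["q3"])  # 18 45 45 for 2022-12-09
```

Calling `plot_boxes(stats)` followed by `plt.show()` would draw the boxes. Is something like this the right direction, or is there a built-in way to pass the counts as weights?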