
I am reading a CSV file and plotting the data in its "Toppings" column. However, cells that contain multiple values are read as a single value joined by "\r\n" (e.g. corn\r\ncucumber\r\npickles). Here is what the data looks like, and here is what the plot currently looks like (without any attempts at splitting).

How do I change my code so that it splits these cells, counts each topping separately, and plots the total number of requests for each topping? I do not want to write a split rule by hand for every possible combination, because there are many more items and combinations that I have excluded here for simplicity.

Here is my base code (without any attempts at splitting). Please be as specific as possible as I am still getting used to Python terms.

#These are a few libraries I have been working with.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import seaborn as sns
import math
import io
import textwrap
%matplotlib inline
sns.set_style('ticks')
import warnings
warnings.filterwarnings('ignore')

#access file
data = pd.read_csv('E:/testing_data.csv')
supplies = pd.DataFrame(data)

#plotting
sns.set(style="darkgrid")
# Set figure and font size before plotting so they apply to this figure
plt.rcParams["figure.figsize"] = (15, 10)
plt.rcParams['font.size'] = '12'
total = float(len(supplies))
ax = sns.countplot(x='Toppings',
                   hue="Toppings",
                   data=supplies, dodge=False,
                   order=supplies['Toppings'].value_counts().index)
plt.legend(loc='upper right', title='Supplies Needed')
plt.title('Distribution of Supplies')
plt.xlabel('Supplies', fontsize=15)
plt.ylabel('Number of Occurrences', fontsize=15)

for p in ax.patches:
    percentage = '{:.1f}%'.format(100 * p.get_height()/total)
    x = p.get_x() + p.get_width()/2
    y = p.get_height() + .01
    ax.annotate(percentage, (x, y), ha='center', color='black', size=10)

labels = [textwrap.fill(label.get_text(), 15) for label in ax.get_xticklabels()]
ax.set_xticklabels(labels);
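
To make the goal concrete, here is a minimal sketch of the kind of split-and-count step being asked about. It is one possible approach, not a confirmed solution: the small `supplies` frame is made-up sample data standing in for the real CSV, and `count_toppings` is a hypothetical helper name. It assumes the values in each cell are separated only by "\r\n" or "\n".

```python
import pandas as pd

# Made-up sample data standing in for the real CSV (assumption)
supplies = pd.DataFrame({
    "Toppings": ["corn\r\ncucumber\r\npickles", "corn", "pickles\r\ncorn", None]
})

def count_toppings(df, column="Toppings"):
    """Hypothetical helper: count each topping once per appearance in a cell."""
    items = (
        df[column]
        .dropna()               # ignore empty cells
        .str.split(r"\r?\n")    # one list of toppings per cell
        .explode()              # one row per individual topping
        .str.strip()            # remove stray whitespace
    )
    items = items[items != ""]  # drop blanks left by trailing newlines
    return items.value_counts()

counts = count_toppings(supplies)
print(counts)
```

The resulting `counts` could then be drawn with `sns.barplot(x=counts.index, y=counts.values)` in place of `sns.countplot`. Note that the percentages would need a decision on the denominator: `len(supplies)` gives the share of rows that request a topping, while `counts.sum()` gives the share of all requested items.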
