
Can you please help me with some code? I need to get it working as soon as possible (within two days :/ ), but I am having trouble with it. It is pretty simple, but I cannot find my mistake.

I have a problem looping over a CSV file. I have a CSV table, and I need to go through all of its values and, for each value, extract the name of its column and the name of its row into separate lists.

In the end I need three lists of equal length, because I will then use them to make a graph.

My main goal is to graph the data. If you see any other way to graph this table, please let me know.

When I try to loop over the file, I get this message:

ValueError: too many values to unpack (expected 2)

Here is the code:

import pandas as pd
df = pd.DataFrame({'0.05 ml : 25 ml': {'7th generation': 0,  'Dawn Ultra': 0,  'Mrs Meyers': 1,  'Ultra Palmolive': 0}, '0.05 ml : 37.5 ml': {'7th generation': 0,  'Dawn Ultra': 3,  'Mrs Meyers': 0,  'Ultra Palmolive': 0}, '0.05 ml : 50 ml': {'7th generation': 0,  'Dawn Ultra': 0,  'Mrs Meyers': 0,  'Ultra Palmolive': 0}, '0.05 ml : 62.5 ml': {'7th generation': 0,  'Dawn Ultra': 0,  'Mrs Meyers': 4,  'Ultra Palmolive': 1}, '0.05 ml : 75 ml': {'7th generation': 0,  'Dawn Ultra': 1,  'Mrs Meyers': 1,  'Ultra Palmolive': 6}})
df.head()

detergent = []
concentration = []
num_colon = []
for (x, y) in df:
    loc = df.iloc[x, y]
    detergent.append(loc.index.values)
    concentration.append(loc.columns.values)
    num_colon.append(loc.values)
print(concentration)
print(detergent)
print(num_colon)

Output:

ValueError: too many values to unpack (expected 2)
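
For context, the error appears to come from the loop header rather than the loop body: iterating over a DataFrame yields its column labels (here, long strings), so `for (x, y) in df` tries to unpack one string into two names. A minimal sketch reproducing this, assuming a smaller made-up frame (`small` is a hypothetical stand-in, not the real CSV):

```python
import pandas as pd

# Hypothetical two-column frame, used only to illustrate the behaviour.
small = pd.DataFrame(
    {"0.05 ml : 25 ml": {"Dawn Ultra": 0, "Mrs Meyers": 1},
     "0.05 ml : 50 ml": {"Dawn Ultra": 3, "Mrs Meyers": 0}}
)

# Iterating over a DataFrame yields column labels, not (row, column) pairs.
labels = [label for label in small]

# Unpacking one long string into two names raises the ValueError.
try:
    for (x, y) in small:
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(labels)
print(error_message)
```

Each item produced by the loop is a single column label, which is why the unpacking fails before `df.iloc` is ever reached.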

The "Bio_sample_test.csv" looks like this: Bio_sample_test.csv

The table is pretty simple. As far as I understand, to plot it I have to make a 3D graph?

Also, in case it helps, here is the code I plan to use to build the graph:

%matplotlib notebook
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Color the points by their value; the three lists must be flat, numeric, and of equal length.
ax.scatter(detergent, concentration, num_colon, c=num_colon)
ax.set_xlabel("detergent")
ax.set_ylabel("concentration")
ax.set_zlabel("num_colon")

It would also be nice if somebody could help me make the lists flat and homogeneous. I mean that each final list should contain individual values rather than arrays.
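
One possible way to get three flat, aligned lists is to reshape the table into long format with `DataFrame.melt`. This is a minimal sketch, assuming a small stand-in frame with two of the columns shown above; the name `long_df` and the axis name `detergent` are introduced here for illustration only:

```python
import pandas as pd

# Small stand-in for the table in the question (values from two of its columns).
df = pd.DataFrame(
    {"0.05 ml : 25 ml": {"Dawn Ultra": 0, "Mrs Meyers": 1},
     "0.05 ml : 37.5 ml": {"Dawn Ultra": 3, "Mrs Meyers": 0}}
)

# Turn the row labels into a regular column, then melt to one row per cell.
long_df = (
    df.rename_axis("detergent")
      .reset_index()
      .melt(id_vars="detergent", var_name="concentration", value_name="num_colon")
)

# Each list holds plain Python values, one entry per table cell.
detergent = long_df["detergent"].tolist()
concentration = long_df["concentration"].tolist()
num_colon = long_df["num_colon"].tolist()

print(detergent)
print(concentration)
print(num_colon)
```

Because all three lists come from the same long-format frame, entry `i` of each list refers to the same cell of the original table.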

  • Can you add your data in a [reproducible](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) format? E.g. as a CSV or using `df.to_dict()`. – Nick ODell Feb 23 '23 at 18:51
  • 1
    @NickODell, did you mean to share the csv file here? If so, I added it in the question. – PythonLover Feb 23 '23 at 19:12
  • @NickODell, I made some corrections to the code, because it seems I had copied an old version. The code needs to use a for-loop instead of a specific cell. – PythonLover Feb 23 '23 at 21:03

1 Answer


I got the answer to my own question:

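import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn.preprocessing import LabelEncoder

# The values are flattened row by row below, so the column labels must cycle
# once per row (tiling) rather than repeat each label consecutively.
pre_conc = list(df.columns.values)
conc = pre_conc * len(df.index)

# Each row label is repeated once per column, matching the row-by-row flattening.
pre_detergent = list(df.index)
detergent = [item for item in pre_detergent for _ in range(len(df.columns))]

# Flatten the 2D table into a single list of values, row by row.
pre_values = df.values.tolist()
values = [val for sublist in pre_values for val in sublist]

# Encode the string labels as integers so they can serve as coordinates.
x_le = LabelEncoder()
y_le = LabelEncoder()
x_num = x_le.fit_transform(detergent)
y_num = y_le.fit_transform(conc)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x_num, y_num, values, c=y_num)

# Replace the integer ticks with the original labels.
ax.set_xticks(range(len(x_le.classes_)))
ax.set_xticklabels(x_le.classes_)

ax.set_yticks(range(len(y_le.classes_)))
ax.set_yticklabels(y_le.classes_)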
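
As an aside on the "other ways to graph" part of the question: since the table is a small grid of counts, a 2D heatmap may be easier to read than a 3D scatter. This is only a sketch under assumptions: the frame below is a small stand-in with two of the columns from the question, and the colormap and label rotation are arbitrary choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Small stand-in for the table in the question (values from two of its columns).
df = pd.DataFrame(
    {"0.05 ml : 25 ml": {"Dawn Ultra": 0, "Mrs Meyers": 1},
     "0.05 ml : 37.5 ml": {"Dawn Ultra": 3, "Mrs Meyers": 0}}
)

fig, ax = plt.subplots()

# Draw one colored cell per table entry (rows = detergents, columns = concentrations).
heat = ax.imshow(df.to_numpy(), cmap="viridis")

# Label the ticks with the original column and row names.
ax.set_xticks(range(len(df.columns)))
ax.set_xticklabels(df.columns, rotation=45, ha="right")
ax.set_yticks(range(len(df.index)))
ax.set_yticklabels(df.index)

# The colorbar maps colors back to the counts.
fig.colorbar(heat, ax=ax, label="num_colon")
fig.tight_layout()
# Call plt.show() (or use a notebook backend) to display the figure.
```

This works directly on the DataFrame, so no flattening or label encoding is needed for this kind of plot.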