Could anyone help me?

I need to draw a plot from a dataframe, but I have no idea how to do it. My ideal plot looks like this (ideal image): each x-axis label has multiple values, and the points must not cover each other.

The code below creates a random dataframe that you can try it with. I'd really appreciate any help!

import pandas as pd
import numpy as np

random_data = np.random.randint(10,25,size=(5,3))
df = pd.DataFrame(random_data, columns=['Column_1','Column_2','Column_3'])
print(df)

My actual data has columns a to k, and each of them has 8 values (some of them are empty).

Megan

1 Answer

With the toy dataframe you provided, here is one way to do it:

# Prepare data for plotting
new_df = pd.concat(
    [
        pd.DataFrame(
            {
                "x": [i + j * 10 - 1 for i in range(1, len(df[col]) + 1)],
                "value": df[col],
                "label": col,
            }
        )
        for j, col in enumerate(df.columns)
    ]
).reset_index(drop=True)
print(new_df)
# Output
     x  value     label
0    0     14  Column_1
1    1     22  Column_1
2    2     20  Column_1
3    3     11  Column_1
4    4     21  Column_1
5   10     18  Column_2
6   11     17  Column_2
7   12     21  Column_2
8   13     18  Column_2
9   14     15  Column_2
10  20     19  Column_3
11  21     18  Column_3
12  22     24  Column_3
13  23     17  Column_3
14  24     14  Column_3

Then, you can plot like this:

from matplotlib import pyplot as plt

fig, ax = plt.subplots(nrows=1, ncols=1)

# Remove borders
ax.spines["right"].set_visible(False)
ax.spines["top"].set_visible(False)

# Position labels on x-axis 
ax.set_xticks(
    ticks=[
        new_df.loc[new_df["label"] == label, "x"].median()
        for label in new_df["label"].unique()
    ]
)
ax.set_xticklabels(new_df["label"].unique(), fontsize=12)

# Plot values
for label in new_df["label"].unique():
    ax.scatter(
        new_df.loc[new_df["label"] == label, "x"],
        new_df.loc[new_df["label"] == label, "value"],
    )

plt.show()

Which outputs a scatter plot with one group of points per column label.
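
For data shaped like the real case in the question (columns a to k, some cells empty), the same reshaping might also be done with `DataFrame.melt`, dropping the empty cells before plotting. This is only a sketch: the small frame below is made up for illustration, `offset_step` is a hypothetical name, and it assumes the empty cells are read in as NaN.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the real data: three columns, some cells empty (NaN)
df = pd.DataFrame(
    {
        "a": [14.0, 22.0, np.nan, 11.0],
        "b": [18.0, np.nan, np.nan, 15.0],
        "c": [19.0, 18.0, 24.0, 17.0],
    }
)

offset_step = 10  # hypothetical spacing between column groups on the x-axis

# Wide to long: one row per (row, label, value) triple
long_df = (
    df.rename_axis("row")
    .reset_index()
    .melt(id_vars="row", var_name="label", value_name="value")
)

# Shift each column's points by its position in df.columns so groups do not overlap
col_pos = {col: j for j, col in enumerate(df.columns)}
long_df["x"] = long_df["row"] + long_df["label"].map(col_pos) * offset_step

# Drop the empty cells so they are not plotted
long_df = long_df.dropna(subset=["value"]).reset_index(drop=True)
print(long_df)
```

The resulting `long_df` has the same `x`, `value`, and `label` columns as `new_df` above, so the plotting code should work on it unchanged.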

Laurent
  • I appreciate it!! Sorry for the late reply~ – Megan Jun 06 '22 at 16:18
  • Hi Laurent, may I ask one more question? I want to change it to a histogram with error bars. I know I can just change `scatter` to `bar`, but I have no idea how to add the error bars. – Megan Jun 06 '22 at 17:02
  • Hi, I'm curious how to adjust the spacing between the different column names? – Megan Jun 07 '22 at 10:07
  • Hi @Megan, on your first comment, I suggest you post another question, as per SO guidelines; it will be easier to help that way. As for adjusting the width between tick names, you can tweak the `ticks` parameter of the `ax.set_xticks` method. For additional guidance, check https://matplotlib.org/stable/api/_as_gen/matplotlib.axis.Axis.set_ticks.html. Cheers – Laurent Jun 07 '22 at 17:39
  • Thanks Laurent! Yeah, I posted that question yesterday. LOL – Megan Jun 07 '22 at 23:41
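
On the spacing question in the comments: in the answer's code, the distance between column groups comes from the `j * 10` term in `x`, and the tick positions then follow from each group's median. Below is a minimal sketch of one way to adjust it, assuming that factor is pulled out into a hypothetical `gap` variable (not part of the original answer). The data is random and the output file name is made up.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.integers(10, 25, size=(5, 3)),
    columns=["Column_1", "Column_2", "Column_3"],
)

gap = 20  # hypothetical: larger values push the column groups further apart

fig, ax = plt.subplots()
tick_positions = []
for j, col in enumerate(df.columns):
    # x positions for this column: 0..n-1, shifted by j * gap
    x = np.arange(len(df[col])) + j * gap
    ax.scatter(x, df[col])
    # Center the column label under its group of points
    tick_positions.append(float(np.median(x)))

ax.set_xticks(tick_positions)
ax.set_xticklabels(df.columns, fontsize=12)
fig.savefig("spacing_demo.png")
```

With `gap = 20` and five rows per column, the labels land at x = 2, 22, and 42. A smaller `gap` pulls the groups closer together.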