I am trying to get all the numerical columns plotted within a tight layout with a mean line in each of the subplots for the average of each column. I have tried several options and get the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

FYI: the code works without the plt.axvline call.

This is the code I have tried:

from scipy.stats import norm
all_col = data.select_dtypes(include=np.number).columns.tolist()

plt.figure(figsize=(17,75))

for i in range(len(all_col)):
    plt.subplot(18,3,i+1)
    sns.distplot(data[all_col[i]])
    plt.tight_layout()
    plt.title(all_col[i],fontsize=25)
    plt.axvline(data.mean(), color='k', linestyle='dashed', linewidth=1)#displaying the mean on the chart

plt.show()
Flavia Giammarino
Nora
Please add some sample data! I used `import seaborn as sns; data = sns.load_dataset('tips')` to generate some. – flurble Feb 14 '21 at 20:40

1 Answer

Using a sample dataset, the error comes from plt.axvline(data.mean()): data.mean() returns a Series holding the mean of every column, whereas axvline draws a single line at a single x value. Pass the mean of the column being plotted instead.
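To make the difference concrete, here is a minimal sketch using a hypothetical two-column frame `df` (not the asker's data). It shows that the frame-level mean is a Series while the column-level mean is a scalar, and that forcing a multi-element Series into a boolean context raises the same message. Presumably matplotlib hits such a boolean check when it compares x against the axis limits, but that is an assumption about its internals.

```python
import pandas as pd

# Hypothetical toy frame standing in for the asker's data
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

means = df.mean()        # Series: one mean per column
single = df["a"].mean()  # scalar: mean of one column only

print(type(means).__name__)  # Series
print(single)                # 2.0

try:
    bool(means)  # a Series with several elements has no single truth value
except ValueError as err:
    print(err)
```

So inside the loop, the value handed to axvline should be the mean of the current column only.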

I would do all this as follows:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

data = sns.load_dataset('tips')  # Sample data

num = data.select_dtypes(include=np.number)  # Get numeric columns
n = num.shape[1]  # Number of cols

fig, axes = plt.subplots(n, 1, figsize=(14/2.54, 12/2.54))  # create subplots

for ax, col in zip(axes, num):  # For each column...
    sns.distplot(num[col], ax=ax)   # Plot histogram
    ax.axvline(num[col].mean(), c='k')  # Plot mean

fig.tight_layout()  # Keep the subplots from overlapping
plt.show()


flurble