Good day. The problem is the following - when trying to remove outliers from one of the columns in the table
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
from scipy.stats import norm
from scipy import stats
import numpy as np
df = pd.read_csv("https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DA0321EN-SkillsNetwork/LargeData/m2_survey_data.csv")
df["ConvertedComp"].plot(kind="box", figsize=(10,10))
z_scores = stats.zscore(df["ConvertedComp"])
abs_z_scores = np.abs(z_scores)
filtered_entries = (abs_z_scores < 3).all(axis=1)
new_df = df[filtered_entries]
the following error crashes.
---------------------------------------------------------------------------
AxisError Traceback (most recent call last)
<ipython-input-133-7811da442811> in <module>
4 z_scores
5 abs_z_scores = np.abs(z_scores)
----> 6 filtered_entries = (abs_z_scores < 3).all(axis=1)
7 #new_df = df[filtered_entries]
C:\ProgramData\WatsonStudioDesktop\miniconda3\envs\desktop\lib\site-packages\numpy\core\_methods.py in _all(a, axis, dtype, out, keepdims)
44
45 def _all(a, axis=None, dtype=None, out=None, keepdims=False):
---> 46 return umr_all(a, axis, dtype, out, keepdims)
47
48 def _count_reduce_items(arr, axis):
AxisError: axis 1 is out of bounds for array of dimension 1
I would be grateful for your advice, ideas are almost over