I recently started using the newest version of Anaconda (2018.12 with py37_0), and I run my code in Jupyter notebooks. Before installing this version I was using Python 3.2.2, and the code below ran without problems. The code uses Seaborn to produce a correlation plot from the variables in a pandas DataFrame. Now, however, I get "IndexError: tuple index out of range" and I don't know how to fix it.
A similar problem has been reported previously:
Neither of these solutions seems to work for me.
Lastly, the code to plot the correlation for the variables in my data frame comes from:
The data to create a pandas dataframe was taken from a csv file of a Kaggle competition:
https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/
I used the "winequality-white.csv" file.
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_csv('winequality-white.csv')

def corrdot(*args, **kwargs):
    # Pearson correlation between the two columns passed in by PairGrid
    corr_r = args[0].corr(args[1], 'pearson')
    corr_text = f"{corr_r:2.2f}".replace("0.", ".")
    ax = plt.gca()
    ax.set_axis_off()
    # Dot size and colour both encode the correlation coefficient
    marker_size = abs(corr_r) * 10000
    ax.scatter(.5, .5, marker_size, corr_r, alpha=0.6, cmap="coolwarm",
               vmin=-1, vmax=1, transform=ax.transAxes)
    font_size = abs(corr_r) * 40 + 5
    ax.annotate(corr_text, [.5, .5,], xycoords="axes fraction",
                ha='center', va='center', fontsize=font_size)

sns.set(style='white', font_scale=1.6)
g = sns.PairGrid(df, aspect=1.4, diag_sharey=False)
g.map_lower(sns.regplot, lowess=True, ci=False, line_kws={'color': 'black'})
g.map_diag(sns.distplot, kde_kws={'color': 'black'})
g.map_upper(corrdot)
Expected results can be found as the answer for:
Actual results:
C:\Users\Public\anaconda3\lib\site-packages\scipy\stats\stats.py:1713: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use arr[tuple(seq)] instead of arr[seq]. In the future this will be interpreted as an array index, arr[np.array(seq)], which will result either in an error or a different result.
return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval
Traceback:
IndexError Traceback (most recent call last)
<ipython-input-8-68b4a938aa72> in <module>
17 g.map_lower(sns.regplot, lowess=True, ci=False, line_kws={'color': 'black'})
18 g.map_diag(sns.distplot, kde_kws={'color': 'black'})
---> 19 g.map_upper(corrdot)
C:\Users\Public\anaconda3\lib\site-packages\seaborn\axisgrid.py in map_upper(self, func, **kwargs)
1488 color = self.palette[k] if kw_color is None else kw_color
1489 func(data_k[x_var], data_k[y_var], label=label_k,
-> 1490 color=color, **kwargs)
1491
1492 self._clean_axis(ax)
<ipython-input-8-68b4a938aa72> in corrdot(*args, **kwargs)
7 marker_size = abs(corr_r) * 10000
8 ax.scatter(.5, .5, marker_size, corr_r, alpha=0.6, cmap="coolwarm",
----> 9 vmin=-1, vmax=1, transform=ax.transAxes)
10 font_size = abs(corr_r) * 40 + 5
11 ax.annotate(corr_text, [.5, .5,], xycoords="axes fraction",
C:\Users\Public\anaconda3\lib\site-packages\matplotlib\__init__.py in inner(ax, data, *args, **kwargs)
1808 "the Matplotlib list!)" % (label_namer, func.__name__),
1809 RuntimeWarning, stacklevel=2)
-> 1810 return func(ax, *args, **kwargs)
1811
1812 inner.__doc__ = _add_data_doc(inner.__doc__,
C:\Users\Public\anaconda3\lib\site-packages\matplotlib\axes\_axes.py in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, **kwargs)
4209 try: # First, does 'c' look suitable for value-mapping?
4210 c_array = np.asanyarray(c, dtype=float)
-> 4211 n_elem = c_array.shape[0]
4212 if c_array.shape in xy_shape:
4213 c = np.ma.ravel(c_array)
IndexError: tuple index out of range
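
Reading the last frame of the traceback, the failure seems to happen where matplotlib converts the colour argument `c` to an array and then reads `c_array.shape[0]`. The snippet below is only a minimal sketch of that step using NumPy on its own. It assumes the conversion is exactly the `np.asanyarray(c, dtype=float)` call shown in the traceback, and the value of `corr_r` is a made-up stand-in for a real correlation.

```python
import numpy as np

# Hypothetical value standing in for args[0].corr(args[1], 'pearson')
corr_r = 0.42

# A bare float converts to a 0-dimensional array, so its shape is an empty tuple.
c_scalar = np.asanyarray(corr_r, dtype=float)
print(c_scalar.shape)

try:
    # Same indexing as in the last traceback frame
    n_elem = c_scalar.shape[0]
except IndexError as exc:
    print("IndexError:", exc)

# Wrapping the value in a sequence gives a 1-D array with one element,
# so indexing shape[0] works.
c_wrapped = np.atleast_1d(corr_r).astype(float)
print(c_wrapped.shape, c_wrapped.shape[0])
```

If this reading is right, passing one-element sequences to `ax.scatter` in `corrdot` (for example `[.5], [.5], marker_size, [corr_r]`) might avoid the error, but I have not confirmed that this works with this matplotlib version.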