There is one existing answer to this question here but it is wrong.

In the example dataframe from the previous question, the USA has the highest number of Python users (10,110), yet in the graph it looks as though France has the highest instead.

Can someone help me fix the solution's code?

Data Frame

Resulting Graph (incorrect)

EXAMPLE DATAFRAME (eg):

Language      C   C++   Java   Python   Perl
Country
USA        3222   343   2112    10110     89
France     5432   323   1019      678    789
Japan      7878   467    767     8788     40

INCORRECT CODE:

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# thickness of the bars
dx, dy = .8, .8

# prepare 3d axes
fig = plt.figure(figsize=(10,6))
ax = Axes3D(fig)

# set up positions for the bars 
xpos=np.arange(eg.shape[0])
ypos=np.arange(eg.shape[1])

# set the ticks in the middle of the bars
ax.set_xticks(xpos + dx/2)
ax.set_yticks(ypos + dy/2)

# create meshgrid 
# print xpos before and after this block if not clear
xpos, ypos = np.meshgrid(xpos, ypos)
xpos = xpos.flatten()
ypos = ypos.flatten()

# the bars start from 0 altitude
zpos=np.zeros(eg.shape).flatten()

# the bars' heights
dz = eg.values.ravel()

# plot 
ax.bar3d(xpos,ypos,zpos,dx,dy,dz)

# put the column / index labels
ax.w_yaxis.set_ticklabels(eg.columns)
ax.w_xaxis.set_ticklabels(eg.index)

# name the axes
ax.set_xlabel('Country')
ax.set_ylabel('Language')
ax.set_zlabel('Count')

plt.show()
SamPom100
  • Does this answer your question? [Plotting Pandas Crosstab Dataframe into 3D bar chart](https://stackoverflow.com/questions/56336066/plotting-pandas-crosstab-dataframe-into-3d-bar-chart) – NotAName Jun 03 '20 at 08:40

1 Answer

To solve it, just change the ravel part of the code:

# the bars' heights
dz = eg.values.ravel(order='F')

The order='F' argument reads the data in the order your plot needs:

‘F’ means to index the elements in column-major, Fortran-style order, with the first index changing fastest, and the last index changing slowest.
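To see the difference between the two orders, here is a small sketch. It assumes the numbers from the example table typed into a plain NumPy array called values (a stand-in for eg.values, not a name from your code):

```python
import numpy as np

# Values from the example DataFrame: rows are countries, columns are languages.
values = np.array([
    [3222, 343, 2112, 10110,  89],   # USA
    [5432, 323, 1019,   678, 789],   # France
    [7878, 467,  767,  8788,  40],   # Japan
])

row_major = values.ravel()           # default order='C': walks along each row
col_major = values.ravel(order='F')  # order='F': walks down each column

print(row_major[:4])  # the first four entries of the USA row
print(col_major[:4])  # the C column (USA, France, Japan), then USA's C++ value
```

With the default order the USA row comes out first; with order='F' the whole C column comes out first.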

The code you provided does not work as you expected because the xpos and ypos position arrays are not in the same order as the dz array obtained via eg.values.ravel():

eg.values.ravel()
>> array([ 3222,   343,  2112, 10110,    89,  5432,   323,  1019,   678,
         789,  7878,   467,   767,  8788,    40], dtype=int64)

This array (the 'heights' of the chart) concatenates the rows of eg. In other words, dz takes the entries of eg in the following order:

(0,0), (0,1), (0,2), (0,3), (0,4), (1,0)...

xpos and ypos, however, list the positions column by column:

list(zip(xpos, ypos))
>>[(0, 0),(1, 0),(2, 0),(0, 1),(1, 1),(2, 1),(0, 2),...]

So your values get incorrectly assigned. For instance, (1,0) - that is, France, C - received the value from (0,1) - USA, C++. That's why the values on the chart are messed up.
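If you want to check this yourself, a rough sketch along these lines compares each bar's height with the table entry at its (x, y) position. The names values, countries, languages and mismatches are made up for the illustration; values stands in for eg.values:

```python
import numpy as np

# Stand-ins for eg.values, eg.index and eg.columns from the example table.
values = np.array([
    [3222, 343, 2112, 10110,  89],
    [5432, 323, 1019,   678, 789],
    [7878, 467,  767,  8788,  40],
])
countries = ['USA', 'France', 'Japan']
languages = ['C', 'C++', 'Java', 'Python', 'Perl']

# Same position arrays as in the question's code.
xpos, ypos = np.meshgrid(np.arange(values.shape[0]), np.arange(values.shape[1]))
xpos = xpos.flatten()
ypos = ypos.flatten()

def mismatches(dz):
    """List the bars whose height differs from the table value at (x, y)."""
    return [(countries[x], languages[y])
            for x, y, height in zip(xpos, ypos, dz)
            if values[x, y] != height]

wrong = mismatches(values.ravel())           # original code
fixed = mismatches(values.ravel(order='F'))  # with the fix

print(wrong)  # bars that receive another cell's value
print(fixed)  # empty list when every bar matches its cell
```

The second list should come back empty once order='F' is used, because dz then follows the same column-by-column order as xpos and ypos.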

Hope it helps!

t.novaes