I'm trying to plot a dataframe like this:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

A = pd.DataFrame([[1, 5, 2, 8, 2], [2, 4, 4, 20, 2], [3, 3, 1, 20, 2], [4, 2, 2, 1, 0],
                  [5, 1, 4, -5, -4], [1, 5, 2, 2, -20], [2, 4, 4, 3, 0], [3, 3, 1, -1, -1],
                  [4, 2, 2, 0, 0], [5, 1, 4, 20, -2]],
                 columns=['a', 'b', 'c', 'd', 'e'],
                 index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Cumulative sum along the columns of each row, one line per row
plt.plot(np.cumsum(A.transpose()))

It looks like this:

[image: plot of the cumulative sums]

However, I would like the first point of the chart to be 0 for all lines. I tried adding another column according to this, but it didn't work. For some reason the order on the axis didn't change, and the newly created column stayed at the end of the plot.

A['s'] = 0
cols = list(A)
cols.insert(0, cols.pop(cols.index('s')))
A = A.loc[:, cols]
plt.plot(np.cumsum(A.transpose()))

hernanavella

2 Answers

Your approach is correct, and the code from the question does produce the desired plot, but only with matplotlib 2.2 or later. In earlier versions matplotlib would automatically sort the categories alphabetically before plotting, so that row s ended up last on the axis.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

A = pd.DataFrame([[1, 5, 2, 8, 2], [2, 4, 4, 20, 2], [3, 3, 1, 20, 2], [4, 2, 2, 1, 0], 
              [5, 1, 4, -5, -4], [1, 5, 2, 2, -20], [2, 4, 4, 3, 0], [3, 3, 1, -1, -1], 
              [4, 2, 2, 0, 0], [5, 1, 4, 20, -2]],
             columns=['a', 'b', 'c', 'd', 'e'],
             index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

A['s'] = 0
cols = list(A)
cols.insert(0, cols.pop(cols.index('s')))
A = A.loc[:, cols]

plt.plot(np.cumsum(A.transpose()))

plt.show()

If you cannot use matplotlib 2.2 or later, you can plot the values against numeric positions and set the tick labels afterwards.

x = np.arange(len(A.columns))
y = np.cumsum(A.transpose()).values
plt.plot(x,y)
plt.xticks(x, A.columns)
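
A variant of the same idea, as a minimal self-contained sketch: instead of adding a column to the dataframe, prepend a zero to each row's cumulative sum in NumPy and plot against numeric positions, so the result does not depend on how matplotlib orders string categories. The tick label 's' for the starting point and the variable names `y` and `labels` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Same data as in the question.
A = pd.DataFrame([[1, 5, 2, 8, 2], [2, 4, 4, 20, 2], [3, 3, 1, 20, 2], [4, 2, 2, 1, 0],
                  [5, 1, 4, -5, -4], [1, 5, 2, 2, -20], [2, 4, 4, 3, 0], [3, 3, 1, -1, -1],
                  [4, 2, 2, 0, 0], [5, 1, 4, 20, -2]],
                 columns=['a', 'b', 'c', 'd', 'e'],
                 index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Cumulative sum along each row, then a leading column of zeros
# so that every line starts at 0.
y = np.cumsum(A.to_numpy(), axis=1)
y = np.hstack([np.zeros((y.shape[0], 1)), y])

# Tick labels: the assumed start label 's' followed by the original columns.
labels = ['s'] + list(A.columns)
x = np.arange(len(labels))

fig, ax = plt.subplots()
ax.plot(x, y.T)            # y.T has one column per dataframe row, i.e. one line each
ax.set_xticks(x)
ax.set_xticklabels(labels)
# Call plt.show() to display the figure.
```

Because the x values are plain integers, the order on the axis is fixed by `x` and the labels are purely cosmetic.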
ImportanceOfBeingErnest

You can use insert to add a new column with all 0's.

A.insert(0, '0', [0]*10)
  • The first 0 is the position of the new column, in this case the beginning of the dataframe.
  • '0' is the name of the column. Because .plot sorts the columns, you can either use a name that sorts before your other columns (such as '0') or reorder the columns in the plot yourself.
  • [0]*10 are the values of the new column.
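
As a hedged sketch of how this could look end to end: `DataFrame.insert` also accepts a scalar, which is broadcast to every row, so the length does not have to be hard-coded. The example keeps the column name 's' from the question and plots through pandas' own `DataFrame.plot`, which, as far as I know, places a non-numeric index by position rather than by sorted label. The copy `B` is only there to leave `A` untouched.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same data as in the question.
A = pd.DataFrame([[1, 5, 2, 8, 2], [2, 4, 4, 20, 2], [3, 3, 1, 20, 2], [4, 2, 2, 1, 0],
                  [5, 1, 4, -5, -4], [1, 5, 2, 2, -20], [2, 4, 4, 3, 0], [3, 3, 1, -1, -1],
                  [4, 2, 2, 0, 0], [5, 1, 4, 20, -2]],
                 columns=['a', 'b', 'c', 'd', 'e'],
                 index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

B = A.copy()
B.insert(0, 's', 0)          # scalar 0 is broadcast to all rows

cum = B.cumsum(axis=1)       # cumulative sum across the columns of each row

# Transpose so that each original row becomes one line in the plot.
ax = cum.T.plot(legend=False)
# Call plt.show() to display the figure.
```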

NK_
  • I wonder why naming the column `"0"` is a solution if it is supposed to be named `"s"`? – ImportanceOfBeingErnest Mar 14 '18 at 13:50
  • @ImportanceOfBeingErnest I skipped this part of his question unintentionally, to be honest. However, I am not sure if he really needs the column to be named 's' or if it is just a dummy name. There are a few possibilities for sorting the columns of a plot. You posted one of them. He can also set the order manually depending on his data. Nevertheless, you're right. – NK_ Mar 14 '18 at 14:26