I have a dataset with three columns: Label, Year, and Total. Total is the count for each (Label, Year) group.

+--------------------+-------+-------+
|               Label|   Year|  Total|
+--------------------+-------+-------+
|                 FTP|02/2018| 193360|
|              BBBB  |01/1970|     14|
|              BBBB  |02/2018|4567511|
|                SSSS|02/2018| 187589|
|                Dddd|02/2018|  41508|

I want to plot the data as in the image below.

[image]

How can I achieve this with a stacked area chart in pandas? The x-axis should show both the Label and Year values, and the y-axis should plot Total for each of those groups.

Here is the code I tried, with seaborn as well as with plain pandas plotting:

dF.plot(figsize=(20,8), x =['Label','Year'], y ='Total', kind = 'area', stacked = True)

ax = df.plot(x="label", y="Total", legend=False, figsize=(10,8))
ax2 = ax.twinx()
df.plot(x="label", y="Dst_Port", ax=ax2, legend=False, color="r", figsize=(10,8))
ax.figure.legend()
plt.show()

My current graph can only use a single column for the x-axis.

Kavya Shree

1 Answer

With help from this post for plotting the category grid lines:

  • Group the data by "Label" and "Year" and sum "Total".
  • Plot as follows:
import itertools

import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(data=[["FTP","02/2018",1000],["BBBB","02/2018",1500],["SSSS","02/2018",1400],["Dddd","02/2018",3000],["FTP","02/2017",1800],["BBBB","02/2017",1700],["SSSS","02/2017",1600],["Dddd","02/2017",1500]], columns=["Label","Year","Total"])

df = df.groupby(["Label", "Year"]) \
       .agg(Total=("Total","sum")) 

def add_line(ax, xpos, ypos):
    line = plt.Line2D([xpos, xpos], [ypos + .1, ypos],
                      transform=ax.transAxes, color='gray')
    line.set_clip_on(False)
    ax.add_line(line)

def label_len(my_index,level):
    labels = my_index.get_level_values(level)
    return [(k, sum(1 for i in g)) for k,g in itertools.groupby(labels)]

def label_group_bar_table(ax, df):
    ypos = -.1
    scale = 1./df.index.size
    for level in range(df.index.nlevels)[::-1]:
        pos = 0
        for label, rpos in label_len(df.index,level):
            lxpos = (pos + .5 * rpos)*scale
            ax.text(lxpos, ypos, label, ha='center', transform=ax.transAxes)
            add_line(ax, pos*scale, ypos)
            pos += rpos
        add_line(ax, pos*scale , ypos)
        ypos -= .1

ax = df.plot.area(figsize=(20,5))
ax.set_xticklabels("")
ax.set_xlabel("")
label_group_bar_table(ax, df)
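
If the goal is one stacked band per label rather than a single "Total" area, one possible alternative is to pivot the data so that each label becomes its own column before plotting. This is only a sketch and is not part of the code above. The names `raw`, `wide`, and `plot_stacked` are made up for illustration, and the rows are the same sample values used in this answer.

```python
import pandas as pd

# Sample rows copied from the answer's example data (illustrative values only).
rows = [
    ["FTP", "02/2018", 1000], ["BBBB", "02/2018", 1500],
    ["SSSS", "02/2018", 1400], ["Dddd", "02/2018", 3000],
    ["FTP", "02/2017", 1800], ["BBBB", "02/2017", 1700],
    ["SSSS", "02/2017", 1600], ["Dddd", "02/2017", 1500],
]
raw = pd.DataFrame(rows, columns=["Label", "Year", "Total"])

# One row per Year, one column per Label, summed Total in each cell.
# Missing (Year, Label) combinations are filled with 0 so the areas stay aligned.
wide = raw.pivot_table(index="Year", columns="Label", values="Total",
                       aggfunc="sum", fill_value=0)


def plot_stacked(wide_df):
    """Draw one stacked area per column of wide_df (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported here so the pivot can be used without plotting

    ax = wide_df.plot.area(stacked=True, figsize=(10, 5))
    ax.set_ylabel("Total")
    plt.show()
    return ax
```

Calling `plot_stacked(wide)` draws the chart. Note that "Year" is a string such as "02/2018", so the rows are sorted as text; if chronological order matters, converting the column with `pd.to_datetime(raw["Year"], format="%m/%Y")` before pivoting is one option.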

Azhar Khan
  • I get this error: "Invalid argument, not a string or column: .. at 0x7fc0b11a5740> of type . For column literals, use 'lit', 'array', 'struct' or 'create_map' function." It is raised on the line `return [(k, sum(1 for i in g)) for k,g in itertools.groupby(labels)]`. I also get an error that itertools is not defined. – Kavya Shree Dec 12 '22 at 12:16
  • Please share a sample dataframe creation code for your use case. – Azhar Khan Dec 12 '22 at 12:52
  • I took the data from the table using `spark.sql('select Label,Total,Year from sqlview')`, converted it with `pandasDF=sqlDFF.toPandas()`, and then ran `df = pandasDF.groupby(["Label", "Year"]).agg(Total=("Total","sum"))`. I am new to Python, so please bear with me if my questions are bad. – Kavya Shree Dec 12 '22 at 13:13
  • This is my Spark DataFrame after retrieving it from SQL: `DataFrame[Label: string, Year: string, Total: bigint]`. After converting to pandas it looks like this: Label Year Total 0 AAA 02/2018 193360 1 BBB 01/1970 14 2 BBB 02/2018 4567511 – Kavya Shree Dec 12 '22 at 13:15
  • Are you plotting on a PySpark DataFrame? You should plot on a pandas DataFrame. – Azhar Khan Dec 12 '22 at 13:17
  • Yes, I am plotting using pandas only. After retrieving the data, I convert the Spark DataFrame to pandas. – Kavya Shree Dec 12 '22 at 13:18
  • Your error comes from a Spark DataFrame. What is the output of `pandasDF.dtypes`? – Azhar Khan Dec 12 '22 at 13:19
  • Label object Year object Total int64 dtype: object – Kavya Shree Dec 12 '22 at 13:22
  • Check your imports and restart the Python kernel. It looks like you have a different `sum` function imported. – Azhar Khan Dec 12 '22 at 13:25
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/250347/discussion-between-kavya-shree-and-azhar-khan). – Kavya Shree Dec 12 '22 at 13:30
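
The comments above point to two separate problems: `itertools` was never imported, and the name `sum` appears to have been rebound to something other than Python's built-in. One common cause of the second problem, which is an assumption here because the asker's imports are not shown, is `from pyspark.sql.functions import *`, which replaces `sum` with Spark's column aggregate. A minimal sketch of a run-length helper that does not depend on the name `sum` at all (the function names are hypothetical):

```python
import builtins
import itertools


def run_lengths(labels):
    """Return (label, count) pairs for consecutive runs of equal labels."""
    # len(list(group)) never looks up the name `sum`, so a shadowed `sum`
    # (for example from a wildcard import) cannot affect the result.
    return [(key, len(list(group))) for key, group in itertools.groupby(labels)]


def run_lengths_builtin_sum(labels):
    """Same result, but explicitly calling Python's built-in sum."""
    return [(key, builtins.sum(1 for _ in group))
            for key, group in itertools.groupby(labels)]
```

Inside the answer's `label_len`, the equivalent call would be `run_lengths(my_index.get_level_values(level))`. Restarting the kernel and importing only the needed names, instead of using a wildcard import, avoids the shadowing in the first place.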