Python : How do I easily get sub-totals on row and column?

Question

I am new to Python and am pulling my hair out trying to get Pandas/XlsWriter to do what i want.

I started with a set of data from a CSV which i loaded into a Dataframe and manipulated. Eventually, after doing a PIVOT, I ended up with the data as seen below. The PINK highlighted represents what I am trying to get Pandas/XLSwriter to produce programatically. so the number of rows and columns will be variable.

Two questions :

1) Please advise how I can find subtotals per row and column ?

2) How do I apply number formatting to get it to be xxx,xxx,xxx.00 ?

BONUS QUESTION :

Would you recommend OpenPYXL over XLSWriter (or vice versa) and why ?

jack6e · Answer 1 · 2017-09-11T16:15:31.227

1

This is mostly a duplicate question: Pandas: sum DataFrame rows for given columns

You are looking for df.sum() per the docs: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sum.html

For your specific case it would look like:

# New column with row totals
df['Daily Total'] = df.sum(axis=1, numeric_only=True)
# New row with column totals
df = df.append(df.sum(numeric_only=True), ignore_index=True)

Formatting is just:

df[col] = df[col].map('{:,.2f}'.format)

Or to format everything that is not the date/counterparty column, in case that is not the index, you can do:

df = df.apply(lambda col: col.map('{:,.2f}'.format) if col.name != 'Counterparty' else col)

edited Sep 11 '17 at 16:15

answered Sep 11 '17 at 15:08

jack6e

1,512
10
12

thanks for the response. The addition of new column works perfectly. but the addition of new row with column totals is a problem. Its nukes my date index column. the append requires "ignore_index" but this results in my table losing the date column (replaced by generic numeric index). I thought about a workaround, but having issues with syntax. I am trying to take the current date based index and add it to the FRONT of the dataframe. Then, I will be able to run your code and the maturity date will be the first column when i output (ignoring index) – Hiro Sep 27 '17 at 10:09
The solution for that is `df = df.reset_index()`. – jack6e Sep 27 '17 at 13:24

score 0 · Answer 2 · answered Sep 11 '17 at 15:07

0

df["Daily Total"] = df["Company A"] + df["Company B"] + df["Company C"]+ df["Company D"]

answered Sep 11 '17 at 15:07

Yuval Raz

96
7

He stated specifically "so the number of rows and columns will be variable." Your answer does not address variable numbers of columns or column names. – jack6e Sep 11 '17 at 15:09

Python : How do I easily get sub-totals on row and column?

2 Answers2