0

I have a dataframe that looks like this:

CURRENT DATAFRAME

Branch ID | Dispense Type | Product ID |   Date   | Quantity
-------------------------------------------------------------
    A     |  Shipped Out  |    ABC3    |  5/6/19  |     10
    A     |  Received     |    ABC3    |  5/6/19  |      8
    A     |  Transferred  |    ABC3    |  5/6/19  |      0

    A     |  Shipped Out  |    ABC3    |  5/7/19  |      5
    A     |  Received     |    ABC3    |  5/7/19  |      7
    A     |  Transferred  |    ABC3    |  5/7/19  |     20

    B     |  Shipped Out  |    ABC3    |  5/6/19  |     40
    B     |  Received     |    ABC3    |  5/6/19  |      0
    B     |  Transferred  |    ABC3    |  5/6/19  |      1

    B     |  Shipped Out  |    ABC3    |  5/7/19  |      6
    B     |  Received     |    ABC3    |  5/7/19  |      2
    B     |  Transferred  |    ABC3    |  5/7/19  |      3

    A     |  Shipped Out  |    QRE3    |  5/6/19  |      7
    A     |  Transferred  |    QRE3    |  5/6/19  |     10
    A     |  Received     |    QRE3    |  5/6/19  |      5

(The spaces between the rows are included here just so that the dataframe is clear and makes sense, I don't need spaces in the actual dataframe generated in my code.)

Ideally, the Dataframe I would like as an end-result would use the column "Dispense Type" to make additional columns, and reduce the number of rows as a result.

DESIRED RESULT DATAFRAME

Branch ID |   Date   | Product ID | Shipped Out | Received | Transferred
------------------------------------------------------------------------
    A     |  5/6/19  |    ABC3    |     10      |    8     |     0
    A     |  5/7/19  |    ABC3    |      5      |    7     |    20

    B     |  5/6/19  |    ABC3    |     40      |    0     |     1
    B     |  5/7/19  |    ABC3    |      6      |    2     |     3

    A     |  5/6/19  |    QRE3    |      7      |    10    |     5

One Branch can ship two different products, which explains why there are two different A blocks (with different Product IDs) in the original and result dataframes.

I tried to use Pivot table, but what happens is that when one column, 'Shipped Out' has a value, the other two columns will have the value NaN. So there's only one column that will have a value while the other two will be filled with Nulls.

Note: I understand that this is similar to the "how to pivot a dataframe" question, but after reading it, and trying certain methods, I have only run into errors and am not sure what tactic would work.

Nik Tur
  • 53
  • 1
  • 9
  • Possible duplicate of [How to pivot a dataframe](https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe) – Quang Hoang Jul 03 '19 at 22:17
  • You are likely to get more responses if you format your input data in either code that someone can directly create a DataFrame out of, or paste the output from a juputer notebook session that can easily interpreted using pandas.read_clipboard. Also, why are your product ids different in the two Dataframes? – Karthik V Jul 03 '19 at 22:36
  • Try `unstack`: `df.set_index(list(df.columns[:-1])).Quantity.unstack('Dispense Type')` – Quang Hoang Jul 03 '19 at 22:40

1 Answers1

1

Since this is not a trivial pivotting, I will answer this question and not mark it as duplicate.

  1. First we pivot your data and sort_index by level=2 (which is Product ID).
  2. We flatten your multiindex columns with columns.get_level_values
  3. Reset index to get them back as columns and remove the column axis name with rename_axis
# Step1
piv = df.pivot_table(index=['Branch ID', 'Date', 'Product ID'], columns='Dispense Type').sort_index(level=2)

#Step2
piv.columns = piv.columns.get_level_values(1)

#Step3
piv = piv.reset_index().rename_axis(None, axis=1)
  Branch ID    Date Product ID  Received  Shipped Out  Transferred
0         A  5/6/19       ABC3         8           10            0
1         A  5/7/19       ABC3         7            5           20
2         B  5/6/19       ABC3         0           40            1
3         B  5/7/19       ABC3         2            6            3
4         A  5/6/19       QRE3         5            7           10
Erfan
  • 40,971
  • 8
  • 66
  • 78