I have read pandas: how to run a pivot with a multi-index? but it could not solve my problem.
Given the data frame below:
import pandas as pd
df = pd.DataFrame({
"date": ["20180920"] * 6,
"id": ["A123456789"] * 6,
"test": ["a", "b", "c", "d", "e", "f"],
"result": [70, 90, 110, "(-)", "(+)", 0.3],
"ref": ["< 90", "70 - 100", "100 - 120", "(-)", "(-)", "< 1"]
})
I'd like to spread the test
column, use the values in result
, and ignore ref
. In other words, the desired output is like:
date id a b c d e f
0 20180920 A123456789 70 90 110 (-) (+) 0.3
So I tried df.pivot(index=["date", "id"], columns="test", values="result")
, but it failed with ValueError: Length of passed values is 6, index implies 2. I think it is related to "If an array is passed, it must be the same length as the data." in pivot_table
documentation, but I just don't understand what it means. Can someone elaborate that please?
BTW, I finally get my desired output by df.drop(columns="ref").set_index(["date", "id", "test"]).unstack(level=2)
. Is it the only correct way?