0

I have a pandas DataFrame with the fields 'EventDate', 'DataField', and 'DataValue'.

'DataField' takes three values: Oxygen, HeartRate, and HeartRateVariability.

enter image description here

How can I change the above format into the following format for analysis?

enter image description here

James Z
VijayS
  • Does this answer your question? [How to pivot a dataframe?](https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe) – Henry Ecker Jun 11 '21 at 19:01

3 Answers

1

You can try pivot:

df = df.pivot(*df).fillna('')

For more information, see the pandas documentation for pivot and pivot_table.
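The `*df` in the call above works because iterating over a DataFrame yields its column labels, so the three labels are passed in order as index, columns, and values. Here is a minimal sketch that spells out the same call with keywords. The sample values are made up for illustration and are not the asker's data. As far as I know, pandas 2.0 and later accept only keyword arguments for pivot, so the explicit form is probably the safer one to rely on.

```python
import pandas as pd

# Hypothetical sample data in the long format described in the question
df = pd.DataFrame({
    'EventDate': ['2021-06-01', '2021-06-01', '2021-06-01', '2021-06-02', '2021-06-02'],
    'DataField': ['Oxygen', 'HeartRate', 'HeartRateVariability', 'Oxygen', 'HeartRate'],
    'DataValue': [98.0, 72.0, 45.0, 97.0, 75.0],
})

# Iterating over a DataFrame yields its column labels; this is what *df unpacks
column_labels = list(df)

# Explicit keyword form of df.pivot(*df): 1st label -> index, 2nd -> columns, 3rd -> values
index_col, columns_col, values_col = column_labels
wide = df.pivot(index=index_col, columns=columns_col, values=values_col)
print(wide)
```

Combinations that have no record (here, HeartRateVariability on the second date) come out as NaN, which is what the fillna('') in the answer replaces with empty strings.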

Nk03
  • The use of `*` is really interesting in your answer. Can you please tell me how using the `*` works? – Shivam Roy Jun 11 '21 at 20:09
  • 1
    If the columns are in the order `index/columns/values`, then we can use `*df` directly: it uses the 1st column as `index`, the 2nd column as `columns`, and the 3rd column as `values`. – Nk03 Jun 11 '21 at 20:32
1

Any time you want to take one attribute in your dataset and group some other attributes by it, you should think about using pandas groupby or pivot_table functionality.

I'm personally a fan of pivot tables, so here is how to do it with a pivot table:

# Pivot the data: one row per EventDate, one column per DataField value
pivot_table = df.pivot_table(
    index='EventDate',
    columns='DataField',   # Oxygen, HeartRate, HeartRateVariability become columns
    values='DataValue',
    aggfunc='mean'         # average duplicate readings for the same date and field
)

Because aggfunc is set to mean, any EventDate that has multiple records for the same field shows the mean of those records in the resulting pivot table.
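The averaging behaviour can be checked on a small example. This is only a sketch: it assumes long-format data with the EventDate, DataField, and DataValue columns from the question, and the readings are invented.

```python
import pandas as pd

# Hypothetical long-format data; 2021-06-01 has two HeartRate readings
df = pd.DataFrame({
    'EventDate': ['2021-06-01', '2021-06-01', '2021-06-01', '2021-06-02'],
    'DataField': ['HeartRate', 'HeartRate', 'Oxygen', 'HeartRate'],
    'DataValue': [70.0, 74.0, 98.0, 75.0],
})

# Duplicate (EventDate, DataField) pairs are collapsed with the mean
averaged = df.pivot_table(
    index='EventDate',
    columns='DataField',
    values='DataValue',
    aggfunc='mean',
)
print(averaged)
```

The two HeartRate readings on the first date are combined into a single cell, and the missing Oxygen reading on the second date is left as NaN.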

If creating pivot tables is something you do often, you could also check out some pandas pivot table GUIs. I'm the creator of one called Mito. Mito is an extension to JupyterLab that lets you create pivot tables (and other spreadsheet-style analyses) in a point-and-click way. Each time you make an edit in the Mito spreadsheet, it automatically generates the equivalent pandas code for you.

0

As suggested by others, you can use the pivot_table() function. For your case specifically, you can try this:

pivot_df = df.pivot_table(index='EventDate', columns='DataField', values='DataValue')
Shivam Roy
  • I get the following error when using the above function: ValueError: Index contains duplicate entries, cannot reshape – VijayS Jun 11 '21 at 19:40
  • My apologies. I had tried the code in VS Code on a sample DataFrame similar to yours, and it had worked. I have edited my answer to use `pivot_table` instead of `pivot`. I hope it works for you. – Shivam Roy Jun 11 '21 at 20:07
  • `pivot_table` allows duplicate index entries. – Nk03 Jun 11 '21 at 21:35
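
The difference discussed in these comments can be reproduced with a small sketch. The data below is invented and only assumes the column names from the question: pivot raises a ValueError when an (EventDate, DataField) pair occurs more than once, while pivot_table aggregates the duplicates (with the mean by default).

```python
import pandas as pd

# Hypothetical data with a duplicated (EventDate, DataField) pair
df = pd.DataFrame({
    'EventDate': ['2021-06-01', '2021-06-01', '2021-06-02'],
    'DataField': ['Oxygen', 'Oxygen', 'Oxygen'],
    'DataValue': [97.0, 99.0, 98.0],
})

# pivot cannot decide which of the duplicate values to keep, so it raises
try:
    df.pivot(index='EventDate', columns='DataField', values='DataValue')
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print(f'pivot failed: {err}')

# pivot_table aggregates the duplicates instead (default aggfunc is the mean)
pivot_df = df.pivot_table(index='EventDate', columns='DataField', values='DataValue')
print(pivot_df)
```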