
Apologies for the somewhat subjective question, but I was wondering if anyone could summarize the relative pros and cons of using GroupBy vs hierarchical indexing for working on groups of data within a single dataframe.

My project involves constructing a dataframe of typing data. Each row represents a single keystroke (letter, touch/release time, touch/release coordinates). Rows are organized by sentence (the first level of grouping) and then by day (the second level). I need both to select sentences by an index (day/sentence) and to perform operations on an entire sentence group (such as mapping functions over these groups, as opposed to individual rows).
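To make the layout concrete, here is a minimal sketch of the kind of frame described above. The column names (`letter`, `touch_time`, `release_time`) and all values are invented for illustration, and coordinates are left out for brevity. It shows the hierarchical-indexing side: a (day, sentence) index and label-based selection.

```python
import pandas as pd

# Hypothetical keystroke data: column names and values are made up.
keystrokes = pd.DataFrame(
    {
        "day": [1, 1, 1, 1, 2, 2],
        "sentence": [1, 1, 2, 2, 1, 1],
        "letter": ["h", "i", "y", "o", "o", "k"],
        "touch_time": [0.0, 0.5, 0.0, 0.25, 0.0, 0.75],
        "release_time": [0.25, 0.75, 0.25, 0.5, 0.25, 1.0],
    }
)

# Hierarchical index: day is the outer level, sentence the inner level.
indexed = keystrokes.set_index(["day", "sentence"]).sort_index()

# Pluck one sentence out by its (day, sentence) key.
one_sentence = indexed.loc[(1, 2), :]

# Select every keystroke typed on a given day.
day_two = indexed.xs(2, level="day")
```

Sorting the index after `set_index` is not strictly required for these lookups, but it generally keeps label-based selection on a MultiIndex fast and avoids performance warnings.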

It seems that both GroupBy and hierarchical indexing have some benefits -- GroupBy seems to make it easier to perform calculations on the group level, while hierarchical indexing makes it easier to pluck specific phrases out of the dataframe for other purposes. Can anyone advise on which technique would be more appropriate? Or is some combination of the two the best way to go?

Thank you!

Edit: this question was marked as a duplicate of a question regarding working with hierarchical indexing, but it is not -- this question is concerned with hierarchical indexing vs another approach.

kronosapiens
  • I'm not completely clear on your question. If a MultiIndex is a convenient way to represent your data, why not use that? You can still use GroupBy to operate on groups with the `level` parameter. – chrisb Jun 12 '14 at 17:39
  • @chrisb, you're right. I hadn't realized that I could use `level` to group by levels on a MultiIndex. I think that's the way to go. – kronosapiens Jun 12 '14 at 18:10
  • Possible duplicate of [Splitting Dataframe with hierarchical index](https://stackoverflow.com/questions/56931768/splitting-dataframe-with-hierarchical-index) – sophros Jul 08 '19 at 10:12
  • A possible duplicate of: [Splitting dataframe into multiple dataframes](https://stackoverflow.com/questions/19790790/splitting-dataframe-into-multiple-dataframes) – sophros Jul 08 '19 at 10:12
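
The combination chrisb suggests in the comments, keeping the MultiIndex and grouping on its levels, might look roughly like this. It is only a sketch: the column names and values are assumptions made up for illustration, not taken from the actual project.

```python
import pandas as pd

# Hypothetical keystroke data indexed by (day, sentence); values are invented.
index = pd.MultiIndex.from_tuples(
    [(1, 1), (1, 1), (1, 2), (1, 2), (2, 1), (2, 1)],
    names=["day", "sentence"],
)
typing = pd.DataFrame(
    {
        "letter": ["h", "i", "y", "o", "o", "k"],
        "touch_time": [0.0, 0.5, 0.0, 0.25, 0.0, 0.75],
        "release_time": [0.25, 0.75, 0.25, 0.5, 0.25, 1.0],
    },
    index=index,
)

# Group on the index levels directly; no need to reset the index first.
by_sentence = typing.groupby(level=["day", "sentence"])

# One value per sentence: number of keystrokes and total typing duration.
keystroke_counts = by_sentence.size()
durations = by_sentence["release_time"].max() - by_sentence["touch_time"].min()

# Rebuild each sentence's text by joining the letters in its group.
text = by_sentence["letter"].agg("".join)

# Grouping on a single level works the same way, e.g. keystrokes per day.
per_day = typing.groupby(level="day").size()
```

The results are themselves indexed by (day, sentence), so a per-sentence value can be looked up with the same keys used to select rows, for example `durations.loc[(1, 2)]`.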

0 Answers