I have a dataframe with time series of categorical values. For a toy example, let's say we have an o series and a v series. The times may overlap but are not guaranteed to. How can I select, for each series, the value of measure at that series' maximum measurement time?

Here's what the data "looks" like:

           9  |                
              |                               
              |                   vv             
           6  |           vvvvvvvv  vv        
              |         vv            vv      
 measure      |       vv    ooo         v        
           3  |      v     o   oo  oo    vv   
              |   vvv   ooo      oo        vvv  
              | vv                   
           0  +------------------------------------
                            time

Here's a dataframe that represents the data above (obviously incomplete):

series   time   measure
 v       25    1.0
 v       26    1.1
 o       32    2.2
 o       33    2.0
 v       28    1.9
...

I'm honestly completely lost here. I've read the docs and they aren't clear on situations like this. Documented aggregation functions seem to act on a series, not on a "row".

Using the data graphed above, I should get:

  series  max_measurement
   v       2.0
   o       3.0

Edit: this is NOT a duplicate of the linked question. That is simply a multiple aggregate issue. This is an aggregate and selection issue.

user1276560
  • Did you just use ASCII to plot a graph? :D – cs95 Jan 14 '18 at 19:12
  • Can you supply the data instead of the graph and expected output? – Scott Boston Jan 14 '18 at 19:13
  • So, you want `df.groupby('series').measure.max()` – cs95 Jan 14 '18 at 19:13
  • Scott: the data shown is just used to illustrate the problem simply. It doesn't exist. I drew the plot manually and just made up plausible numbers for the dataframe. That said, the structure is intentionally very simple so that the exact issue can be fixed with the minimal possible amount of code. The real problem is a log extract. – user1276560 Jan 14 '18 at 20:30
  • @COLDSPEED the solution you listed just finds the maximum of the measure which in this case will be near 7 for 'v' and near 5 for 'o'. – user1276560 Jan 14 '18 at 20:43
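
For what it's worth, the distinction raised in the comments (max of measure vs. measure at max time) can be sketched as follows. This is only a minimal sketch, assuming the columns really are named series, time and measure, and reusing the five made-up rows from the question as stand-in data. The idea is to find the row label of the largest time in each group with `idxmax`, then select those whole rows with `.loc`, rather than aggregating the measure column directly.

```python
import pandas as pd

# Stand-in data: the five made-up rows from the question, not real measurements.
df = pd.DataFrame({
    "series": ["v", "v", "o", "o", "v"],
    "time": [25, 26, 32, 33, 28],
    "measure": [1.0, 1.1, 2.2, 2.0, 1.9],
})

# For each series, get the index label of the row with the largest time.
latest_idx = df.groupby("series")["time"].idxmax()

# Select those full rows, then keep only the columns of interest.
result = df.loc[latest_idx, ["series", "measure"]].reset_index(drop=True)

print(result)
```

With these stand-in rows, o is picked at time 33 and v at time 28. If a series has several rows tied for the maximum time, `idxmax` returns the first of them. An alternative that should give the same rows is `df.sort_values("time").groupby("series").tail(1)`.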
