
I have a dataframe with two columns, Distance(m) and Height(m). I want to calculate the max, min and average height values over each 0.0439 m interval of distance.

Distance is a continuous series from 0 to 0.81 m in steps of 0.00222 m, with 403 values in total.

The aim is to extract 18 sets of values (max, min, average) of height from 18 intervals, each covering 0.0439 m of distance (over the continuous distance series between 0 and 0.81 m).

Then, create a dataframe listing each distance interval and its respective max, min and average height.

This is an example:

| Interval distance | Height_max(m) | Height_min(m) | Height_average(m) |
|---|---|---|---|
| 1 | 0.35 | 0.15 | 0.25 |
| 2 | 0.55 | 0.22 | 0.35 |
| 3 | 0.25 | 0.10 | 0.15 |

I have only 2 columns in my dataframe:

Distance(m) = [0, 0.0022, 0.0044, .... 0.81 ]
Height(m) = [ 0, 0.1, 0.5, 0.4, 0.9, .... 0.1]

Does anyone have any suggestions that can help me?

Thanks!

help-ukraine-now
  • Hi. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – jezrael Jul 09 '19 at 11:15
  • Thanks! I have this code: Distance(m) = [0, 0.0022, 0.0044, .... 0.81 ] Height(m) = [ 0, 0.1, 0.5, 0.4, 0.9, .... 0.1] – Master03 Skywalker Jul 09 '19 at 12:09
  • Added an answer. I'm not 100% sure I understand, so if there is a problem, let me know, or better, add some sample data with the expected output to the question. – jezrael Jul 09 '19 at 12:17

1 Answer


I believe you need `pd.cut` to bin the columns by intervals, and then aggregate with `GroupBy.agg` and a list of aggregation functions:

d = pd.cut(df['Distance'], [0, 0.0022, 0.0044, .... 0.81 ])
h = pd.cut(df['Height'],  [0, 0.1, 0.5, 0.4, 0.9, .... 0.1])

df.groupby([d, h])['Height'].agg(['min','max','mean'])
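
As a rough sketch of the case where only the distance column is binned: the example below assumes 18 intervals, each 0.0439 m wide and starting at 0 (the edges mentioned in the comments), and uses made-up data in place of the real dataframe. The column names `Distance` and `Height`, the random heights and the output column names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (assumption): 403 evenly spaced
# distances from 0 to 0.81 m with random heights.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Distance": np.linspace(0, 0.81, 403),
    "Height": rng.uniform(0.0, 1.0, 403),
})

# 19 edges -> 18 intervals of 0.0439 m: 0, 0.0439, ..., 0.7902.
width = 0.0439
n_intervals = 18
edges = width * np.arange(n_intervals + 1)

# Label each row with its interval number (1..18). include_lowest=True keeps
# the first row (Distance == 0) inside interval 1.
interval = pd.cut(
    df["Distance"],
    bins=edges,
    labels=list(range(1, n_intervals + 1)),
    include_lowest=True,
)

# One row per interval with the max, min and mean of Height.
summary = (
    df.groupby(interval, observed=False)["Height"]
      .agg(Height_max="max", Height_min="min", Height_average="mean")
      .rename_axis("Interval")
      .reset_index()
)
print(summary)
```

With these edges, rows with a distance beyond 0.7902 m get no interval and are left out of the summary. Widening the last edge, or using edges that span the full range, would include them.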
jezrael
  • OK, but I want to extract 18 values from 18 ranges of 0.0439 m, that is, from 0 to 0.0439; 0.0439 to 0.0878; ... up to 0.7661 to 0.81 – Master03 Skywalker Jul 09 '19 at 12:18
  • @Master03Skywalker `np.arange` or `np.linspace`? – Dan Jul 09 '19 at 12:21
  • @Master03Skywalker - So you need grouping by both intervals, like in the edited answer? – jezrael Jul 09 '19 at 12:22
  • I want to try this: `bins = [0, 0.0439, 0.0878 ..... 0.7902]` and `df['min'] = pd.cut(df.height.min, bins) # to min values`. Am I on the right path? – Master03 Skywalker Jul 09 '19 at 13:32
  • @Master03Skywalker - What data are used for binning? I ask because `pd.cut` bins some other data by the intervals from a list, here by `[0, 0.0022, 0.0044, .... 0.81 ]` and `[0, 0.1, 0.5, 0.4, 0.9, .... 0.1]`. Maybe [this](https://stackoverflow.com/questions/45273731/binning-column-with-python-pandas/45273750#45273750) helps. – jezrael Jul 09 '19 at 13:34
  • Thanks! I used the distance data to establish the intervals and then calculated the max, min and avg values of height. – Master03 Skywalker Jul 09 '19 at 16:58
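
On the `np.arange` versus `np.linspace` question raised in the comments, here is a small sketch of two ways to build the 19 bin edges. It assumes either a fixed width of 0.0439 m or exactly 18 intervals covering 0 to 0.81 m; the variable names are illustrative.

```python
import numpy as np

width = 0.0439
n_intervals = 18

# Fixed width: multiply an integer range by the width, which avoids the
# endpoint surprises a floating-point step in np.arange can cause.
# The last edge is 18 * 0.0439, so distances past it fall outside every bin.
edges_fixed = width * np.arange(n_intervals + 1)

# Fixed count: 18 equal intervals covering 0 to 0.81 m, so every distance
# in that range is assigned and each interval is 0.045 m wide.
edges_span = np.linspace(0.0, 0.81, n_intervals + 1)

print(edges_fixed[-1], edges_span[-1])
print(np.diff(edges_span)[0])
```

Either array can be passed as the `bins` argument of `pd.cut`. Which one fits depends on whether the interval width or the full coverage of the distance range matters more.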