Setup:
Visualize a large array of numbers where each number represents a height of a bar on a bar graph.
Ex: [5, 4, 3, 7, 2, 3, 1, 12]
█
█
█
█
█
█ █
█ █
█ █ █
██ █ █
████ █ █
██████ █
████████
Analyze:
This is the bar graph of the previous numbers. What I need to find is the area containted in the graph in number of open (or unfilled) units.
Solution (Pt.1):
To go about this I made an algorithm to calculate all the peaks in the array.
This returns : [5, 7, 3, 12]
as well as another list with the indices of each entry, [0,3,5,7]
To us, there are only three important peaks to find the area. The 5
, the 7
, and the 12
.
We can then break it down like this.
The amount of open area in between the 5 and 7 is (general rule):
(([Index Of Larger] - [Index Of Smaller] - [1])*[SmallerValue]) - [Values Of All In B/W]
So the area of the first section would be (2*5) - (4+3)
or 10-7
or 3
. THis makes sense because if you look at the graph you see there is an empty L shaped section that you could fit 3 units of say, water in without it overflowing.
If you repeat this with the second section you get its correct area as well.
My issue is developing an algorithm to go from ALL PEAKS
to the IMPORTANT PEAKS
.
Misleading:
In this case it is extremely easy to see how that could be done. You simply write an algo to find that the 3
is smaller than the 7
and the 12
so get rid of it and return a refined version of the peaks.
However it is not always that simple.
More Difficult Example:
I have an array:
[5, 4, 3, 7, 2, 3, 1, 12, 9, 10, 5, 3, 6, 8, 5, 6, 4, 7, 6, 9, 4, 11, 11, 4, 1, 2, 1]
Running it through a basic custom peak finding algorithm O(N)
It returns:
[5, 7, 3, 12, 10, 8, 6, 7, 9, 11, 11, 4, 2]
In this example, we see the same problem in the first part of this question, however, after the 12
in this peak list, a human can easily see that the next most important peak to look at are the two 11s
, the 4
, and 2
. So I need a way to go from:
[5, 7, 3, 12, 10, 8, 6, 7, 9, 11, 11, 4, 2]
To:
[5, 7, 12, 11, 11, 4, 2]
The above array is a list of the 'important' peaks that are necessary to find the area and again visualize the open blocks as if they'd contain water or something so that they are limited to the lowest immediate peak before overflowing.
To better visualize this more full, second example I have a picture of the graph and all of its peaks and data points here.
Thank you.