
Could anybody help? I have a data set with two columns, date and river flow (Q). How can I obtain the 100 largest river flow values, subject to the condition that there are at least XX days (e.g. 14 days) between each "peak"? In other words, two values that fall within two weeks of each other should count as only one peak.

Date Q
01/01/1990 24
02/01/1990 18
03/01/1990 40

I started by ranking all the values, then picking out each peak and manually checking whether the next peak fell outside the 14-day period, but I was wondering whether this could be done with a formula. Thanks.

  • So if you had say two values which were high values but within 14 days of each other, would you take the highest one and ignore the other one? – Tom Sharpe Jan 22 '21 at 12:10
  • Yes, which means I couldn't just use the largest 100 values – Jamie Towner Jan 22 '21 at 12:27
  • I think you would need an iterative approach, something like: 1. Choose the highest value. 2. Choose the highest value not within 14 days of any value that has already been chosen. 3. Repeat step 2 until you have 100 peaks. But there are lots of ways of finding peaks in (presumably noisy) data, and that might not be the best. It could be done in Excel (which I am familiar with), but R might be easier. – Tom Sharpe Jan 22 '21 at 13:10
  • I think you might find a number of approaches to this, and you may want to consider posting on [Cross Validated](https://stats.stackexchange.com/) to get additional suggestions. In R you could look at packages like `pracma`, which has functions like `findpeaks` where you can indicate a minimum distance between peaks in a time series, or you can adapt other available functions to meet your needs. Generally, you might get more specific assistance here if you provide a code attempt. – Ben Jan 22 '21 at 22:03
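The iterative approach suggested in the comments could be sketched roughly as follows. This is a minimal illustration in Python rather than Excel or R, and it makes several assumptions that are not in the question: dates are read as dd/mm/yyyy, a gap of exactly `min_gap_days` counts as far enough apart, ties in flow go to the earlier date, and the function name `select_peaks` and the three extra data rows are hypothetical. It is only one of several possible ways to pick peaks and may not be the best for noisy data.

```python
from datetime import date, timedelta


def select_peaks(records, n_peaks=100, min_gap_days=14):
    """Greedily pick the largest flows that are at least min_gap_days apart.

    records is a list of (date, flow) tuples. Hypothetical helper for
    illustration only; not a function from any library.
    """
    min_gap = timedelta(days=min_gap_days)
    # Rank by flow, largest first; ties go to the earlier date (assumption).
    ranked = sorted(records, key=lambda r: (-r[1], r[0]))
    chosen = []
    for day, flow in ranked:
        # Keep this value only if it is far enough from every peak kept so far.
        if all(abs(day - kept_day) >= min_gap for kept_day, _ in chosen):
            chosen.append((day, flow))
            if len(chosen) == n_peaks:
                break
    return chosen


records = [
    (date(1990, 1, 1), 24),   # from the question (dd/mm/yyyy assumed)
    (date(1990, 1, 2), 18),   # from the question
    (date(1990, 1, 3), 40),   # from the question
    (date(1990, 1, 10), 35),  # hypothetical extra row
    (date(1990, 1, 20), 30),  # hypothetical extra row
    (date(1990, 2, 5), 22),   # hypothetical extra row
]

peaks = select_peaks(records, n_peaks=3, min_gap_days=14)
for day, flow in peaks:
    print(day.strftime("%d/%m/%Y"), flow)
```

With this sample, the value of 35 on 10/01/1990 is skipped because it lies within 14 days of the larger value on 03/01/1990, which is the behaviour described in the comments. The returned list is ordered by flow, so sort it by date afterwards if a chronological list is needed.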
