1

I would like to generate random baby sleep-time data, but I want the random data to behave similarly (not necessarily identically) to the following graph:

enter image description here

(This is just imaginary data; please don't conclude anything from it, especially not when your baby should sleep...)

The output that I want to produce is something like:

Baby name    Sleep start         Sleep end
Noah         2016/03/21 08:38    2016/03/21 09:28
Liam         2016/03/21 12:43    2016/03/21 15:00
Emma         2016/03/21 19:45    2016/03/22 06:03

So I thought I would create a weights table mapping time of day to a weight (the chance that a baby is asleep at that time).

The question is: how would I generate random sleep time ranges from this weights table?

(Consider that if a baby starts to sleep at around 8am, he/she will most likely wake up within the next two hours rather than sleep longer, and almost certainly won't sleep until 7am the next day.)

Is there another way you would build this (without the weights table)?

I would prefer to build this in Python 3, but I would appreciate a general algorithm or a lead to the solution.

Peter O.
Gluz
  • 4
    I'm not sure I understand the question. Do you actually have probability data on when babies sleep, and you want to generate random sleep periods to match? Or are you making everything up from scratch? If the latter, how can you possibly know what's right? – Blckknght Mar 21 '16 at 21:37
  • I'm making everything up. What's right is what I decide is right when it is unrelated to real data, isn't it? The decision of what is right is the weights table (as I see it); from there it is irrelevant whether the data is true or not, the algorithm should mock the behaviour. – Gluz Mar 21 '16 at 22:03
  • @Gluz If anything you make up will be right, then what is your question? How can any approach you take be wrong? – Kevin Mar 21 '16 at 22:04
  • @KevinWells, the only thing I make up is what the data should look like. The right answer will create the data in such a way that, if I plot it, it will look approximately the same as the graph I've plotted. – Gluz Mar 22 '16 at 08:13

2 Answers

0

Given the weights table data, you could use numpy.random.choice:

import numpy as np

# list_of_weights_for_those_times must sum to 1
np.random.choice(list_of_times,
                 num_babies,
                 p=list_of_weights_for_those_times)

Without using a weights table, you would need to find the function that describes your distribution. Then see the answer to this question.
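As a rough sketch of the weights-table idea, here is a version that uses only the standard library's `random.choices` as a stand-in for `numpy.random.choice`. The `START_WEIGHTS` table, the `mean_duration_minutes` rule, and the 20% spread are all made-up assumptions for illustration; the duration depends on the start hour so that a morning nap stays short while an evening start runs through the night.

```python
import random
from datetime import datetime, timedelta

# Hypothetical weights table: relative chance that a sleep starts in each hour.
START_WEIGHTS = [1, 1, 1, 1, 1, 1, 1, 2, 4, 3, 2, 2,
                 5, 5, 3, 2, 1, 2, 5, 8, 8, 5, 2, 1]

def mean_duration_minutes(hour):
    # Assumption: evening/night starts are long sleeps, daytime starts are naps.
    return 600 if hour >= 19 or hour < 4 else 90

def random_sleep(day, rng=random):
    """Return a (sleep_start, sleep_end) pair of datetimes for the given day."""
    # Pick the start hour proportionally to the weights (no normalisation needed).
    hour = rng.choices(range(24), weights=START_WEIGHTS, k=1)[0]
    start = day + timedelta(hours=hour, minutes=rng.randrange(60))
    mean = mean_duration_minutes(hour)
    # Gaussian spread around the mean, clamped to at least 10 minutes.
    minutes = max(10, round(rng.gauss(mean, mean * 0.2)))
    return start, start + timedelta(minutes=minutes)

start, end = random_sleep(datetime(2016, 3, 21))
print(start.strftime("%Y/%m/%d %H:%M"), end.strftime("%Y/%m/%d %H:%M"))
```

Passing a seeded `random.Random` instance as `rng` makes the output reproducible.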

Community
emmagordon
0

Let me start by answering the reverse of your question, since I misunderstood it at first; working through it gave me the answer to the original question too.

Assume that you already have a list of intervals spread over the 24 hours of a day. You would like to find the number of intervals that overlap any given minute of the day, which is what you refer to as the weight.

I can think of two approaches. But first, you should convert your time intervals into minutes of the day, so the times in your list become:

# Note the 19:45-06:03 has been split into two intervals (1185,1440) and (0,363)
st = sorted(to_parts(sleep_table))
# st == [(0, 363), (518, 568), (763, 900), (1185, 1440)]
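The `to_parts` helper is not defined in this answer. A minimal sketch of what it might look like, assuming `sleep_table` is a list of `(name, start, end)` string tuples in the question's `YYYY/MM/DD HH:MM` format and that no sleep lasts longer than 24 hours:

```python
from datetime import datetime

def to_parts(sleep_table):
    """Yield (start_minute, end_minute) pairs, splitting sleeps at midnight."""
    fmt = "%Y/%m/%d %H:%M"
    for _name, start, end in sleep_table:
        s = datetime.strptime(start, fmt)
        e = datetime.strptime(end, fmt)
        s_min = s.hour * 60 + s.minute
        e_min = e.hour * 60 + e.minute
        if e.date() > s.date():
            # The sleep crosses midnight: emit the evening and morning parts.
            yield (s_min, 1440)
            if e_min > 0:
                yield (0, e_min)
        else:
            yield (s_min, e_min)

sleep_table = [
    ("Noah", "2016/03/21 08:38", "2016/03/21 09:28"),
    ("Liam", "2016/03/21 12:43", "2016/03/21 15:00"),
    ("Emma", "2016/03/21 19:45", "2016/03/22 06:03"),
]
st = sorted(to_parts(sleep_table))
print(st)
```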

First, a simple solution is to convert each interval into a list of 0s and 1s (one entry per minute of the day) and sum these lists element-wise over all the intervals:

from functools import reduce  # reduce is no longer a builtin in Python 3

eod = 60*24  # minutes in a day
weights = reduce(lambda c,x: [l+r for l,r in zip(c, [0]*x[0] + [1]*(x[1]-x[0]) + [0]*(eod-x[1]))] ,st,[0]*eod)

This will give you a list of size 1440, where each entry is the weight for a given minute of the day.

The second approach is a slightly more complex line sweep algorithm, which gives you the same values in O(n log n) time for n segments. All you need to do is take the start and end times of the intervals and sort them, while keeping track of whether each time is a start or an end time:

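from itertools import groupby
from operator import itemgetter

def start_end(st):
    # emit (time, +1) for every start and (time, -1) for every end
    for t in st:
        yield (t[0], 1)
        yield (t[1], -1)

events = sorted(start_end(st))
# perform a line sweep to find changes in the weights
# (tuple-unpacking lambdas are not valid in Python 3, so use a comprehension)
changes = [(i, sum(map(itemgetter(1), l))) for i, l in groupby(events, itemgetter(0))]
# compute the running sum of weights
# See question 35605014 for this part of the answer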
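The running-sum step that the comment refers to could be sketched with `itertools.accumulate`. The function name `weight_steps` is made up for this example; it returns the weight as a step function, that is, a list of `(minute, weight)` pairs where each weight holds from that minute until the next entry.

```python
from itertools import accumulate, groupby
from operator import itemgetter

st = [(0, 363), (518, 568), (763, 900), (1185, 1440)]

def weight_steps(intervals):
    """Return [(minute, weight)]; each weight holds from that minute onward."""
    # +1 at every interval start, -1 at every interval end, sorted by time.
    events = sorted((t, d) for s, e in intervals for t, d in ((s, 1), (e, -1)))
    # Net change in weight at each distinct time point.
    changes = [(t, sum(d for _, d in grp))
               for t, grp in groupby(events, key=itemgetter(0))]
    times = [t for t, _ in changes]
    # The running sum of the changes is the weight after each time point.
    levels = accumulate(delta for _, delta in changes)
    return list(zip(times, levels))

steps = weight_steps(st)
print(steps)
```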

Now, suppose you start from the weights themselves. You can easily convert them into a list of start and end times that are not yet coupled into intervals. All you need to do is convert the smooth spline in the post into a step function. Whenever the step function increases in value, you add a sleep start time, and whenever it decreases, you add a sleep end time. Finally, you perform a line sweep to match each sleep start time to a sleep end time. There is a bit of wiggle room here, as you can match any start time with any end time that comes after it. If you want more data points, you can introduce additional sleep start and end times, as long as each added start coincides with an added end at the same point in time, so that the weights stay unchanged.
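A minimal sketch of this reverse direction, under some assumptions of mine: the step function is given as a sorted list of `(minute, weight)` pairs with integer weights, it starts from weight 0 and ends at weight 0, and intervals do not wrap around midnight. The name `intervals_from_steps` is hypothetical. The "wiggle room" shows up as the random choice of which open start is closed by each end.

```python
import random

def intervals_from_steps(steps, rng=random):
    """Turn [(minute, weight)] steps into (start, end) sleep intervals."""
    starts, ends = [], []
    level = 0
    for minute, weight in steps:
        delta = weight - level
        if delta > 0:
            starts.extend([minute] * delta)    # weight rises: sleeps begin
        elif delta < 0:
            ends.extend([minute] * -delta)     # weight drops: sleeps end
        level = weight

    # Line sweep over all events in time order (0 = end, 1 = start).
    events = sorted([(m, 1) for m in starts] + [(m, 0) for m in ends])
    open_starts, intervals = [], []
    for minute, is_start in events:
        if is_start:
            open_starts.append(minute)
        else:
            # Any currently open start may be matched with this end.
            start = open_starts.pop(rng.randrange(len(open_starts)))
            intervals.append((start, minute))
    return sorted(intervals)

print(intervals_from_steps([(0, 1), (5, 2), (10, 1), (15, 0)]))
```

Whichever pairing is chosen, counting the intervals that cover each minute reproduces the original step function.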

topkara