In an autoregressive continuous problem where zeros make up too large a share of the values, the situation can be treated as a zero-inflated problem (i.e. ZIB). In other words, instead of fitting f(x) directly, we fit g(x)*f(x), where f(x) is the function we want to approximate (i.e. y) and g(x) is a function that outputs a value between 0 and 1 depending on whether a value is zero or non-zero.
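
To make the product concrete, here is a tiny numeric sketch. The numbers are made up, and reading g(x) as "the estimated probability that the target is non-zero" is an assumption about the setup, not something established above.

```python
import numpy as np

# Made-up values for three samples, for illustration only.
g = np.array([0.02, 0.95, 0.50])  # g(x): assumed probability that the target is non-zero
f = np.array([3.0, 4.0, 2.0])     # f(x): estimated value of the target when it is non-zero

# Zero-inflated estimate: samples that g considers "probably zero" are pulled toward 0.
y_hat = g * f
print(y_hat)
```

The first sample is almost entirely suppressed by its small g(x), while the second keeps nearly all of f(x).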

Currently, I have two models: one that gives me g(x) and another that fits g(x)*f(x).

The first model gives me a set of weights. This is where I need your help. With model.fit(), I can pass them through the sample_weight argument. However, because I work with a very large amount of data, I need to use model.fit_generator(), and fit_generator() does not have a sample_weight argument.

Is there a workaround for using sample weights with fit_generator()? Otherwise, how can I fit g(x)*f(x), given that I already have a trained model for g(x)?

user1050421
    Have you tried index slicing? I.e. if you have 23000 prices you can simply slice each 5th one of these using `data[0:23000:5, :, :]`. The returned array will have shape `(4600, 45, 41)` – Attack68 Nov 17 '18 at 20:04

1 Answer

You can provide sample weights as the third element of the tuple returned by the generator. From the Keras documentation on fit_generator:

generator: A generator or an instance of Sequence (keras.utils.Sequence) object in order to avoid duplicate data when using multiprocessing. The output of the generator must be either

  • a tuple (inputs, targets)
  • a tuple (inputs, targets, sample_weights).

Update: Here is a rough sketch of a generator that returns the input samples and targets as well as the sample weights obtained from model g(x):

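def gen(args):
    # Loop forever: Keras draws steps_per_epoch batches from the generator each epoch.
    while True:
        for i in range(num_batches):
            # get the i-th batch data
            inputs = ...
            targets = ...

            # get the sample weights from the trained model g;
            # assuming g outputs one value per sample, flatten (batch_size, 1)
            # to the 1D shape (batch_size,) expected for sample weights
            weights = g.predict(inputs).ravel()

            yield inputs, targets, weights


model.fit_generator(gen(args), steps_per_epoch=num_batches, ...)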
    
    
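For something you can run without a model, here is a minimal NumPy-only sketch of the same pattern. The names `gate` and `weighted_batches` are hypothetical, and `gate` is only a stand-in for `g.predict`; in practice you would call your trained g model there.

```python
import numpy as np


def gate(inputs):
    """Hypothetical stand-in for g.predict: one value in (0, 1) per sample, shape (batch, 1)."""
    return 1.0 / (1.0 + np.exp(-inputs.sum(axis=1, keepdims=True)))


def weighted_batches(x, y, batch_size):
    """Yield (inputs, targets, sample_weights) tuples forever."""
    num_batches = int(np.ceil(len(x) / batch_size))
    while True:
        for i in range(num_batches):
            rows = slice(i * batch_size, (i + 1) * batch_size)
            inputs, targets = x[rows], y[rows]
            # One weight per sample, flattened from (batch, 1) to (batch,).
            weights = gate(inputs).ravel()
            yield inputs, targets, weights


# Toy data: 10 samples with 3 features each.
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 3))
y = rng.normal(size=(10,))

batches = weighted_batches(x, y, batch_size=4)
first = next(batches)
print([part.shape for part in first])
```

The generator never terminates on its own; it wraps around to the first batch after the last one, which is why `steps_per_epoch` has to tell Keras how many batches make up an epoch.
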
today
  • Can you build a little snippet to show me how it works? – user1050421 Nov 29 '18 at 13:41
  • @user1050421 Do you want to know how to define a data generator for a Keras model or how to return sample weights in a generator? – today Nov 29 '18 at 13:42
  • @user1050421 You can find an example of `Sequence` based generator [here](https://keras.io/utils/#sequence). – today Nov 29 '18 at 13:46
  • Are you up to explain both? – user1050421 Nov 29 '18 at 13:47
  • @user1050421 Sure, I added an update. Please take a look. [This tutorial](https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly) might also help (It is using `Sequence` based generators, not Python generators). – today Nov 29 '18 at 13:55
  • Ok, thanks for your help! If you can upvote my question, I will give you this bounty – user1050421 Nov 29 '18 at 14:00
  • @today what about with evaluate_generator? I can't seem to get the same logic you use here to work. – StatsSorceress May 20 '19 at 14:38
  • @StatsSorceress I have not tried that, but according to the [documentation](https://keras.io/models/sequential/#evaluate_generator) it must be the same, i.e. you can return the sample weights as the third element of the tuple from the generator. – today May 20 '19 at 18:38
  • @today Aha! The third - not the fourth? I'm sorry, I read the docs that you linked but I still don't understand because the docs say that it outputs a list of scalars, but doesn't say what each value at each position must represent....I think I'm making this too hard? – StatsSorceress May 20 '19 at 19:03
  • @StatsSorceress That's a different question. They are loss value and the metric values returned for each output (if the model has multiple outputs). [This answer](https://stackoverflow.com/a/51303340/2099607) might help you to understand it better. – today May 20 '19 at 19:23
  • I have a similar problem where I need the sample_weights from the generator in my custom loss function...this post helped me https://stackoverflow.com/questions/57999225/how-to-access-sample-weights-in-a-keras-custom-loss-function-supplied-by-a-gener – Tarun Oct 26 '19 at 23:32