
This is kind of a weird problem and I'm not entirely sure how to ask it appropriately but I'll give it my best shot.

I have a custom class that is basically a wrapper for an API that updates an SQLite database with new data on each call (I can't add it to the question because it's massive and private).

What's weird is that some information seems to be cached. I don't think that's actually possible, but it's the only thing it reminds me of (like when you make edits in web dev and they don't show up). The code works the first time, but when I reinitialize the object and run it again, it doesn't add any new data to the DB, even though I know there is new data to add.

I know the code works because if I restart the kernel and run it again, it updates no problem.

I've tried deleting the object (del InitializedClass), re-initializing, and initializing with different values but nothing seems to work. It won't update the DB unless the kernel is restarted.

Has anyone ever had an issue like this? I'm happy to provide more information if this isn't enough but I don't know how else to describe it.

Thank you!!


EDIT

The pseudocode below is basically exactly what is happening:

from something import SomeClass    

while True:

    obj = SomeClass() #      <---------  How can I "reset" this on each loop?

    obj.get_new_data_from_api()
    obj.update_raw_db()
    obj.process_raw_data()
    obj.update_processed_db()

    # i tried different combinations of deleting the object
    del obj
    del SomeClass
    from something import SomeClass

EDIT 2:

So as everyone mentioned, it was an issue with the class itself, but I still don't really understand why the error was happening. Basically, when I used a datetime.now() call as the default value of the end kwarg, end was not being updated. I thought it would be set to the current time on each call, but it stayed the same even after deleting the class and creating a new instance. The issue is illustrated below:

from datetime import datetime

import pandas as pd


class SomeBrokenClass():

    def __init__(self):    
        pass

    def get_endpoint(self, start, end):
        return 'https://some.api.com?start_date=%s&end_date=%s' % (start, end)

    # THE PROBLEM WAS WITH THIS METHOD ( .get_data() ):
    # When re-initializing the class, the `end` argument
    # was not being updated for some reason. Even if I completely
    # delete the instance of the class, the end time would not update.

    def get_data(self, start, end = int(datetime.now().timestamp() * 1000)):
        return pd.read_json(self.get_endpoint(start, end))

    def get_new_data_from_api(self):
        start_date = self.get_start_date()
        df = self.get_data(start_date)
        return df


class SomeWorkingClass():

    def __init__(self):    
        pass

    def get_endpoint(self, start, end):
        return 'https://some.api.com?start_date=%s&end_date=%s' % (start, end)

    def get_data(self, start, end):
        return pd.read_json(self.get_endpoint(start, end))

    def get_new_data_from_api(self):
        start_date = self.get_start_date()
        end_date = int(datetime.now().timestamp() * 1000) # BUT THIS WORKS FINE
        df = self.get_data(start_date, end_date)
        return df
Zach
  • it's likely to be your code which we can't see, so not sure how you expect help with that? – Mitch Wheat Feb 03 '18 at 01:28
  • But why would it be the code if it works correctly each time the kernel is restarted? – Zach Feb 03 '18 at 01:33
  • "But why would it be the code if it works correctly each time the kernel is restarted?" - that I don't know the answer to, but it's likely a bug in your code. I suggest you reduce code to a working example that exhibits the problem. – Mitch Wheat Feb 03 '18 at 01:38
  • @MitchWheat okay I tried updating with what's basically going on – Zach Feb 03 '18 at 01:38
  • @MitchWheat I guess what I'm primarily asking is, shouldn't deleting the object and then reinitializing it do the same thing as restarting the kernel? is it possible for information to be cached in a notebook? – Zach Feb 03 '18 at 01:45
  • The pseudocode you've shown appears to mix up a module that you're importing with a class inside the module (that you call to create an object). If you're not actually creating an object, just accessing the module, then the issue you describe is not unexpected, since modules are cached. If you *are* creating an instance of a class, then the issue is with the implementation of the class, which you haven't shown. – Blckknght Feb 03 '18 at 02:45
  • @Blckknght ya sorry i screwed that up in the example. I am actually creating an instance of a class. But my question still stands -- how can it be an issue with the implementation of the class if it works as it is supposed to each time i restart the notebook? How can I fully delete the instance and "restart" it on each loop? – Zach Feb 03 '18 at 04:37

2 Answers


Your issue has to do with the default value for a parameter in one of your methods:

def get_data(self, start, end = int(datetime.now().timestamp() * 1000)):
    ...

The default value is not recalculated each time the function is called. Rather, the expression given as the default is evaluated only once, when the method is defined, and the value is stored to be used as the default for all later calls. That doesn't work right here, since it evaluates datetime.now only at the time the module was loaded, not each time the function is called.
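To see the "evaluated only once" behavior in isolation, here is a minimal sketch. It swaps datetime.now() for a made-up counter function (next_stamp is hypothetical, not part of the asker's code) so the result does not depend on the clock:

```python
import itertools

_counter = itertools.count(1)


def next_stamp():
    """Hypothetical stand-in for datetime.now(): returns 1, 2, 3, ... per call."""
    return next(_counter)


# next_stamp() is called once, while this def statement is being executed.
def get_data(start, end=next_stamp()):
    return (start, end)


first = get_data("a")
second = get_data("b")

print(first, second)          # both calls share the same stored default
print(get_data.__defaults__)  # the stored default lives on the function object
```

The stored value sits in get_data.__defaults__, which is why creating a new instance (or deleting the old one) makes no difference: the default belongs to the function, not to the instance.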

A common way to fix this is to set a sentinel value like None as the default, and then calculate the appropriate value inside the function if the sentinel is found:

def get_data(self, start, end=None):
    if end is None:
        end = int(datetime.now().timestamp() * 1000)
    ...
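A runnable version of the same idea, as a rough sketch: get_window and current_millis are hypothetical names, and the function returns the (start, end) pair instead of calling the API so it can be run anywhere.

```python
import time
from datetime import datetime


def current_millis():
    """Current time as integer milliseconds since the epoch."""
    return int(datetime.now().timestamp() * 1000)


def get_window(start, end=None):
    # None is only a marker meaning "caller gave no end time";
    # the real value is computed here, on every call.
    if end is None:
        end = current_millis()
    return start, end


_, end_a = get_window(0)
time.sleep(0.05)  # let the clock advance a little
_, end_b = get_window(0)

print(end_a, end_b)

# An explicit end value is passed through untouched.
explicit = get_window(0, 5)
```

With this pattern the second call picks up a later timestamp, with no kernel restart or reload needed.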
Blckknght
  • I was just about to write this answer myself. Kudos. – cco Feb 03 '18 at 06:02
  • Thank you! To me this seems like a rather odd quirk. You would think that on each new function call, all code is reevaluated. *Especially* when a new instance of the class is created. But I suppose it makes sense if it is set on module load. Very interesting stuff! I've literally spent about 18 hours trying to figure out what was wrong haha thanks again for the help :) – Zach Feb 03 '18 at 06:09
  • Programmers who are new to Python often first encounter this kind of issue when they try to use a mutable default value (like an empty list) for an argument that they later modify in place (e.g. by calling the `append` method of the list). A function declared like `def foo(x=[])` always uses the same list as the default argument, so calls after the first will see the values added to the list on all the previous calls. The situation you ran into is not as common, but it comes from the same root cause. – Blckknght Feb 03 '18 at 06:36
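The mutable-default case from the comment above can be sketched like this (append_shared and append_fresh are illustrative names, not from the question):

```python
def append_shared(item, bucket=[]):
    # The same list object is reused as the default on every call.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A new list is created on each call that omits bucket.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


# Copy the results so later calls cannot change what was captured.
shared_1 = list(append_shared("a"))
shared_2 = list(append_shared("b"))

fresh_1 = append_fresh("a")
fresh_2 = append_fresh("b")

print(shared_1, shared_2)
print(fresh_1, fresh_2)
```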

You're not "deleting the object and then reinitializing it" - you're removing the module from the global namespace and then adding it back. This does not re-execute the module's code:

# test.py
print("Hi!")

>>> import test
Hi!
>>> del test
>>> import test
<Nothing printed>

If you want to reload your module, you need to do so explicitly with importlib.reload:

>>> import importlib
>>> importlib.reload(test)
Hi!
<module 'test' from '/Users/rat/test.py'>
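The same behavior can be checked in a script rather than at the prompt. This is only a sketch: it writes a throwaway module (the name demo_cached_mod is made up) into a temporary directory, and uses a fresh object() as a marker that changes only when the module body actually runs again.

```python
import importlib
import os
import sys
import tempfile

module_name = "demo_cached_mod"  # hypothetical throwaway module

with tempfile.TemporaryDirectory() as tmp_dir:
    with open(os.path.join(tmp_dir, module_name + ".py"), "w") as fh:
        fh.write("TOKEN = object()  # re-created whenever the module body runs\n")

    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        mod = importlib.import_module(module_name)
        first_token = mod.TOKEN

        # Dropping the name and importing again returns the cached module
        # from sys.modules; the module body does not run a second time.
        del mod
        mod = importlib.import_module(module_name)
        same_after_reimport = mod.TOKEN is first_token

        # reload() re-executes the module body, so TOKEN is a new object.
        mod = importlib.reload(mod)
        same_after_reload = mod.TOKEN is first_token
    finally:
        sys.path.remove(tmp_dir)
        sys.modules.pop(module_name, None)

print(same_after_reimport, same_after_reload)
```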

(edited to add the following) However, what you're trying to do here should never be necessary. You should never need to delete a class before creating a new instance of it. If reloading your module like this helps, the only reasons I can think of are these:

  1. The behavior that's not happening the second time you create a SomeClass instance is actually caused by code at the top level of the something module - that is, outside of any function or class definition, or
  2. SomeClass is recording something in its own class attributes and opting not to do something the second time it's instantiated.

In either of these cases the approach I'd take would be to find the code that's only executing once and extract it into a function so you can call it directly if you need to. Sorry for the vagueness, but without your code it's hard to be more precise. My bet would be on the first, and there are probably other scenarios, but this at least might give you a start.
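As an illustration of the second scenario only (the Fetcher class and its flag are invented for this sketch, since the real code isn't shown), state kept on the class survives across instances:

```python
class Fetcher:
    # Class attribute: one value shared by every instance of Fetcher.
    _already_fetched = False

    def fetch(self):
        if Fetcher._already_fetched:
            return "skipped"
        Fetcher._already_fetched = True
        return "fetched"

    @classmethod
    def reset(cls):
        # The once-only state pulled out into something callable on demand.
        cls._already_fetched = False


first = Fetcher().fetch()
second = Fetcher().fetch()  # brand-new instance, same class-level flag

Fetcher.reset()
third = Fetcher().fetch()

print(first, second, third)
```

Deleting an instance never touches the flag, because it lives on the class. An explicit reset like the one sketched here avoids having to reload the whole module.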

Nathan Vērzemnieks
  • Thanks! This works, but it seems a little sloppy to reload the module on each iteration (I was just trying that as a last ditch effort because I wasn't sure what else to do). I still don't understand why running `obj = SomeClass()` doesn't reinitialize / restart / recreate the class / object. I always thought that each time you do that, it overwrites that variable with a new instance of the class (*especially* when you use `del` beforehand). Is this not correct? Thanks again for your help btw – Zach Feb 03 '18 at 04:53
  • There's never a need to delete a variable before reassigning it, or to delete a class before creating a new instance of it. If your code works when you reload the module but not otherwise, there has to be something funny in your `something` code. If you supply that, we can pin it down further. – Nathan Vērzemnieks Feb 03 '18 at 05:19
  • I just saw your "massive and private" comment in your edit, so I guess that option is out. I've edited my answer to provide some more ideas for how to understand what's going on. And to be clear - it's more than "a little sloppy" to work this way, it's totally weird. I've been using python for almost twenty years and I've never had to do anything like it. I'm certain it's not necessary, if you can figure out what's going weird in your code. – Nathan Vērzemnieks Feb 03 '18 at 05:48
  • Thanks for your help, I really appreciate it! I actually figured it out and added it to my second edit. It was a problem with the class. For some reason the `datetime.now()` timestamp was not being updated on each iteration (even if I completely delete the instance!). Is that a quirk? Where function calls are not updated when they are included as a default for a kwarg? – Zach Feb 03 '18 at 05:55
  • Depends on what you mean by "quirk". A function argument can only have one default! The default is set when the function is created. The other answer has more details. – Nathan Vērzemnieks Feb 03 '18 at 06:06
  • I had a similar problem. Instead of using `from module import class`, I just used `import module` and then called the class using `module.class`. This seemed to correct any re-initialization issues for me. – FreyGeospatial Sep 14 '22 at 17:43
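One situation where the two import styles really do differ is after a reload: a name bound with `from module import SomeClass` keeps pointing at the old class object, while `module.SomeClass` is looked up again on each use. Whether that was the cause in the comment above isn't known. The sketch below uses a made-up throwaway module (demo_reload_mod) to show the difference:

```python
import importlib
import os
import sys
import tempfile

module_name = "demo_reload_mod"  # hypothetical throwaway module

with tempfile.TemporaryDirectory() as tmp_dir:
    with open(os.path.join(tmp_dir, module_name + ".py"), "w") as fh:
        fh.write("class SomeClass:\n    pass\n")

    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        mod = importlib.import_module(module_name)

        # Equivalent to what `from demo_reload_mod import SomeClass` binds.
        SomeClass = mod.SomeClass
        same_before_reload = SomeClass is mod.SomeClass

        # Reloading re-runs the class statement and creates a new class object.
        importlib.reload(mod)
        same_after_reload = SomeClass is mod.SomeClass
    finally:
        sys.path.remove(tmp_dir)
        sys.modules.pop(module_name, None)

print(same_before_reload, same_after_reload)
```

Without a reload, both styles refer to the same class, so switching import style alone would not change how default arguments behave.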