5

Here is the code:

EDIT**** Please no more "it's not possible with unordered dictionary replies". I pretty much already know that. I made this post on the off-chance that it MIGHT be possible or someone has a workable idea.

#position equals some set of two dimensional coords
for name in self.regions["regions"]:  # I want to start the iteration with 'last_region'
    # I don't want to run these next two lines over every dictionary key each time since the likelihood is that the new
    # position is still within the last region that was matched.
    rect = (self.regions["regions"][name]["pos1"], self.regions["regions"][name]["pos2"])
    if all(self.point_inside(rect, position)):
        # record the name of this region in variable- 'last_region' so I can start with it on the next search...
        # other code I want to run when I get a match
        return
return # if code gets here, the points were not inside any of the named regions

Hopefully the comments in the code explain my situation well enough. Lets say I was last inside region "delta" (i.e., the key name is delta, the value will be sets of coordinates defining it's boundaries) and I have 500 more regions. The first time I find myself in region delta, the code may not have discovered this until, let's say (hypothetically), the 389th iteration... so it made 388 all(self.point_inside(rect, position)) calculations before it found that out. Since I will probably still be in delta the next time it runs (but I must verify that each time the code runs), it would be helpful if the key "delta" was the first one that got checked by the for loop.

This particular code can be running many times a second for many different users.. so speed is critical. The design is such that very often, the user will not be in a region and all 500 records may need to be cycled through and will exit the loop with no matches, but I would like to speed the overall program up by speeding it up for those that are presently in one of the regions.

I don't want an additional overhead of sorting the dictionary in any particular order, etc.. I just want it to start looking with the last one that it successfully matched all(self.point_inside(rect, position))

Maybe this will help a bit more.. The following is the dictionary I am using (only the first record shown) so you can see the structure I coded to above... and yes, despite the name "rect" in the code, it actually checks for the point in a cubical region.

{"regions": {"shop": {"flgs": {"breakprot": true, "placeprot": true}, "dim": 0, "placeplayers": {"4f953255-6775-4dc6-a612-fb4230588eff": "SurestTexas00"}, "breakplayers": {"4f953255-6775-4dc6-a612-fb4230588eff": "SurestTexas00"}, "protected": true, "banplayers": {}, "pos1": [5120025, 60, 5120208], "pos2": [5120062, 73, 5120257], "ownerUuid": "4f953255-6775-4dc6-a612-fb4230588eff", "accessplayers": {"4f953255-6775-4dc6-a612-fb4230588eff": "SurestTexas00"}}, more, more, more...}

  • 4
    Dictionaries are arbitrarily-ordered. If you want some kind of caching or optimization behavior, I think you'll need to look beyond the built-in types and functions. – TigerhawkT3 Jun 29 '15 at 00:15
  • I know they are arbitrarily ordered... that is the reason for my question. I know I could just directly access the likely possibility first by just self.regions["regions"]["""whatever the last region was"""] and then just checking the entire dictionary if that did not match, but It would be simpler (since they ARE arbitrary) if it could just start with the likely one first. I suspect any optimization or caching will be less useful because multiple users are accessing this same data simultaneously... besides making this all more complicated than it should be. –  Jun 29 '15 at 00:27
  • Can your *thing* have an attribute indicating the last region it was in - ```self.where_i_was_last``` or ```self.region_i_was_in_last_time_i_looked```?? – wwii Jun 29 '15 at 00:32
  • the line `# record the name of this region in variable- 'last_region' so I can start with it on the next search...` is where (for that user) i would be recording the region they were in... Since everyone is saying its just not possible to start with a particular key, I dont know how to do anything like that short of: ``` Search(lastkey): Found it: Do_foo_code Dangit: For Loop All: Search(All): Do_foo_code ``` –  Jun 29 '15 at 00:46
  • dangit for not being able to format that as code... :( –  Jun 29 '15 at 00:46

6 Answers6

2

You may try to implement some caching mechanism within a custom subclass of dict.

You could set a self._cache = None in __init__, add a method like set_cache(self, key) to set the cache and finally overriding __iter__ to yield self._cache before calling the default __iter__.

However, that can be kinda cumbersome, if you consider this stackoverflow answer and also this one.

For what it's written in your question, I would try, instead, to implement this caching logic in your code.

def _match_region(self, name, position):
    rect = (self.regions["regions"][name]["pos1"], self.regions["regions"][name]["pos2"])
    return all(self.point_inside(rect, position))


if self.last_region and self._match_region(self.last_region, position):
    self.code_to_run_when_match(position)
    return

for name in self.regions["regions"]:
    if self._match_region(name, position):
        self.last_region = name
        self.code_to_run_when_match(position)
        return
return # if code gets here, the points were not inside any of the named regions
Community
  • 1
  • 1
poros
  • 386
  • 3
  • 12
  • what is the purpose of the first condition `if self.last_region` in the `if self.last_region and self._match_region(self.last_region, position):` statement? My code will have defined it as None unless it found a previous match.. Is this just making sure it has a value? –  Jun 29 '15 at 03:06
  • Yep, exactly. You want to initialize/set it to `None` when you start and when you do not have a match (if this ever happens). – poros Jun 29 '15 at 09:50
1

That is right, dictionary is an unordered type. Therefore OrderedDict won't help you much for what you want to do.

You could store the last region into your class. Then, on the next call, check if last region is still good before check the entire dictionary ?

Noxy
  • 11
  • 1
  • That's what i am trying to do. I will _already_ know the last match... its a matter of figuring out how to get it to check that first. –  Jun 29 '15 at 00:57
1

Instead of a for-loop, you could use iterators directly. Here's an example function that does something similar to what you want, using iterators:

def iterate(what, iterator):
    iterator = iterator or what.iteritems()
    try:
        while True:
            k,v = iterator.next()
            print "Trying k = ", k
            if v > 100:
                return iterator
    except StopIteration:
        return None

Instead of storing the name of the region in last_region, you would store the result of this function, which is like a "pointer" to where you left off. Then, you can use the function like this (shown as if run in the Python interactive interpreter, including the output):

>>> x = {'a':12, 'b': 42, 'c':182, 'd': 9, 'e':12}
>>> last_region = None
>>> last_region = iterate(x, last_region)
Trying k = a
Trying k = c
>>> last_region = iterate(x, last_region)
Trying k = b
Trying k = e
Trying k = d

Thus, you can easily resume from where you left off, but there's one additional caveat to be aware of:

>>> last_region = iterate(x, last_region)
Trying k =  a
Trying k =  c
>>> x['z'] = 45
>>> last_region = iterate(x, last_region)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in iterate
RuntimeError: dictionary changed size during iteration

As you can see, it'll raise an error if you ever add a new key. So, if you use this method, you'll need to be sure to set last_region = None any time you add a new region to the dictionary.

celticminstrel
  • 1,637
  • 13
  • 21
  • Thank you for a productive suggestion! :) This looks workable.. just one additional question.. which things above are keywords and which are just variables? I like the concept, I just need to fully understand it first, versus just copy/past code.. So each user instance could be recording a different leave-off point, and pick that back up? ... and once a region is added or deleted, run code to clear each users pointer to none? –  Jun 29 '15 at 01:07
  • Yes, you can have multiple iterators pointing into the dictionary and pick up from whichever one you want. Also, depending on your exact requirements, you may want to save the value of the iterator _before_ calling `next()`, rather than after (which should be a simple change to make). – celticminstrel Jun 29 '15 at 01:14
  • I assume iter is an arbitrary variable name? that can be, say "lastrecord" instead? pycharm is telling me that "iter" shadows a builtin name "iter".. –  Jun 29 '15 at 01:28
  • In my example, `iter` is a variable name. Sorry for the confusion. I've edited to change it to `iterator`. – celticminstrel Jun 29 '15 at 01:30
  • not sure how easy this will be because my data is nested. Each region actually contains a dictionary itself, wherein is the coordinate data, represented by items "pos1" and "pos2". Typical record: "shop": {"flgs": {"breakprot": true, "placeprot": true}, "dim": 0, "placeplayers": {"4f953255-6775-4dc6-a612-fb4230588eff": "SurestTexas00"}, "breakplayers": {"4f953255-6775-4dc6-a612-fb4230588eff": "SurestTexas00"}, "protected": true, "banplayers": {}, "pos1": [5120025, 60, 5120208], "pos2": [5120062, 73, 5120257], "ownerUuid": "4f953255-6775-4dc6-a612-fb4230588eff"} –  Jun 29 '15 at 01:43
  • If you use something like `k,v = iterator.next()`, then `v` will hold that entire dictionary, so you can access `v['pos1']` and `v['pos2']`. Does that help? – celticminstrel Jun 29 '15 at 01:45
  • so in my sample record, k would be type(str) = "shop" and v would be assigned type(dict) = {"flgs": {"breakprot": true, "placeprot": true}, "dim": 0, "placeplayers": {"4f953255-6775-4dc6-a612-fb4230588eff": "SurestTexas00"}, "breakplayers": {"4f953255-6775-4dc6-a612-fb4230588eff": "SurestTexas00"}, "protected": true, "banplayers": {}, "pos1": [5120025, 60, 5120208], "pos2": [5120062, 73, 5120257], "ownerUuid": "4f953255-6775-4dc6-a612-fb4230588eff", "accessplayers": {"4f953255-6775-4dc6-a612-fb4230588eff": "SurestTexas00"}} ? –  Jun 29 '15 at 02:08
  • yes, I think so. Just want to make sure i'm clear on the datatypes –  Jun 29 '15 at 02:10
  • In your example, the first check of `last_region = iterate(x, last_region)` gets to `Trying k = c` and exits because c>100 (in my scenario, the region "matched"?).. but the next iteration it is trying b,d,e... shouldn't it keep searching c and exiting because it finds c>100? –  Jun 29 '15 at 02:26
  • See my second comment up at the top there. It shouldn't be a major change to repeat the last region before continuing through the iterator. However, my original idea for how to do that (make a copy of the iterator before calling `next()` and then returning that copy instead) won't work; I just tested that. So I guess you'd need to store both the name of the last region (to recheck it) and the iterator which will let you resume from where you left off if the most recent region no longer matches. – celticminstrel Jun 29 '15 at 02:52
  • This is not working.. the iterator points to the "next" record _after_ the match.. I want it to keep starting at the match. –  Jun 29 '15 at 02:56
  • Yes, that's correct; that's how an iterator works. If you want to repeat the last match, I think you'll probably have to track that separately. The iterator then allows you to continue where you originally left off if the last region no longer matches. – celticminstrel Jun 29 '15 at 02:59
  • Well, I thought I was clear, I want to pick up (start with) the last matching record, not the record after it. –  Jun 29 '15 at 04:17
  • Yeah, you were clear; I realized the limitation (that it only gives you the _next_ element) shortly after posting. I think it's still possible to tweak this to get what you want, but perhaps you'll find another method easier to understand. – celticminstrel Jun 29 '15 at 05:11
  • actually, I think I am going to abandon this whole approach. I am going to break up the whole map into predefined regions that can be calculated from the position and use that to index an array or other easily indexed structure that will contain the information I am looking up. –  Jun 30 '15 at 18:44
0

TigerhawkT3 is right. Dicts are unordered in a sense that there is no guaranteed order or keys in the given dictionary. You can even have different order of keys if you iterate over same dictionary. If you want order you need to use either OrderedDict or just plain list. You can convert your dict to list and sort it the way it represents the order you need.

Andrey
  • 59,039
  • 12
  • 119
  • 163
  • Can we just stop with the fact that they are not ordered? I know that already. I just want to to start with predefined key... I am even open to manually searching the first key, then doing the for loop over the rest if there is no match.. as long as it can be done cleanly. Also, I stated in the original question that I don't want the overhead of a sorted dictionary... besides, caching and sorting will be of little value since the same dictionary is being used by all users. –  Jun 29 '15 at 00:35
  • @SurestTexas I am not sure you fully grasp what does "not ordered" means. You get get a position of predefined key sure, but there is no "rest". Rest assumes that there is some linear order, there is a point and everything that comes after and before. Well that is not the case, you can iterate over dict with `iteritems()`, which sort of linearizes but the sequence of keys is not guaranteed. Have you actually measured overhead of OrderedDict? You can copy and sort dictionary, it will not affect other users. – Andrey Jun 29 '15 at 09:35
  • I know what it means Andrey; stop being condescending. I only need the "predefined position" for my purposes. if it does not match, then I will search all records with a for loop.. by rest, I mean I am going to search the whole thing and the order does not matter. –  Jun 30 '15 at 18:40
0

Without knowing what your objects are and whether self in the example is a user instance or an environment instance it is hard to come up with a solution. But if self in the example is the environment, its Class could have a class attribute that is a dictionary of all current users and their last known position, if the user instance is hashable.

Something like this

class Thing(object):
    __user_regions = {}
    def where_ami(self, user):
        try:
            region = self.__user_regions[user]
            print 'AHA!! I know where you are!!'
        except KeyError:
            # find region
            print 'Hmmmm. let me think about that'
            region = 'foo'
            self.__user_regions[user] = region

class User(object):
    def __init__(self, position):
        self.pos = position

thing = Thing()
thing2 = Thing()
u = User((1,2))
v = User((3,4))

Now you can try to retrieve the user's region from the class attribute. If there is more than one Thing they would share that class attribute.

>>> 
>>> thing._Thing__user_regions
{}
>>> thing2._Thing__user_regions
{}
>>> 
>>> thing.where_ami(u)
Hmmmm. let me think about that
>>> 
>>> thing._Thing__user_regions
{<__main__.User object at 0x0433E2B0>: 'foo'}
>>> thing2._Thing__user_regions
{<__main__.User object at 0x0433E2B0>: 'foo'}
>>> 
>>> thing2.where_ami(v)
Hmmmm. let me think about that
>>> 
>>> thing._Thing__user_regions
{<__main__.User object at 0x0433EA90>: 'foo', <__main__.User object at 0x0433E2B0>: 'foo'}
>>> thing2._Thing__user_regions
{<__main__.User object at 0x0433EA90>: 'foo', <__main__.User object at 0x0433E2B0>: 'foo'}
>>> 
>>> thing.where_ami(u)
AHA!! I know where you are!!
>>> 
wwii
  • 23,232
  • 7
  • 37
  • 77
0

You say that you "don't want an additional overhead of sorting the dictionary in any particular order". What overhead? Presumably OrderedDict uses some additional data structure internally to keep track of the order of keys. But unless you know that this is costing you too much memory, then OrderedDict is your solution. That means profiling your code and making sure that an OrderedDict is the source of your bottleneck.

If you want the cleanest code, just use an OrderedDict. It has a move_to_back method which takes a key and puts it either in the front of the dictionary, or at the end. For example:

from collections import OrderedDict

animals = OrderedDict([('cat', 1), ('dog', 2), ('turtle', 3), ('lizard', 4)])

def check_if_turtle(animals):
    for animal in animals:
        print('Checking %s...' % animal)
        if animal == 'turtle':
            animals.move_to_end('turtle', last=False)
            return True
    else:
        return False

Our check_if_turtle function looks through an OrderedDict for a turtle key. If it doesn't find it, it returns False. If it does find it, it returns True, but not after moving the turtle key to the beginning of the OrderedDict.

Let's try it. On the first run:

>>> check_if_turtle(animals)
Checking cat...
Checking dog...
Checking turtle...
True

we see that it checked all of the keys up to turtle. Now, if we run it again:

>>> check_if_turtle(animals)
Checking turtle...
True

we see that it checked the turtle key first.

jme
  • 19,895
  • 6
  • 41
  • 39
  • That seems fine, but I have two follow-up questions: –  Jun 29 '15 at 01:51
  • 1) what happens when 20 different users are putting their region first on the list? - finding it on a 20th iteration is still better than 100+, I suppose 2) can regular dictionary methods in other part of the code still access it, or do I have to rewrite the entire program in places where a regular dictionary method is used.... I'm asking because that is a weird format; `([('cat', 1), ('dog', 2), ('turtle', 3), ('lizard', 4)])` versus `{'cat': 1, 'dog': 2, 'turtle': 3, 'lizard': 4}` –  Jun 29 '15 at 01:57
  • @SurestTexas First, to answer (2), yes, an `OrderedDict` can be substituted anywhere a `dict` can be used without rewriting code. It is a subclass of `dict`, and thus has all of the `dict` methods. Typically one would initialize an `OrderedDict` in the way I did, but in fact it doesn't matter in this case: you can initialize it with a `dict` if you like. Just know that the order of the keys in the resulting `OrderedDict` will not necessarily be the same as the way you write them in the `dict` used in the initializer. – jme Jun 29 '15 at 02:51
  • As to (1), I'm not quite sure what you mean by "different users". I suppose you're saying that between different calls to the method, different users may be looking for different regions? What is the method's signature? Is there a reason you don't just check the suspected region first, then iterate through the others? – jme Jun 29 '15 at 02:53
  • I mean there is only 1 dictionary and each of 20 users may be interacting with it every second (or more) as they occupy different positions in the grid.. the suspected region is different for each user/player –  Jun 29 '15 at 04:15