0

I need a collection of elements the only purpose of which is to store elements for futher enumeration, so the elements don't have to be ordered/indexes in any way. But a simple list is not enough, because I may need to remove any element from the collection, and I need to do it in constant time. deque or other implementation of linked list as data structures would be sufficient for this purpose, because deletion of an element in linked list is a constant time operation. The problem is that python does not provide a way to track a specific node of a linked list (at least I don't know how), so if I need to remove previously added element from a list, first I need to find it, which is linear time operation.

In C++, there is a std::list with its std::list::iterator that does exactly what I need. The iterator holds the pointer to the node in the linked list. When I add an element to a linked list, I can store the iterator of the just added element and further use it to remove the element from the list in constant time, since I already have a pointer to the linked list node with the pointer to previous and next elements. It also doesn't matter whether and how many elements were added or deleted before or after the element.

So how can it be done in python? Is there some alternative to the C++'s std::list::iterator?

I also want to notice that I can implement a linked list in python myself, but I'm afraid it would be slower than the implementation written in C. And I also don't want to reinvent the wheel.

g00dds
  • 195
  • 8
  • What do you mean with "futher enumeration"? Are the elements hashable? Are there duplicates? – Kelly Bundy Apr 28 '23 at 21:14
  • In C++, where do you "store the iterator"? – Kelly Bundy Apr 28 '23 at 21:16
  • If there are no duplicate values you could use a `dictionary`. This is like a hashtable and provides constant time deletion and retrievals. Obviously, however, it only will work if there're no duplicates. – nulzo Apr 28 '23 at 21:16
  • @KellyBundy, the elements don't have to be hashable, there are duplicates. By "futher enumeration" I mean that I will use this container only to add elements (like to a set, doesn't matter where), delete a specific element and enumerate over the all elements – g00dds Apr 28 '23 at 21:16
  • @KellyBundy, I can store the iterator in the element itself (the element in my case is an object), or to store it with it as a pair – g00dds Apr 28 '23 at 21:17
  • Hmm. Non-hashable elements makes this hard, I can't think offhand of any Python data structure in the core library that will work as well as a doubly linked list with external pointers to the nodes. If you could restrict to hashable elements, I'd say use a `collections.Counter` as a multiset. Some casual googling suggests https://pypi.org/project/llist/ as one third-party package with a C back end, but I can't vouch for it from personal experience. – Peter DeGlopper Apr 28 '23 at 21:19
  • With "enumerate", do you mean iterate? – Kelly Bundy Apr 28 '23 at 21:19
  • @KellyBundy, yes, I mean iterate. Sorry for the confusion – g00dds Apr 28 '23 at 21:21
  • @PeterDeGlopper, thank you for the llist library recommendation, that might actually be what I need as I can see it in their examples – g00dds Apr 28 '23 at 21:22
  • 1
    @g00dds - credit to https://stackoverflow.com/a/19752474/2337736 if it works out - note the 2013 date on the answer though! – Peter DeGlopper Apr 28 '23 at 21:23
  • Do you want to add/delete *during* an iteration? Or only *between* iterations? – Kelly Bundy Apr 28 '23 at 21:40
  • @KellyBundy, I need to add/delete only between iterations – g00dds Apr 28 '23 at 21:46
  • 1
    It would be helpful if the question had testing code that initialises the wanted structure and performs operations on it (with wanted method calls) and prints out results, so that if we propose something we can check whether the proposed solution satisfies *all* your needs. – trincot Apr 29 '23 at 07:15

2 Answers2

2

Two ideas, showcasing them as in your answer.

Solution 1: collections.OrderedDict

Your values are the dict values and increasing numbers are the dict keys.

from collections import OrderedDict

class Object:
    def __init__(self, i):
        self.i = i
        self.node = None

objs = OrderedDict()
objs_last = 0

# Initialize the collection
for i in range(10):
    obj = Object(i)
    
    # For example, store fifth element to further remove it
    if i == 5:
        obj5 = obj

    objs_last += 1
    objs[objs_last] = obj

    # Store the node in the collection containing the object. Used to further remove the object
    obj.node = objs_last

print([obj.i for obj in objs.values()])
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Delete the fifth element from the collection using stored reference to containing node
del objs[obj5.node]

print([obj.i for obj in objs.values()])
# [0, 1, 2, 3, 4, 6, 7, 8, 9]

Solution 2: list

Store the element's list index. To delete an element, instead replace it with the last element in the list (and update that element's stored index).

class Object:
    def __init__(self, i):
        self.i = i
        self.node = None

objs = []

# Initialize the collection
for i in range(10):
    obj = Object(i)
    
    # For example, store fifth element to further remove it
    if i == 5:
        obj5 = obj

    objs.append(obj)

    # Store the node in the collection containing the object. Used to further remove the object
    obj.node = len(objs) - 1

print([obj.i for obj in objs])
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Delete the fifth element from the collection using stored reference to containing node
objs[obj5.node] = objs[-1]
objs.pop().node = obj5.node

print([obj.i for obj in objs])
# [0, 1, 2, 3, 4, 9, 6, 7, 8]

You could do obj.node = len(objs) before the objs.append(obj), then you don't need the - 1. I just tried to stay as close as possible to your answer.

Kelly Bundy
  • 23,480
  • 7
  • 29
  • 65
  • I really liked the second solution, very simple but powerful – g00dds Apr 28 '23 at 22:41
  • 1
    @g00dds Pretty standard technique, but I needed your clarifications and your answer to see whether/how we can apply it here. So thanks for being so responsive :-). Btw note the added paragraph at the end, about the optimization. – Kelly Bundy Apr 28 '23 at 22:49
  • I love the second approach – JonSG Apr 28 '23 at 23:26
1

As Peter DeGlopper recommended, I will use llist package (github). All credits to this answer.

Here is an example how I use it:

from llist import dllist

class Object:
    def __init__(self, i):
        self.i = i
        self.node = None

objs = dllist()

objs_by_id = {}

# Initialize the list
for i in range(10):
    obj = Object(i)

    # In practice, there isn't any object identifier. dict objs_by_id only for this example.
    objs_by_id[i] = obj

    objs.append(obj)
    obj.node = objs.last

print([obj.i for obj in objs])
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Delete 5, 7, 3 elements from the list
objs.remove(objs_by_id[5].node)
objs.remove(objs_by_id[7].node)
objs.remove(objs_by_id[3].node)

print([obj.i for obj in objs])
# [0, 1, 2, 4, 6, 8, 9]
g00dds
  • 195
  • 8