0

I am comparing two lists of integers and am trying to access the lowest value without using a for-loop as the lists are quite large. I have tried using set comparison, yet I receive an empty set when doing so. Currently my approach is:

differenceOfIpLists = list(set(reservedArray).difference(set(ipChoicesArray)))

I have also tried:

differenceOfIpLists = list(set(reservedArray) - set(ipChoicesArray))

And the lists are defined as such:

reservedArray = [169017344, 169017345, 169017346, 169017347, 169017348, 169017349, 169017350, 169017351, 169017352, 169017353, 169017354, 169017355, 169017356, 169017357, 169017358, 169017359, 169017360, 169017361, 169017362, 169017363, 169017364, 169017365, 169017366, 169017367, 169017368, 169017369, 169017600, 169017601, 169017602, 169017603, 169017604, 169017605, 169017606, 169017607, 169017608, 169017609, 169017610, 169017611, 169017612, 169017613, 169017614, 169017615, 169017616, 169017617, 169017618, 169017619...]

ipChoicesArray = [169017344, 169017345, 169017346, 169017347, 169017348, 169017349, 169017350, 169017351, 169017352, 169017353, 169017354, 169017355, 169017356, 169017357, 169017358, 169017359, 169017360, 169017361, 169017362, 169017363, 169017364, 169017365, 169017366, 169017367, 169017368, 169017369, 169017370, 169017371, 169017372, 169017373, 169017374, 169017375, 169017376, 169017377, 169017378, 169017379, 169017380, 169017381, 169017382...] 

Portions of these lists are the same, yet they are vastly different as the lengths are:

reservedArrayLength = 6658
ipChoicesArray = 65536

I have also tried converting these values to strings and doing the same style of comparison, also to no avail.

Once I am able to extract a list of the elements in the ipChoicesArray that are not in the reservedArray, I will return the smallest element after sorting.

I do not believe that I am facing a max length issue...

ilu
  • 11
  • 1

4 Answers4

0

The obvious solution (you just did the set difference in the wrong direction):

print(min(set(ipChoicesArray) - set(reservedArray)))

You said they're always sorted, and your reverse difference being empty (and thinking about what you're doing) suggests that the "choices" are a superset of the "reserved", so then this also works and could be faster:

print(next(c for c, r in zip(ipChoicesArray, reservedArray) if c != r))
no comment
  • 6,381
  • 4
  • 12
  • 30
0

Subtracting the sets should work as you desire, see below:

ipChoicesArray = [1,3,4,7,1]
reservedArray = [1,2,5,7,8,2,1]
min(list(set(ipChoicesArray) - set(reservedArray)))

###Output###
[3]

By the way, max list is a length of 536,870,912 elements

noahtf13
  • 333
  • 1
  • 7
0

without using a for-loop as the lists are quite large

The presumption that a for-loop is a poor choice because the list is large is likely incorrect. Creation of a set from a list and vice-versa will not only iterate through the containers under the hood anyway (just like a for-loop) in addition to allocating new containers and taking up more memory. Profile your code before you assume something won't perform well.

That aside, in your code it seems the reason you are getting an empty result is because your difference is inverted. To get the elements in ipChoicesArray but not in reservedArray you want to difference the latter from the former:

diff = set(ipChoicesArray) - set(reservedArray)
Luis
  • 1,210
  • 2
  • 11
  • 24
  • "under the hood" in C code, though, which quite possibly *is* faster. – no comment Aug 30 '21 at 17:27
  • @don'ttalkjustcode of course; I don't mean to suggest the looping is identical but we'll need to balance that against the extra memory allocation. Hence my "likely" qualifier and plea to profiling – Luis Aug 30 '21 at 17:36
  • Yeah ok, although in their case, I'm guessing their "ip choices" being called that and being exactly 2^16 large and having values (2579<<16)+0, (2579<<16)+1, (2579<<16)+2, etc, that's likely not just an example. And it's fairly small :-) – no comment Aug 30 '21 at 17:43
0

Disclaimer: Python docs states that

A set is an unordered collection with no duplicate elements.

But I can see that the output of an unordered set is an ordered set:

s = {'z', 1, 0, 'a'}
s #=> {0, 1, 'a', 'z'}
next(iter(s)) #=> 0 

So, I don't know if this approach is reliable. Maybe some other user can deny or confirmi this with an appropriate reference to the set behaviour.

Having said this...


Don't know if I'm getting the point, but..

Not knowing where the smallest value is, you could use this approach (here using smaller values and shorter list):

a = [2, 5, 5, 1, 6, 7, 8, 9]
b = [2, 3, 4, 5, 6, 6, 1]

Find the smallest of the union:

union = set_a | set_b
next(iter(union))
#=> 1

Or just:

min([next(iter(set_a)), next(iter(set_b))])
#=> 1

Or, maybe this fits better your question:
next(iter(set_a-set_b)) #=> 8
iGian
  • 11,023
  • 3
  • 21
  • 36