2

I've been doing competitive programming (USACO) for a couple of months now, in which there are time constraints you cannot exceed. I need to create a large matrix, or 2d array, the dimensions being 2500x2500, in which each value is [0,0]. Using list comprehension is taking too much time, and I needed an alternative (you cannot import modules so numpy isn't an option). I had initially done this:

grid = [[[0,0] for i in range(2500)] for i in range(2500)] but it was taking far too much time, so I tried:

grid= [[[0,0]]*2500]*2500,

which gives the same result initially, but whenever I try to change a value, for example: grid[50][120][0]= 1, it changes the 0th index position of every [0,0] to False in the entire matrix instead of the specific coordinate at the [50][120] position, and this isn't the case when I use list comprehension. Does anybody know what's happening here? And any solution that doesn't involve a crazy run time? I started python just a couple of months before competitive programming so I'm not super experienced.

Arjun
  • 25
  • 4
  • 1
    For the differences between the two, please refer to [List of lists changes reflected across sublists unexpectedly](https://stackoverflow.com/questions/240178/list-of-lists-changes-reflected-across-sublists-unexpectedly). – Mechanic Pig Sep 08 '22 at 03:27
  • 2
    `[[[0,0] for i in range(2500)] for i in range(2500)]` is the correct way to do this, as you noticed, `[[0,0]]*2500]*2500` creates 2500 references **to the same inner lists**. Honestly, if `[[[0,0] for i in range(2500)] for i in range(2500)]` is too slow for you, then Python is the wrong language – juanpa.arrivillaga Sep 08 '22 at 03:28
  • The problem with your 2nd approach is that, every [i][j] component is just a linked ojbect to the original at [0][0]. The solution is that you must change the entire element at [i][j], instead of [i][j][0] and [i][j][1]. For example: `grid[i][j] = [0, grid[i][j][1]]` – cao-nv Sep 08 '22 at 04:08
  • As @juanpa.arrivillaga said, Python may not be suitable here.Even using `np.zeros((2500, 2500, 2), int).tolist()` can only double the speed. – Mechanic Pig Sep 08 '22 at 04:11

2 Answers2

2

grid = [[[0,0] for i in range(2500)] for i in range(2500)]

takes around 2.1 seconds on my PC, timing with PowerShell's Measure-Command. Now if the data specifications are strict, there is no magical way to make this faster. However, if the goal is to make this representation generate faster there is a better solution: use tuple instead of list for the inner data (0, 0).

grid = [[(0, 0) for i in range(2500)] for i in range(2500)]

This snippet generates the same informational value in under quarter of the time (0.48 s). Now what you have to consider here is what comes next. When updating these values in the grid, you need to always create a new tuple to replace the old one - which will always be slower than just updating the list value in the original sample code. This is because tuple doesn't support item assignment operation. Replacing a single value is still as easy as grid[50][120] = (1, grid[50][120][1]).

Fast generation - slow replacement, might be handy if there aren't tons of value changes. Hope this helps.

LTJ
  • 1,197
  • 3
  • 16
  • Initially posted the answer suggesting string representation "0,0", which is as fast as the tuple solution. Modified the example answer to use tuples for simplicity (no need to type cast anything if numerical values are requested). – LTJ Sep 08 '22 at 11:49
  • That or using the string representation, as @LTJ said, seems like the best course of action aside from switching languages, I'll give it a go. 2.1 seconds is about right; in the competition they send the results to a server to make things fair. Time lim is 2 seconds, so even without all the code following that bit, its too slow. – Arjun Sep 08 '22 at 23:38
0

The built-in memoryview object seems to be the only thing in the standard library that is similar to nd-array, and its tolist method can provide some acceleration:

In [_]: %timeit [[[0, 0] for i in range(2500)] for i in range(2500)]
477 ms ± 7.91 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [_]: %timeit memoryview(b'\x00' * (2500 * 2500 * 2)).cast('B', (2500, 2500, 2)).tolist()
375 ms ± 11.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

This is almost the same as NumPy's speed:

In [_]: %timeit np.zeros((2500, 2500, 2), int).tolist()
371 ms ± 11.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

If the list is not the only option, simply building a memoryview should be the best solution. Note that bytearray needs to be used instead to make it writable:

In [_]: %timeit memoryview(bytearray(b'\x00') * (2500 * 2500 * 2 * 4)).cast('i', (2500, 2500, 2))
9.71 ms ± 488 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> mem = memoryview(bytearray(b'\x00') * (2500 * 2500 * 2 * 4)).cast('i', (2500, 2500, 2))
>>> mem[0, 0, 0]
0
>>> mem[0, 0, 0] = 1
>>> mem[0, 0, 0]
1
Mechanic Pig
  • 6,756
  • 3
  • 10
  • 31