Why operation on list is slower than operation on numpy array

Question

I am doing a project on encrypting data using RSA algo and for that, I have taken a .wav file as an input and reading it by using wavfile and I can apply the key (3, 25777) but when I am applying the decryption key (16971,25777) it is giving wrong output like this:

The output I'm getting:

                    [[     0 -25777]
                    [     0 -25777]
                    [     0 -25777]
                    ...
                    [-25777 -25777]
                    [-15837 -15837]
                    [ -8621      1]]

output i want:

This was happening only with the decryption part of the array so I decided to convert the 2d array to a 2d list. After that, it is giving me the desired output but it is taking a lot of time to apply the keys to all the elements of the list(16min, in case of array it was 2sec). I don't understand why it is happening and if there is any other solution to this problem ?

here is the encryption and decryption part of the program:

    #encryption
    for i in range(0, tup[0]): #tup[0] is the no of rows
        for j in range(0, tup[1]): #tup[1] is the no of cols
            x = data[i][j] 
            x = ((pow(x,3)) % 25777) #applying the keys
            data[i][j] = x #storing back the updated value

    #decryption
    data= data.tolist() #2d array to list of lists 
    for i1 in (range(len(data)):
        for j1 in (range(len(data[i1]))):
            x1 = data[i1][j1] 
            x1 = (pow(x1, 16971)%25777) #applying the keys
            data[i1][j1] = x1

Looking forward to suggestions. Thank you.

For the performance part you could have a read [here](https://stackoverflow.com/questions/993984/what-are-the-advantages-of-numpy-over-regular-python-lists) it's a discussion basically adressing your questions about performance. — Vulpex, Mar 31 '19 at 11:52
Thank you for providing the link, but can you suggest s better and more efficient way of achieving what i want ? — Sandipan, Mar 31 '19 at 12:02
Read the docs for [pow](https://docs.python.org/3/library/functions.html#pow) — President James K. Polk, Mar 31 '19 at 20:51
and as I am using the `pow()`'s 3rd argument then all the elements must be `int` and `data[i][j]` is not an `int` type data, it is a `numpy.int64/32` type data so your link is no use in case of this question. — Sandipan, Apr 02 '19 at 08:25

score 4 · Accepted Answer · answered Mar 31 '19 at 12:12

4

The occurrence of something like pow(x1, 16971) should give you pause. This will for almost any integer x1 yield a result which a 64 bit int cannot hold. Which is the reason numpy gives the wrong result, because numpy uses 64 bit or 32 bit integers on the most common platforms. It is also the reason why plain python is slow, because while it can handle large integers this is costly.

A way around this is to apply the modulus in between multiplications, that way numbers remain small and can be readily handled by 64 bit arithmetic.

Here is a simple implementation:

def powmod(b, e, m):
    b2 = b                         
    res = 1                        
    while e:                       
        if e & 1:                  
            res = (res * b2) % m   
        b2 = (b2*b2) % m        
        e >>= 1                 
    return res

For example:

>>> powmod(2000, 16971, 25777)
10087
>>> (2000**16971)%25777
10087
>>> timeit(lambda: powmod(2000, 16971, 25777), number=100)
0.00031936285085976124
>>> timeit(lambda: (2000**16971)%25777, number=100)
0.255017823074013

answered Mar 31 '19 at 12:12

Paul Panzer

51,835
3
54
99

wow that really helps but can you explain the powmod() further, I did not understand what its actually doing.... thank you. – Sandipan Mar 31 '19 at 12:27
2

It is computing `b^e mod m` (I'll omit the `mod m` from here but it is understood to be there) using the formula `b^e = prod_k b^(e_k 2^k)` where `sum_k e_k 2^k` is the binary representation of e (`e_k` is one if the kth bit in e is set and zero otherwise). When `e` is large this saves a lot of multiplications because `b^(2^(k+1))` is just `[b^(2^k)]^2`, so we get `log` complexity in `e`. Implementation: we read out the bits of e by right-shifting one-by-one and then &-ing with 1. – Paul Panzer Mar 31 '19 at 12:38
what is `prod_k` ,`e_k` and `sum_k` ? – Sandipan Mar 31 '19 at 12:55
1

`prod_k b^(e_k 2^k)` is short for `b^(e_0 2^0) x b^(e_1 2^1) x b^(e_2 2^2) x etc.`. sum is the same for `+` instead of `x`. – Paul Panzer Mar 31 '19 at 13:03
1

Python's built in `pow()` function already does this automatically when you give it three integer arguments. There's no need to write your own code. – President James K. Polk Mar 31 '19 at 20:48
@JamesKPolk Neat, didn't know that. What my code does and `pow` doesn't: it works with array arguments (except for the middle argument, but that can be fixed if need be). `powmod(np.arange(25777), 16971, 25777)` is quite a bit faster than `[pow(x, 16971, 25777) for x in range(25777)]`. – Paul Panzer Apr 01 '19 at 02:18

Why operation on list is slower than operation on numpy array

1 Answers1

Linked