0

My code takes two separate lists A and B and shifts one "past" the other (Picture a train passing by a parked train), taking the square of the dot product of the sequences at each shift.

I don't really know many tricks in Python, so perhaps my solution is pretty clumsy.

for shift in range (1,len(B)):
   total += np.dot( A[-shift:] , B[:shift] )**2 + np.dot( A[:shift] , B[-shift:] )**2
total += (np.dot(A,B))**2

It functions as is, but I'm now working with such massive sets of data that speed is becoming a serious issue.

Patrick Artner
  • 50,409
  • 9
  • 43
  • 69
schwooples
  • 23
  • 5

1 Answers1

2

When creating slices in python (eg A[-shift:]) you are creating a copy of the array. For example, you can see this by doing:

A=[1,2,3]
B=A[:1]
B[0]=17
A // => A is still [1,2,3] not [17,2,3]

However you can use numpy arrays to avoid this copying:

A=numpy.array([1,2,3])
B=A[:1]
B[0]=17
A /// => A is numpy.array([17,  2,  3])

So if you use numpy arrays you'll have a lot less data copying and I suspect your code will be more efficient. But as always; benchmark this to confirm it.

See https://stackoverflow.com/a/5131563/922613 or https://scipy-cookbook.readthedocs.io/items/ViewsVsCopies.html for more info

I benchmarked it with the following script:

import numpy as np

def normal_arrays():
    A=[1,2,3,4]
    B=[1,2,3,4]
    total = 0
    for shift in range (1,len(B)):
        total += np.dot( A[-shift:] , B[:shift] )**2 + np.dot( A[:shift] , B[-shift:] )**2
        total += (np.dot(A,B))**2


def numpy_arrays():
    A=np.array([1,2,3,4])
    B=np.array([1,2,3,4])
    total = 0
    for shift in range (1,len(B)):
        total += np.dot( A[-shift:] , B[:shift] )**2 + np.dot( A[:shift] , B[-shift:] )**2
        total += (np.dot(A,B))**2


if __name__ == "__main__":
    import timeit
    print('normal arrays', timeit.timeit(normal_arrays))
    print('numpy arrays', timeit.timeit(numpy_arrays))

My results were a ~50% runtime improvement:

normal arrays 21.76756980700884
numpy arrays 11.998605689994292
George
  • 4,147
  • 24
  • 33