Vectorized assignment of a 2-dimensional array

Question

I work with Python 2.7, numpy and pandas.

I have :

a function y=f(x) where both x and y are scalars.
a one-dimensional array of scalars of length n : [x0, x1, ..., x(n-1)]

I need to construct a 2-dimensional array D[i,j]=f(xi)*f(xj) where i,j are indices in [0,...,n-1].

I could use loops and/or a comprehension list, but that would be slow. I would like to use a vectorized approach instead.

I thought that "numpy.indices" would help me (see Create a numpy matrix with elements a function of indices), but I admit I am at a loss on how to use that command for my purpose.

Thanks in advance!

For example, do `i` and `j` range over all indices, or are there specific ones you want? — DSM, Jun 16 '14 at 15:15
This sounds like basic broadcasting to me. `x[None,:] * y[:,None]`. But we need to know more about `f` to give a good answer. — senderle, Jun 16 '14 at 15:15
i and j range over every value in [0,1,...,n-1] so D is a square matrix with n rows and n columns. f() is a real scalar function. We can use exp(.) as an example. My f() is different, but I don't think it matters — Charles, Jun 16 '14 at 15:20
Yeah, e.g. if `def f(x): return np.exp(x)` is your function, then `y = f(x); D = y * y[:,None]` should work, but if it were `math.exp(x)` then it wouldn't. — DSM, Jun 16 '14 at 15:25
Ah yes I see. Let me reformulate the problem so that it is more prone to be vectorizable (perhaps). f is in fact a on-dimensional array of length n, so my problem is really to construct a square matrix with elements D[i,j]=f[i]*f[j], where i and j range over [0,...,n-1] - sorry if I was not clear before — Charles, Jun 16 '14 at 15:32
@Charles That makes a significant difference, see my edited answer. — Patrick Collins, Jun 16 '14 at 15:55
senderle's suggestion does the trick : D=f[None,:]*f[None,:] works. thanks again — Charles, Jun 16 '14 at 15:57

score 1 · Answer 1 · answered Jun 17 '14 at 14:32

1

Ignore the comments that dismiss vectorization; it's a good habit to have, and it does deliver performance with the right accelerators. Anyway, what I really wanted to say was that you want to find the outer product:

x_ = numpy.array(x)
y = f(x_)
numpy.outer(y, y)

If you're working with numbers you should be working with numpy data structures anyway. Then you get fast, readable code like this.

answered Jun 17 '14 at 14:32

Emre

5,976
7
29
42

You're welcome. If you found the answers helpful you're encouraged to vote them up. – Emre Jun 17 '14 at 18:30
I Will but i dont have enough réputation points yet ! Still new to all This.. thanks all the same. – Charles Jun 17 '14 at 19:53

Patrick Collins · Accepted Answer · 2014-06-16T15:54:10.100

0

I would like to use a vectorized approach instead.

You sound like you might be a Matlab user -- you should be aware that numpy's vectorize function provides no performance benefit:

The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.

Unless it just so happens that there's already an operation in numpy that does exactly what you want, you're going to be stuck with numpy.vectorize and nothing to really gain over a for loop. With that being said, you should be able to do that like so:

def makeArray():
    a = [1, 2, 3, 4]
    def addTo(arr):
        return f(a[math.floor(arr/4)]) * f(a[arr % 4])
    vecAdd = numpy.vectorize(addTo)
    return vecAdd(numpy.arange(4 * 4).reshape(4, 4))

EDIT:

If f is actually a one-dimensional array, you can do this:

f_matrix = numpy.matrix(f)
D = f_matrix.T * f_matrix

edited Jun 16 '14 at 15:54

answered Jun 16 '14 at 15:32

Patrick Collins

10,306
5
30
69

Thank you all for your suggestions. Very useful. – Charles Jun 16 '14 at 15:42
1

Saying "numpy's vectorized operations provide no performance benefit" is very misleading. Using the `vectorize` **function** doesn't provide any. But in numpy jargon, we say than an operation is vectorized if it doesn't use Python loops, e.g. `x**0.5+np.sin(x)`. – DSM Jun 16 '14 at 15:45
@DSM You're right, I edited my response to make that clearer. My point was more to suggest that, for an arbitrary function `f`, it's unlikely that you can get what you want from `numpy`. Tell me if you think my answer is still unclear. – Patrick Collins Jun 16 '14 at 15:48
Almost any arbitrary function can be rewritten to take advantage of vectorization in numpy: if you can write it in Python, the you can very likely write it in numpy. – Jaime Jun 16 '14 at 15:57
@Jaime What about something like `def f(): return len(requests.get('http://www.google.com').text)`? Or `def f(): print "Hello World"; return 5`? Or `def f(): return hashlib.md5(open('hello.txt').read())`? The majority of Python functions are not operations in linear algebra, and so the majority do not have an easy `numpy` implementation. – Patrick Collins Jun 16 '14 at 16:02

score 0 · Answer 3 · answered Jun 17 '14 at 13:37

You can use fromfunc to vectorize the function then use the dot product to multiply:

f2 = numpy.fromfunc(f, 1, 1)  # vectorize the function
res1 = f2(x)  # get the results for f(x)
res1 = res1[np.newaxis]  # result has to be 2D for the next step
res2 = np.dot(a.T, a)  # get f(xi)*f(xj)

Vectorized assignment of a 2-dimensional array

3 Answers3