I have a large NumPy array A of shape (M, 3); the elements of each row are unique, non-negative integers ranging from 0 to N - 1. Each row corresponds to a triangle in my finite element analysis.
For example, with M = 4 and N = 5, A might look like

array([[0, 1, 2],
       [0, 2, 3],
       [1, 2, 4],
       [3, 2, 4]])
Now I need to construct another array B of shape (M, N) such that

B[m, n] = 1 if n is in A[m], and 0 otherwise
The B corresponding to the example A above would be
1 1 1 0 0
1 0 1 1 0
0 1 1 0 1
0 0 1 1 1
A loop-based implementation would be

B = np.zeros((M, N), dtype=np.int8)
for m in range(M):
    for n in A[m]:
        B[m, n] = 1
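For reference, my current understanding is that the inner loop can be replaced by a single fancy-indexing assignment, broadcasting a column of row indices against A (a sketch using the small example above; variable names are mine):

```python
import numpy as np

M, N = 4, 5
A = np.array([[0, 1, 2],
              [0, 2, 3],
              [1, 2, 4],
              [3, 2, 4]])

B = np.zeros((M, N), dtype=np.int8)
# Row indices of shape (M, 1) broadcast against the (M, 3) column
# indices in A, so each (m, A[m, j]) entry is set to 1.
B[np.arange(M)[:, None], A] = 1
```

This avoids the Python-level loops, but it still materializes the full dense (M, N) array, which is exactly what I cannot afford at my problem sizes.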
But since M and N are both large (on the order of 10^6 each), how can I use good NumPy indexing techniques to accelerate this process? I also suspect that sparse matrix techniques are needed, since a dense M * N array of 1-byte entries would take about 10^12 bytes, i.e. roughly 1 TB.
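To frame the sparse side of the question: since B has exactly 3 nonzeros per row (3 * M in total), one approach I am considering is building it directly in COO format from the entries of A, never forming the dense array (a sketch assuming SciPy is available):

```python
import numpy as np
from scipy.sparse import coo_matrix

M, N = 4, 5
A = np.array([[0, 1, 2],
              [0, 2, 3],
              [1, 2, 4],
              [3, 2, 4]])

# Each of the 3*M entries of A contributes one nonzero of B.
rows = np.repeat(np.arange(M), 3)   # row index of each nonzero
cols = A.ravel()                    # column index of each nonzero
data = np.ones(3 * M, dtype=np.int8)

B_sparse = coo_matrix((data, (rows, cols)), shape=(M, N))
```

Memory use here is proportional to 3 * M rather than M * N, which seems feasible at my scale, but I would like to know whether this is the idiomatic way to do it.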
In general, vectorizing with NumPy's indexing and broadcasting feels like an ad hoc, error-prone activity that relies on quite a bit of street smarts (or art, if you prefer). Are there any programming-language efforts that can systematically convert loop-based code into a high-performance vectorized version?