Replacing integers with tuples in a numpy array?

Question

I have a numpy array:

a = [[0 1 2 3 4]
     [0 1 2 3 4]
     [0 1 2 3 4]]

I have a dictionary with values I want to substitute/map:

d = { 0 : ( 0, 1 ),
      1 : ( 100, 101 ),
      2 : ( 200, 201 ),
      3 : ( 300, 301 ),
      4 : ( 400, 401 )}

So that I end up with:

a = [[(0, 1) (100, 101) (200, 201) (300, 301) (400, 401)]
     [(0, 1) (100, 101) (200, 201) (300, 301) (400, 401)]
     [(0, 1) (100, 101) (200, 201) (300, 301) (400, 401)]]

According to this SO answer, one way to do a value map based on a dictionary is:

b = np.copy( a )
for k, v in d.items(): b[ a == k ] = v

This works when the key and value are of the same data type. But in my case, the key is an int while the new value is a tuple (of ints). Accordingly, I get a cannot assign 2 input values error.

Instead of b = np.copy( a ), I have tried:

b = a.astype( ( int, 2 ) )

However, I get the reasonable error of ValueError: could not broadcast input array from shape (3,5) into shape (3,5,2).

So, how can I go about mapping ints to tuples in a numpy array?

Do you really **need** numpy arrays for this? Before you start using record-arrays, structured-arrays or object-arrays you should at least consider using a plain python list instead. :) — MSeifert, Apr 04 '17 at 20:18
Yes, I'm using pygame and I need to blit a numpy array. Specifically one with ( r, g, b ) values. **Speed is critical** so I want to do all the mapping using numpy. I have been doing the mapping using standard python, but it is not fast enough. — Jet Blue, Apr 04 '17 at 20:37
Could you maybe add more context then? Is it only `0` and `1` you want to replace or also `2`, `3` and `4`? What should your final array look like? But keep in mind that actually inserting `tuple`s in a numpy array will create an object array that isn't faster (could also be slower) than a python list. — MSeifert, Apr 04 '17 at 20:41
Updated question accordingly. The numbers in the original array correspond to ( r, g, b ) tuples. In the end I want a numpy array of rgb tuples given a numpy array of int codes. — Jet Blue, Apr 04 '17 at 20:46
How about a (3,3,2) shaped array? `np.array(list(adict.values()))[a]` — hpaulj, Apr 04 '17 at 20:48
@hpaulj, I don't understand your suggestion. Can you elaborate? — Jet Blue, Apr 04 '17 at 20:52
@hpaulj Does that really work? I think this could be problematic because dictionaries can be unordered (except for python 3.6). :) — MSeifert, Apr 04 '17 at 20:53
Yes, it does depend on creating an array from the dictionary with all values in order. I suggested the simplest approach, which might not apply in all cases. @Panzer has more robust ways of creating the `out` array. — hpaulj, Apr 04 '17 at 21:09
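
The suggestion in the comments above (index a value table with the code array) could be sketched roughly as follows. This is only an illustration: it assumes the dictionary keys are exactly 0..n-1, and it sorts the keys explicitly so the result does not depend on dictionary ordering. The names `table` and `result` are chosen here for the example.

```python
import numpy as np

a = np.tile(np.arange(5), (3, 1))
d = {0: (0, 1), 1: (100, 101), 2: (200, 201), 3: (300, 301), 4: (400, 401)}

# Assumption: keys are exactly 0..n-1, so row i of the table holds d[i].
# Sorting the keys avoids relying on the dictionary's iteration order.
table = np.array([d[k] for k in sorted(d)])  # shape (5, 2)

# Indexing with the integer array replaces each code by its row of the table.
result = table[a]                            # shape (3, 5, 2)
print(result.shape)
```

The result is a plain integer array with one extra trailing axis rather than an array of Python tuples.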

Paul Panzer · Answer 1 · 2017-04-04T21:47:52.720

1

How about this?

import numpy as np

data = np.tile(np.arange(5), (3, 1))

lookup = { 0 : ( 0, 1 ),
           1 : ( 100, 101 ),
           2 : ( 200, 201 ),
           3 : ( 300, 301 ),
           4 : ( 400, 401 )}

# get keys and values, make sure they are ordered the same
keys, values = zip(*lookup.items())

# making use of the fact that the keys are non negative ints
# create a numpy friendly lookup table
out = np.empty((max(keys) + 1,), object)
out[list(keys)] = values

# now out can be used to look up the tuples using only numpy indexing
result = out[data]
print(result)

prints:

[[(0, 1) (100, 101) (200, 201) (300, 301) (400, 401)]
 [(0, 1) (100, 101) (200, 201) (300, 301) (400, 401)]
 [(0, 1) (100, 101) (200, 201) (300, 301) (400, 401)]]

Alternatively, it may be worth considering using an integer array:

out = np.empty((max(keys) + 1, 2), int)
out[list(keys), :] = values

result = out[data, :]
print(result)

prints:

[[[  0   1]
  [100 101]
  [200 201]
  [300 301]
  [400 401]]

 [[  0   1]
  [100 101]
  [200 201]
  [300 301]
  [400 401]]

 [[  0   1]
  [100 101]
  [200 201]
  [300 301]
  [400 401]]]
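
For the RGB use case mentioned in the question's comments, the same integer lookup table could hold three channels. The sketch below is a hedged illustration: the `palette` values are made up, and `np.uint8` is an assumption about the colour depth. Whether the resulting layout matches what pygame expects should be checked against the pygame documentation.

```python
import numpy as np

# Hypothetical palette: colour code -> (r, g, b). The values are made up.
palette = {0: (0, 0, 0),
           1: (255, 0, 0),
           2: (0, 255, 0),
           3: (0, 0, 255),
           4: (255, 255, 255)}

codes = np.tile(np.arange(5), (3, 1))

# One row per code; this loop runs over the palette entries, not the pixels.
lut = np.zeros((max(palette) + 1, 3), dtype=np.uint8)
for code, rgb in palette.items():
    lut[code] = rgb

# A single indexing operation maps every code to its colour.
pixels = lut[codes]  # shape (3, 5, 3), dtype uint8
print(pixels.shape, pixels.dtype)
```

If the palette does not change, `lut` only needs to be built once and can be reused for every frame.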

edited Apr 04 '17 at 21:47

answered Apr 04 '17 at 20:55

Paul Panzer


I'm trying to use numpy built-ins to do the mapping. – Jet Blue Apr 04 '17 at 21:04
@JetBlue does the lookup table change? Because if not you can reuse `out` and I doubt it gets much faster than `result = out[data]`. – Paul Panzer Apr 04 '17 at 21:09
It definitely depends on the size of the array, the number of keys in the dictionary, and the range of those keys which approach will be fastest. :) Also, the result array might not be contiguous anymore with certain kinds of indexing, so subsequent operations might be much slower. – MSeifert Apr 04 '17 at 21:11
@Jet Blue, what built-ins are you talking about? Paul is using built-in indexing, `out[a]`. – hpaulj Apr 04 '17 at 21:13
@MSeifert It is true that too large a spread of keys may make this approach infeasible. Re contiguity: since the result is newly created, I can't see why it shouldn't be contiguous. – Paul Panzer Apr 04 '17 at 21:40
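
When the keys are widely spread, a dense table of size `max(keys) + 1` may waste a lot of memory. One possible workaround, not taken from the answer above, is to translate the codes into positions with `np.searchsorted` first. The sketch assumes every value in `data` is a key of `lookup`; the keys and values shown are hypothetical.

```python
import numpy as np

# Hypothetical lookup with a large spread of keys.
lookup = {3: (0, 1), 1000: (100, 101), 50000: (200, 201)}
data = np.array([[3, 1000, 50000],
                 [50000, 3, 3]])

key_list = sorted(lookup)
sorted_keys = np.array(key_list)                   # shape (3,)
values = np.array([lookup[k] for k in key_list])   # shape (3, 2)

# searchsorted returns, for each code, its position in sorted_keys.
# Assumption: every code in data really is a key; otherwise the position
# would silently point at a different key or past the end of the table.
positions = np.searchsorted(sorted_keys, data)
result = values[positions]                         # shape (2, 3, 2)
print(result.shape)
```

The table then has one row per key instead of one row per possible integer, at the cost of a binary search per element.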

score 1 · Accepted Answer · answered Apr 04 '17 at 21:03


You could use a structured array (that's like using tuples, but you don't lose the speed advantage):

>>> rgb_dtype = np.dtype([('r', np.int64), ('g', np.int64)])
>>> arr = np.zeros(a.shape, dtype=rgb_dtype)
>>> for k, v in d.items():
...     arr[a==k] = v
>>> arr
array([[(  0,   1), (100, 101), (200, 201), (300, 301), (400, 401)],
       [(  0,   1), (100, 101), (200, 201), (300, 301), (400, 401)],
       [(  0,   1), (100, 101), (200, 201), (300, 301), (400, 401)]], 
      dtype=[('r', '<i8'), ('g', '<i8')])

The for-loop could probably be replaced with some faster operation. However, if `a` contains very few distinct values compared to its total size, this should be fast enough.
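
One way the per-value masking might be avoided is to combine the structured dtype with a lookup table, so that the loop only touches the dictionary entries and a single indexing step maps the whole array. This is a sketch under the assumption that the keys are small non-negative integers; `table` is a name introduced here for illustration.

```python
import numpy as np

a = np.tile(np.arange(5), (3, 1))
d = {0: (0, 1), 1: (100, 101), 2: (200, 201), 3: (300, 301), 4: (400, 401)}

rgb_dtype = np.dtype([('r', np.int64), ('g', np.int64)])

# Structured lookup table with one record per key.
# Assumption: keys are non-negative ints with a modest maximum.
table = np.zeros(max(d) + 1, dtype=rgb_dtype)
for k, v in d.items():
    table[k] = v  # assigns the tuple to the record's fields

# One indexing operation replaces the boolean-mask loop over the array.
arr = table[a]
print(arr['r'])
```

The fields remain accessible by name, for example `arr['g']` gives the second component for every element.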

answered Apr 04 '17 at 21:03

MSeifert



The application I'm using (pygame) is fixed on the array type it accepts. I don't have the flexibility to change this. – Jet Blue Apr 04 '17 at 21:12

I don't understand. If you look at the output this is exactly what you wanted. (except for the dtype but there is no dtype that allows tuples except `object`). There is nothing in the question about pygame. Please update your question with the **exact** limitations. – MSeifert Apr 04 '17 at 21:17
