1

I would like to extract a sub-dictionary from an existing one consisting of a series of keys whose values are all numpy arrays of shape (N, 3) or (N). The sub-dictionary should contain the same keys whose values will be arrays of shape (m,3) or (m), m<N, where the m values are obtained by applying some criterion to one of the key:values pairs.

Here a very simple Python code to exemplify my question.

import sys
import numpy as np

t1 = np.arange(0,30,0.5)
t1n = t1.reshape(20,3)

t2 = np.random.rand(20,1)

dtest = dict(p1=t1n, p2=t2)

ind_ok = np.where(dtest['p2'][dtest['p2'] > 0.7])
sub_dtest = dict(p1=dtest['p1'][ind_ok,:], p2=dtest['p2'][ind_ok,:])

This is a simple dictionary having only two pairs, and I extract the rows from the second pairs veryfying some criterion, e.g. that the values are larger than 0.7. Then I construct the desired sub-dictionary (sub_dtest) by directly indexing each key:values pair with the extracted indexes. When I have many pairs in my dictionary I have to repeat the value assignment for each of them: is there a more elegant way in Python to accomplish this task?

The code above is a simplified version of what I did.

vin60
  • 11
  • 3

1 Answers1

0

Use a dictionary comprehension (dictcomp for short) based on either the fixed set of keys you care about:

sub_dtest = {k: dtest[k][ind_ok,:] for k in ('p1', 'p2')}

or using all key/value pairs in the original dict:

sub_dtest = {k: arr[ind_ok,:] for k, arr in dtest.items()}

The first option requires listing out all the keys you care about, the second is exhaustive, though you could add a condition to exclude some keys, e.g. if you wanted all keys but 'pbad' and 'pworse', you could change it to:

sub_dtest = {k: arr[ind_ok,:] for k, arr in dtest.items() if k not in {'pbad', 'pworse'}}

expanding the set of excluded keys as needed (or reducing to just if k != 'pbad' if you only want to exclude a single key).

ShadowRanger
  • 143,180
  • 12
  • 188
  • 271