I understand how to create a masked array, and I would like to use masking in a record array so that I can access this data using named attributes. The masking seems to be "lost" when I create a record array from a masked array:

>>> data = np.ma.array(np.ma.zeros(30, dtype=[('date', '|O4'), ('price', '<f8')]),mask=[i<10 for i in range(30)])
>>> data
masked_array(data = [(--, --) (--, --) (--, --) (--, --) (--, --) (--, --) (--, --) (--, --)
(--, --) (--, --) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0)
(0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0) (0, 0.0)],
         mask = [(True, True) (True, True) (True, True) (True, True) (True, True)
(True, True) (True, True) (True, True) (True, True) (True, True)
(False, False) (False, False) (False, False) (False, False) (False, False)
(False, False) (False, False) (False, False) (False, False) (False, False)
(False, False) (False, False) (False, False) (False, False) (False, False)
(False, False) (False, False) (False, False) (False, False) (False, False)],
   fill_value = ('?', 1e+20),
        dtype = [('date', '|O4'), ('price', '<f8')])

>>> r = data.view(np.recarray)
>>> r
rec.array([(0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0),
           (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0),
           (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0),
           (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0),
           (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0)], 
           dtype=[('date', '|O4'), ('price', '<f8')])

When I access a record the data is not masked:

>>> r.date[0]
0

Unlike in the original array:

>>> data['date'][0]
masked_array(data = --,
             mask = True,
       fill_value = 1e+20)

What can I do? Does the record array not support masking? Browsing on the web I have seen some code examples that seem to suggest otherwise, but it wasn't very clear. Hoping I can get a good answer here.

gerrit
Nate Reed

1 Answer

I haven't found much documentation on numpy.ma.mrecords.MaskedRecords, except for a brief mention here. You can find examples of how to use it in the unit tests that ship with numpy (e.g. /usr/lib/python2.6/dist-packages/numpy/ma/tests/test_mrecords.py). Viewing the masked array as an mrecarray instead of an np.recarray keeps the mask:

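import numpy as np
import numpy.ma.mrecords as mrecords

# Same masked structured array as in the question: the first 10 records are masked.
data = np.ma.array(
    np.ma.zeros(30, dtype=[('date', '|O4'), ('price', '<f8')]),
    mask=[i < 10 for i in range(30)])

# View it as a masked record array (mrecarray) rather than np.recarray,
# so that attribute access keeps the mask.
r = data.view(mrecords.mrecarray)

print(r.date[0])
# --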
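To make the difference between the two views concrete, here is a minimal sketch that puts them side by side. The field name 'day', the integer dtype used in place of the object 'date' field, and the sample values are all made up for illustration; only the two view calls come from the question and the code above.

```python
import numpy as np
import numpy.ma.mrecords as mrecords

# Hypothetical sample data: 'day' is an integer stand-in for the 'date' field.
raw = np.zeros(4, dtype=[('day', '<i8'), ('price', '<f8')])
raw['day'] = [1, 2, 3, 4]
raw['price'] = [10.0, 11.0, 12.0, 13.0]

# Mask both fields of the first two records.
data = np.ma.array(
    raw,
    mask=[(True, True), (True, True), (False, False), (False, False)])

plain = data.view(np.recarray)          # the mask is not carried over
masked = data.view(mrecords.mrecarray)  # the mask is kept

print(type(plain.price))    # a plain ndarray, no mask attached
print(masked.price)         # first two entries are shown as masked
print(masked.price.mean())  # the mean skips the masked entries
```

Because `masked.price` is an ordinary masked array, reductions such as `mean()` ignore the masked records, which is usually the reason for wanting the mask to survive in the first place.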
unutbu
  • Thanks, I didn't know about numpy.ma.mrecords. – Nate Reed Aug 27 '11 at 22:27
  • I don't suppose there is a way to mask individual fields? My use case is to add a new derived field called "100-day-high" (or something similar) so I would want to mask this field for the first 100 records. – Nate Reed Aug 28 '11 at 15:21
  • @Nate Reed: I think it is possible. Have you tried something like `data.mask['price'][data['date']<...]=True` ? – unutbu Aug 28 '11 at 15:29
  • Yes, either of these works: data.mask[0]['price'] = True, or data.mask['price'][0] = True. Thanks! – Nate Reed Sep 01 '11 at 02:22
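
The per-field masking discussed in these comments can be sketched as follows. This is only an illustration under stated assumptions: the 'high' field, the WINDOW size of 3 (standing in for the 100 records mentioned above), the sample prices, and the use of a running maximum as the derived value are all hypothetical. The mask assignment itself follows the `data.mask['price'][...] = True` form from the comments.

```python
import numpy as np
import numpy.ma.mrecords as mrecords

WINDOW = 3  # hypothetical stand-in for the 100-record window

# Hypothetical data: 'high' is a derived field, filled here with a running maximum.
raw = np.zeros(6, dtype=[('price', '<f8'), ('high', '<f8')])
raw['price'] = [5.0, 7.0, 6.0, 9.0, 8.0, 4.0]
raw['high'] = np.maximum.accumulate(raw['price'])

# Start from an explicit per-field mask in which nothing is masked.
data = np.ma.array(raw, mask=[(False, False)] * len(raw))

# Mask only the 'high' field for the first WINDOW records.
data.mask['high'][:WINDOW] = True

r = data.view(mrecords.mrecarray)
print(r.price.mask)  # 'price' stays fully unmasked
print(r.high.mask)   # only the first WINDOW entries of 'high' are masked
```

Each field has its own boolean entry in the mask, so masking 'high' leaves 'price' untouched for the same records.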