
I am trying to use the 'visual_92_categories' dataset of MNE-Python, but when I try to filter and extract the epochs, I get a memory error! My RAM is 7 GB. I am wondering if someone could help me. Is there any memory limitation in Python or Jupyter Notebook? Thanks

import os.path as op
import mne
from mne.io import read_raw_fif, concatenate_raws
from mne.datasets import visual_92_categories
from pandas import read_csv

data_path = visual_92_categories.data_path()
# Define stimulus - trigger mapping
fname = op.join(data_path, 'visual_stimuli.csv')
conds = read_csv(fname)
max_trigger = 92
conds = conds[:max_trigger]  
conditions = []
for c in conds.values:
    cond_tags = list(c[:2])
    cond_tags += [('not-' if i == 0 else '') + conds.columns[k]
                  for k, i in enumerate(c[2:], 2)]
    conditions.append('/'.join(map(str, cond_tags)))
print(conditions[24])
event_id = dict(zip(conditions, conds.trigger + 1))
n_runs = 4  # 4 for full data (use less to speed up computations)
fname = op.join(data_path, 'sample_subject_%i_tsss_mc.fif')
raws = [read_raw_fif(fname % block) for block in range(n_runs)]
raw = concatenate_raws(raws)    
events = mne.find_events(raw, min_duration=.002)    
events = events[events[:, 2] <= max_trigger]       
picks = mne.pick_types(raw.info, meg=True)
epochs = mne.Epochs(raw, events=events, event_id=event_id, baseline=None,
                    picks=picks, tmin=-.1, tmax=.500, preload=True)
y = epochs.events[:, 2]           
X1 = epochs.copy().get_data()
S.A.
Sima Hghz

1 Answer


Executing this code needs more than 7 GB of memory for me; even the X1 array alone is about 4 GB. Its dtype is float64, so if you can't get more memory, try casting it to float32 (memory consumption will be halved). That's an acceptable loss of precision in most cases.
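To illustrate the halving, here is a minimal sketch with a small synthetic array standing in for the epochs data (the shape below is made up; the real array would be `(n_epochs, n_channels, n_times)`):

```python
import numpy as np

# Synthetic stand-in for epochs data; shape is hypothetical.
X64 = np.zeros((100, 306, 151), dtype=np.float64)
X32 = X64.astype(np.float32)

print(X64.nbytes)                 # 8 bytes per sample
print(X32.nbytes)                 # 4 bytes per sample
print(X64.nbytes / X32.nbytes)   # exactly 2.0
```

The cast allocates a new array, so do it per block (as below) rather than after concatenating everything in float64.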

Also, you could probably process the data block by block, save each block to disk as a numpy array, and when that's finished, load and concatenate the arrays:

# leaving the initial part intact
import pickle  # needed to save the data

for block in range(n_runs):
    raw = mne.io.read_raw_fif(fname % block)
    events = mne.find_events(raw, min_duration=.002)
    events = events[events[:, 2] <= max_trigger]
    picks = mne.pick_types(raw.info, meg=True)
    try:
        epochs = mne.Epochs(raw, events=events, event_id=event_id,
                            baseline=None, picks=picks,
                            tmin=-.1, tmax=.500, preload=True)
    except ValueError:  # some blocks contain no matching events; skip them
        continue
    y = epochs.events[:, 2].astype('float32')
    X1 = epochs.copy().get_data().astype('float32')
    pickle.dump(y, open('y_block_{}.pkl'.format(block), 'wb'))  # use convenient names
    pickle.dump(X1, open('x_block_{}.pkl'.format(block), 'wb'))

# remove unnecessary objects from memory
del y
del X1
del raw
del epochs

import numpy as np  # needed for concatenate (if not already imported)

X1 = None  # stores the concatenated x arrays
y = None   # stores the concatenated y arrays
for block in range(n_runs):
    try:
        if X1 is None:
            X1 = pickle.load(open('x_block_{}.pkl'.format(block), 'rb'))
            y = pickle.load(open('y_block_{}.pkl'.format(block), 'rb'))
        else:
            X1 = np.concatenate((X1, pickle.load(open('x_block_{}.pkl'.format(block), 'rb'))))
            y = np.concatenate((y, pickle.load(open('y_block_{}.pkl'.format(block), 'rb'))))
    except FileNotFoundError:  # no such block saved in the previous stage
        pass

So, this code works for me with no memory exhaustion (i.e. < 7 GB), but I'm not absolutely sure that mne processes all the blocks independently and that the code is equivalent. At least this code produces an array missing only ~0.5% of the samples. Somebody more experienced with mne than me could probably fix that.
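As a further memory-saving variant (not part of the code above, just a sketch), `np.save`/`np.load` can replace pickle for plain arrays, and `mmap_mode='r'` lets you read a saved block lazily from disk instead of loading it all into RAM:

```python
import os
import tempfile
import numpy as np

# Hypothetical demo with a made-up block shape.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'x_block_0.npy')

X1 = np.random.rand(10, 306, 151).astype('float32')
np.save(path, X1)

# Memory-mapped: slices are read from disk on demand, near-zero RAM upfront.
X_mm = np.load(path, mmap_mode='r')
assert X_mm.shape == X1.shape
assert np.allclose(X_mm[:2], X1[:2])
```

Whether this is practical depends on how the downstream code accesses the data; concatenating memory-mapped arrays with `np.concatenate` still materializes the result in RAM.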

Mikhail Stepanov