Use Boolean vectors to select values from numpy arrays

Question

I have several .txt files containing reaction times (header: 'RT') and a flag for correct vs. incorrect responses (header: 'error', zeros for correct, ones for incorrect). This is a slight variation on an example from the book 'Python for Experimental Psychologists'.

Now I want to use Boolean vectors to select values from numpy arrays (e.g. only the reaction times for correct responses). Running the Python script leads to the following error:

select['correct'] = data['error'] == 0

KeyError: 'error'

Here is the code I'm currently working on:

import numpy as np
import glob
import os

# read in file paths
DIR = os.path.dirname(os.path.abspath(__file__))
DATA_DIR = os.path.join(DIR, 'Pilotdata')

# define total number of participants
N = 27
counter = 0

# create empty arrays to store data
rt = np.zeros((2, 2, N))

data_file = glob.glob(os.path.join(DATA_DIR, 's[0-9][0-9]_main_data.txt'))


# read in data
for pnr in range(0, N):
    counter += 1
    RAW = np.loadtxt(data_file[counter], dtype=str, unpack=True)

    data = {}

    for i in range(len(RAW)):
        VARNAME = RAW[i][0]

        try:
            VALUES = RAW[i][1:].astype(float)

        except:
            VALUES = RAW[i][1:]

        data[VARNAME] = VALUES

    select = {}
    select['correct'] = data['error'] == 0
    select['incorrect'] = data['error'] == 1

It seems to be a problem with the dictionary I created to store the values. Here is an excerpt of the output:

"b'error'": array([
"b'0'", "b'0'", "b'0'", "b'0'", "b'0'", "b'0'", "b'0'", "b'1'",
"b'0'", "b'1'", "b'0'", "b'0'", "b'0'", "b'0'", "b'0'", "b'0'",
"b'0'", "b'0'", "b'1'", "b'0'", "b'1'", "b'1'", "b'0'", "b'1'", ...

Thanks in advance!

EDIT: Changing the Python interpreter from 3 to 2 did the trick. Is there a way to get the code working in Python3?

EDIT2: Using np.genfromtxt instead of np.loadtxt solved the problem.

score 0 · Answer 1 · answered Nov 01 '17 at 10:23

I think the problem is that the array is read as byte strings (see documentation), so your dictionary does not contain the string "error" as a key, but rather the byte literal b"error". You can decode a byte string via b"error".decode("utf-8"), though:

for i in range(len(RAW)):
    VARNAME = RAW[i][0].decode("utf-8")

    try:
        # an array has no .decode method, so decode element-wise
        VALUES = np.char.decode(RAW[i][1:], "utf-8").astype(float)

    except ValueError:
        VALUES = RAW[i][1:]
        # presumably non-numeric columns end up here
        # and are kept as they are

This could/should do the trick. Apparently numpy.loadtxt reads the file as bytes, which is the default string type in Python 2 (do you use Python 2? If yes, do move to Python 3, it is much cooler! :D ). For more info, look at the great explanation here, where people also suggest that you can save yourself the trouble by converting right when reading the file, with RAW = np.loadtxt(data_file[counter], dtype=str, unpack=True).astype(str).
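
To make the bytes-versus-str distinction concrete, here is a minimal sketch. The array contents are made up for illustration and are not taken from the data files. It also shows that calling str() on a bytes object does not decode it but returns the text "b'error'", which looks like the keys in the output excerpt in the question.

```python
import numpy as np

# Hypothetical column as it might look when read as bytes: header + values.
raw_column = np.array([b"error", b"0", b"1", b"0"])

# In Python 3 a bytes object never compares equal to a str.
assert b"error" != "error"

# Decoding a single bytes element yields an ordinary str.
varname = raw_column[0].decode("utf-8")

# np.char.decode decodes a whole array of bytes element-wise.
values = np.char.decode(raw_column[1:], "utf-8").astype(float)

# str() on bytes does not decode; it returns the repr-like text.
mangled = str(b"error")

print(varname, values, mangled)
```

If the array already holds str elements, as the follow-up comment suggests was the case here, there is nothing to decode and .decode is not available.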

Thanks for your suggestion. My problem is indeed rooted in the different Python version. Sadly, your changes resulted in a different error (AttributeError: 'numpy.str_' object has no attribute 'decode'). Changing the interpreter from Python 3 to Python 2 did the trick (as mentioned, I used a textbook example). - STD, Nov 01 '17 at 10:40

score 0 · Accepted Answer · answered Nov 01 '17 at 11:04

Using np.genfromtxt instead of np.loadtxt solved the problem.
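
The answer does not show the exact call, so the following is only a sketch of one way np.genfromtxt could be used here. It assumes a whitespace-delimited file whose first line holds the column names 'RT' and 'error'; the in-memory sample data are invented for illustration.

```python
import io
import numpy as np

# Invented sample standing in for one participant file (assumption).
sample = io.StringIO(
    "RT error\n"
    "512 0\n"
    "430 0\n"
    "688 1\n"
    "501 0\n"
)

# names=True takes the field names from the first line, so the columns
# can be addressed as data["RT"] and data["error"].
data = np.genfromtxt(sample, names=True)

# Boolean vectors marking correct and incorrect trials.
select = {
    "correct": data["error"] == 0,
    "incorrect": data["error"] == 1,
}

# Indexing with a Boolean vector keeps only the entries where it is True.
rt_correct = data["RT"][select["correct"]]
rt_incorrect = data["RT"][select["incorrect"]]
print(rt_correct.mean(), rt_incorrect.mean())
```

With real files, the StringIO object would be replaced by a file path, for example one entry of the data_file list from the question.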

answered Nov 01 '17 at 11:04

STD
