
I'm currently playing around with some neural networks in TensorFlow - I decided to try working with the CIFAR-10 dataset. I downloaded the "CIFAR-10 python" dataset from the website: https://www.cs.toronto.edu/~kriz/cifar.html.

In Python, I also tried directly copying the code that is provided to load the data:

    def unpickle(file):
        import pickle
        with open(file, 'rb') as fo:
            dict = pickle.load(fo, encoding='bytes')
        return dict

However, when I run this, I end up with the following error: _pickle.UnpicklingError: invalid load key, '\x1f'. I've also tried opening the file using the gzip module (with gzip.open(file, 'rb') as fo:), but this didn't work either.
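For what it's worth, '\x1f' is the first byte of the gzip magic number (0x1f 0x8b), so checking the first two bytes of the file shows whether the thing being unpickled is still the compressed archive. A small sketch (the helper name and file names are my own, not from the dataset page):

```python
def looks_gzipped(path):
    """Return True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"

# e.g. looks_gzipped("cifar-10-python.tar.gz") would be True, while an
# extracted batch file such as "data_batch_1" should return False.
```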

Is the dataset simply bad, or is this an issue with my code? If the dataset is bad, where can I obtain a proper copy of CIFAR-10?

MLavrentyev
  • Try removing the `encoding='bytes'`? – cs95 Jul 15 '17 at 18:54
  • I tried that, and the same error persisted. – MLavrentyev Jul 15 '17 at 18:56
  • Okay... do you have keras? – cs95 Jul 15 '17 at 19:01
  • I installed tensorflow through pip, so `pip install tensorflow`. Not sure if that'd also install keras, but I'm assuming no. – MLavrentyev Jul 15 '17 at 19:02
  • This may help then: from keras.datasets import cifar10 – cs95 Jul 15 '17 at 19:04
  • I'll take a look at that. It just piques me why the "official" dataset isn't working, with the code and data that's provided on the website – MLavrentyev Jul 15 '17 at 19:06
  • I'm surprised too. It should work fine. That code is probably dated. There's something more that needs to be done that I don't know. – cs95 Jul 15 '17 at 19:10
  • I don't know if this has been resolved yet, but I downloaded the python dataset and pickle works with that dataset. I believe that the dataset that is being used in the tensorflow example is the binary dataset and can't be unpickled. – marqs Oct 13 '17 at 19:09

6 Answers

2

Extract your *.tar.gz file and use this code:

    from six.moves import cPickle

    # latin1 decodes the Python 2 byte strings stored in the batch files
    with open("path/data_batch_1", 'rb') as f:
        datadict = cPickle.load(f, encoding='latin1')

    X = datadict["data"]
    Y = datadict["labels"]
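The rows in datadict["data"] are flat uint8 vectors of length 3072 (1024 red, then 1024 green, then 1024 blue values per the dataset page); a minimal sketch of reshaping them into 32x32 RGB images (the helper name is my own):

```python
import numpy as np

def to_images(flat_rows):
    """Reshape CIFAR-10 rows of shape (N, 3072) into (N, 32, 32, 3).

    Each row stores all red values first, then green, then blue,
    so reshape to (N, 3, 32, 32) and move the channel axis last.
    """
    arr = np.asarray(flat_rows, dtype=np.uint8)
    return arr.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
```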
Mehralian
1

Just extract your tar.gz file; you will get a folder containing data_batch_1, data_batch_2, ...

After that, just use the code provided to load the data into your project:

def unpickle(file):
    import pickle
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict

dict = unpickle('data_batch_1')
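One detail worth knowing: the batch files were pickled under Python 2, so with encoding='bytes' the dictionary keys come back as bytes, not str. A small sketch (reusing the unpickle helper above; the key names are from the CIFAR-10 page):

```python
import pickle

def unpickle(file):
    with open(file, 'rb') as fo:
        return pickle.load(fo, encoding='bytes')

# Index with bytes keys:
#   batch = unpickle('data_batch_1')
#   images = batch[b'data']    # uint8 array of shape (10000, 3072)
#   labels = batch[b'labels']  # list of 10000 ints in 0..9
```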

0

It seems like you need to unzip the *.gz file and then extract the *.tar file to get a folder of data batches. Afterwards you can apply pickle.load() to these batches.
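Both steps can also be done from Python in one go, since the standard tarfile module handles the gzip layer and the tar layer together. A minimal sketch (the archive path is an assumption; the official download unpacks into a cifar-10-batches-py folder):

```python
import tarfile

def extract_cifar10(archive_path, dest="."):
    """Extract a CIFAR-10 .tar.gz archive (gzip + tar in one step)."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest)

# extract_cifar10("cifar-10-python.tar.gz") would create a
# cifar-10-batches-py/ folder with data_batch_1..5 and test_batch.
```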

Peter Guan
0

I was facing the same problem using Jupyter (VS Code) and Python 3.8/3.7. I tried to edit the source cifar10.py, but without success.
The solution for me was to run these two lines of code in a separate, plain .py file:

from tensorflow.keras.datasets import cifar10
cifar10.load_data()

After that, it worked fine in Jupyter.

Ahmad Asmndr
0

Try this:

    import _pickle as cPickle
    import gzip

    with gzip.open(path_of_your_cpickle_file, 'rb') as f:
        var = cPickle.load(f)
0

Try it this way:

    import pickle
    import gzip

    with gzip.open(path, "rb") as f:
        loaded = pickle.load(f, encoding='bytes')

It works for me.