I've been following the fastai course on machine learning. Got up to lesson four and thought I'd use what I've learned to create a model that predicts hand-written letters. The code they used to load their training dataset is as follows:
pets1 = DataBlock(blocks = (ImageBlock, CategoryBlock),
get_items=get_image_files,
splitter=RandomSplitter(seed=42),
get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'))
pets1.summary(path/"images")
This works when you have image files but the dataset files I have are
emnist-letters-train-images-idx3-ubyte
emnist-letters-train-labels-idx1-ubyte
emnist-letters-test-images-idx3-ubyte
emnist-letters-test-labels-idx1-ubyte
I could extract all the images from those files but is there a way I can load the ubyte
files into my program? The files have the same format as the MNIST digits dataset.