I have a Hugging Face dataset with an image column:
ds["image"][0]
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=300x300 at 0x1682DD820>
When I save it to disk and load it back later, the image column comes back as bytes:
ds.save_to_disk("./dataset.hf")
ds = ds.load_from_disk("./dataset.hf")
ds["image"][0]
{'bytes': b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00\x01\x00\x00\xff\xdb\x00C\x00\x08\x06\x06\x07\x06\x05\x08\x07\x07\x07\t\t\x08\n\x0c\x14\r\x0c\x0b\x0b\x0c\x19\x12\x13\x0f\x14\x1d\x1a\x1f\x1e\x1d\x1a\x1c\x1c $.\',
'path': None}
How can I load the dataset so that the image column is decoded as PIL.JpegImagePlugin.JpegImageFile again?