I am working on my first web scraping project and am trying to write my dataset to a .csv file. Writing strings to the .csv seems to work fine:

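import csv

fieldnames = ['fname', 'lname', 'image']

with open('dataset.csv', 'w', encoding='UTF8', newline='') as f:
    # Create the writer that maps dict keys to the CSV columns
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    newrow = {'fname': 'John', 'lname': 'Doe'}
    writer.writerow(newrow)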

But I also have a list of image URLs whose images I would like to download and save in the .csv file as .pngs. When I try to do this, however, the image is written to the .csv as a string in this format: '<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=120x159 at 0x7FC0A0...>'

Here is the code I have written to do this:

import io
import requests
import PIL.Image

response = requests.get('someurl')
image_bytes = io.BytesIO(response.content)
img = PIL.Image.open(image_bytes)
newrow = {'image': img}  # csv calls str(img), which yields the repr shown above
writer.writerow(newrow)

I'm not sure how I can get the actual .pngs to save to the csv file.

martineau
Bob Samuels
  • You don't save an image into a CSV. Image files are binary. The best you can do is save the image to a unique file name, and store the file name in the CSV. – Tim Roberts Aug 16 '21 at 22:08
  • This sounds like an XY problem ( https://en.wikipedia.org/wiki/XY_problem ). Can you please describe the original thing you are trying to achieve? – bessbd Aug 16 '21 at 22:13
  • CSV files are text files, so you can't store an image in one unless you first convert it to some kind of textual format first, like base64. – martineau Aug 16 '21 at 22:13
  • You could also convert the array of pixels to text and save it in the csv, but it will be very inefficient. – mama Aug 16 '21 at 22:44
  • For an example of the (inefficient) converting of pixels to text see [my answer](https://stackoverflow.com/a/40729543/355230) to the question [How to convert a grayscale image into a list of pixel values?](https://stackoverflow.com/questions/40727793/how-to-convert-a-grayscale-image-into-a-list-of-pixel-values) It's for a grayscale image, but the same basic idea would work for a color one — the difference being that each pixel would then consist of multiple values, such as one each for RGB. – martineau Aug 17 '21 at 07:40

1 Answer

A comma-separated values (CSV) file is a delimited text file, so binary data cannot be stored in it directly. There are a few methods for encoding binary data as text.

I think Base64, which is available in the standard library, is a good choice for CSV.
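
A minimal sketch of the Base64 idea, using only the standard library. The helper names `encode_image` and `decode_image` are hypothetical, and the `fake_png` bytes are a stand-in for `response.content` from the question, so this shows the round trip rather than a real download:

```python
import base64
import csv
import io


def encode_image(image_bytes: bytes) -> str:
    """Return raw image bytes as ASCII Base64 text that fits in a CSV cell."""
    return base64.b64encode(image_bytes).decode("ascii")


def decode_image(cell: str) -> bytes:
    """Turn a Base64 CSV cell back into the original bytes."""
    return base64.b64decode(cell)


# Placeholder bytes standing in for response.content (assumption, not real image data)
fake_png = b"\x89PNG\r\n\x1a\n" + bytes(range(16))

# An in-memory buffer is used here instead of dataset.csv to keep the sketch self-contained
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["fname", "lname", "image"])
writer.writeheader()
writer.writerow({"fname": "John", "lname": "Doe", "image": encode_image(fake_png)})

# Read the row back and recover the bytes
buffer.seek(0)
row = next(csv.DictReader(buffer))
restored = decode_image(row["image"])
```

Base64 makes the data roughly a third larger, and reading very large cells back with the `csv` module may require raising `csv.field_size_limit`.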

An alternative is described in Tim Roberts's comment: save each image to its own file and store the file name in the CSV.
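
A rough sketch of that file-per-image approach, again with the standard library only. The helper `save_image`, the `john_doe` stem, and the temporary output directory are all assumptions made for illustration, and `fake_jpeg` stands in for the downloaded `response.content`. The bytes are written unchanged, so the extension should match the real format; converting to PNG would be a separate step (for example with Pillow) that is not shown here:

```python
import csv
import tempfile
from pathlib import Path


def save_image(image_bytes: bytes, directory: Path, stem: str, suffix: str) -> str:
    """Write the bytes to <directory>/<stem><suffix> and return the file name."""
    path = directory / f"{stem}{suffix}"
    path.write_bytes(image_bytes)
    return path.name


# Temporary directory used only so the sketch does not touch the working directory
out_dir = Path(tempfile.mkdtemp())

# Placeholder bytes standing in for response.content (assumption, not a real image)
fake_jpeg = b"\xff\xd8\xff\xe0" + bytes(8)

csv_path = out_dir / "dataset.csv"
with open(csv_path, "w", encoding="utf-8", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["fname", "lname", "image"])
    writer.writeheader()
    # Store only the file name in the CSV; the image itself lives next to it
    name = save_image(fake_jpeg, out_dir, "john_doe", ".jpg")
    writer.writerow({"fname": "John", "lname": "Doe", "image": name})
```

Each stem needs to be unique per row (an index or a hash of the URL would work), otherwise later images overwrite earlier ones.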

Sergey Zaykov