I have an Excel file with three columns: id, url1, and url2. Both url1 and url2 contain the URL of an image.

How can I download the images and place them in a Word document and a PDF as a table with three columns: id, the image from url1, and the image from url2?

import pandas as pd
import urllib
from docx import Document
from docx.shared import Inches

df = pd.read_excel('data.xlsx')
document = Document()
p = document.add_paragraph()
r = p.add_run()
r.add_picture('a.jpg')  # OK
url = r'http://www.example.com/a.jpg'
r.add_picture(urllib.request.urlopen(url))  # fails, how to do it?

document.save('demo.docx') 

Thank you very much.

Chan

2 Answers

  1. You can look into: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html to read the excel file.
  2. You can use a simple for loop to loop through the data.
  3. You can use http://docs.python-requests.org/en/master/ to send a get request to the url and receive the image.
  4. You can use https://pillow.readthedocs.io/en/5.0.0/ to manipulate the image.
  5. You can use http://python-docx.readthedocs.io/en/latest/user/documents.html to save to the word file.

I can't, however, do all the work for you.

Edit:
I haven't really used urllib, but I can download an image using requests with:

import requests
x = requests.get("https://www.pythonsheets.com/_static/guido.png")

I can then open the downloaded bytes with Pillow:

from io import BytesIO
from PIL import Image
# x.content is bytes, so wrap it in BytesIO (the Python 2 StringIO module no longer exists in Python 3)
Image.open(BytesIO(x.content)).show()

So, that shows I can download and open the image file using requests. You can try saving x.content in the Word document.
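One way to try that last suggestion is sketched below. It assumes python-docx's `add_picture` accepts a file-like object as well as a path, so the downloaded bytes are wrapped in an in-memory stream. The helper name `add_picture_from_bytes` is hypothetical and only for illustration.

```python
import io


def add_picture_from_bytes(run, content, width=None):
    """Insert raw image bytes (e.g. a requests response's .content) into a python-docx run.

    `run` is expected to be a python-docx Run; this helper name is made up for the sketch.
    """
    # Wrap the bytes in an in-memory binary stream so no temporary file is needed.
    stream = io.BytesIO(content)
    return run.add_picture(stream, width=width)
```

With `x` from the snippet above and a run created via `document.add_paragraph().add_run()`, the call would be `add_picture_from_bytes(run, x.content)`.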

Anuj Gautam
  • Thank you, I just tried to do what you suggested but the problem is inserting an image in the url. How to do that? – Chan Jan 19 '18 at 09:56
  • This can help you as well. (https://stackoverflow.com/questions/7391945/how-do-i-read-image-data-from-a-url-in-python) – Anuj Gautam Jan 19 '18 at 14:50

Try this:

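import io
import urllib.request  # a plain "import urllib" does not reliably expose urllib.request
from docx import Document
from docx.shared import Inches

document = Document()
p = document.add_paragraph()
r = p.add_run()
url = r'http://www.example.com/a.jpg'
# Read the whole response into memory and wrap it in a seekable stream
io_url = io.BytesIO(urllib.request.urlopen(url).read())
r.add_picture(io_url)
document.save('demo.docx')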
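
To get the three-column layout from the question, the same idea could be applied inside table cells. This is only a rough sketch: the function names `fetch_image` and `add_image_table` are hypothetical, the header labels are assumed from the question's column names, and the document is passed in so the function itself only relies on the python-docx calls `add_table`, `add_row`, `add_run`, and `add_picture`.

```python
import io
import urllib.request


def fetch_image(url, opener=urllib.request.urlopen):
    """Download url and return its bytes as a seekable in-memory stream."""
    with opener(url) as response:
        return io.BytesIO(response.read())


def add_image_table(document, rows, width=None, fetch=fetch_image):
    """Append a table with columns id, image from url1, image from url2.

    `document` is a python-docx Document; `rows` yields (id, url1, url2) tuples.
    """
    table = document.add_table(rows=1, cols=3)
    # Header row, labelled after the spreadsheet columns (an assumption).
    for cell, title in zip(table.rows[0].cells, ("id", "url1", "url2")):
        cell.text = title
    for row_id, url1, url2 in rows:
        cells = table.add_row().cells
        cells[0].text = str(row_id)
        for cell, url in zip(cells[1:], (url1, url2)):
            # Pictures are attached to a run inside the cell's first paragraph.
            run = cell.paragraphs[0].add_run()
            run.add_picture(fetch(url), width=width)
    return table
```

With the question's DataFrame, the rows could come from `df[['id', 'url1', 'url2']].itertuples(index=False)`, and `width=Inches(1.5)` (an arbitrary size) would keep the pictures small enough to fit the cells. This covers only the Word part; the PDF output is not addressed here.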
Frankie
  • Thank you for your suggestion, Frankie. – Chan Jan 22 '18 at 01:49
  • throws : ---> 30 from exceptions import PendingDeprecationWarning 31 from warnings import warn 32 ModuleNotFoundError: No module named 'exceptions' in from docx import Document – khanna Apr 20 '20 at 06:12