3

I want to insert about 250 images, each with its filename, into a .docx file.

My test.py file:

from pathlib import Path
import docx
from docx.shared import Cm

filepath = r"C:\Users\Admin\Desktop\img"
document = docx.Document()

for file in Path(filepath).iterdir():
#    paragraph = document.add_paragraph(Path(file).resolve().stem)
    document.add_picture(Path(file).absolute(), width=Cm(15.0))

document.save('test.docx')

After debugging I got this error:

Exception has occurred: AttributeError
'WindowsPath' object has no attribute 'seek'
  File "C:\Users\Admin\Desktop\test.py", line 10, in <module>
    document.add_picture(Path(file).absolute(), width=Cm(15.0))

How can I avoid this error?

05x
  • Possible duplicate of [Windows path in Python](https://stackoverflow.com/questions/2953834/windows-path-in-python) – Andreas Mar 08 '19 at 02:24

5 Answers

3

Have you tried using io.FileIO?

from io import FileIO

from pathlib import Path
import docx
from docx.shared import Cm

filepath = r"C:\Users\Admin\Desktop\img"
document = docx.Document()

for file in Path(filepath).iterdir():
#    paragraph = document.add_paragraph(Path(file).resolve().stem)
    document.add_picture(FileIO(Path(file).absolute(), "rb"), width=Cm(15.0))

document.save('test.docx')

I encountered the same error using PyPDF2 when passing a file path to PdfFileReader. When I wrapped the PDF file in FileIO, like so: FileIO(pdf_path, "rb"), the error went away and I was able to process the file successfully.

wtee
2

You need to pass a string to add_picture, so convert the Path object to str.

for file in Path(filepath).iterdir():
    # paragraph = document.add_paragraph(Path(file).resolve().stem)
    # str() turns the Path into a plain string that add_picture can open
    document.add_picture(str(file.absolute()), width=Cm(15.0))
Cliu
2

The problem is within python-docx (still) as of the current version 0.8.11 (from 31/03/2022), where the assumption is that anything that is not a string must be a file-like object. This is an unfortunate limitation of the docx design, surely a holdover from pre-pathlib days: Path objects have an open method that yields a file object directly, and they would work as well as str if they weren't being filtered out by an is_string test.
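The dispatch described above can be imitated with the standard library alone. This is only a sketch: read_header is a hypothetical stand-in written for this illustration, not python-docx's actual code, and the file name picture.png is made up.

```python
import io
import os
import tempfile
from pathlib import Path


def read_header(path_or_stream):
    """Hypothetical loader that, like the check described above, only special-cases str."""
    if isinstance(path_or_stream, str):
        with open(path_or_stream, "rb") as stream:
            return stream.read(4)
    # Anything that is not a str is assumed to be an open binary stream.
    path_or_stream.seek(0)
    return path_or_stream.read(4)


# A stream works, because it has seek() and read().
print(read_header(io.BytesIO(b"\x89PNG rest of file")))

# A Path is not a str, so it falls into the stream branch and fails on seek().
try:
    read_header(Path("picture.png"))
except AttributeError as exc:
    print("AttributeError:", exc)

# Converting the Path to a string first takes the str branch instead.
with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "picture.png"
    image.write_bytes(b"\x89PNG rest of file")
    print(read_header(os.fspath(image)))
```

os.fspath(image) and str(image) give the same string for a Path; either one is enough to get past a string-only check like this.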

So in order to work around it, you need to pass in a string. Fortunately, pathlib has good coverage for this. Change your loop to pass in the file name as a string. Also, you're already using Path, so you can skip the raw string for your filepath:

filepath = Path("C:/Users/Admin/Desktop/img")
# filepath = Path(r"C:\Users\Admin\Desktop\img") # alternatively, gives same results
document = docx.Document()

for file in filepath.iterdir():
#    paragraph = document.add_paragraph(Path(file).resolve().stem)
    document.add_picture(file.as_posix(), width=Cm(15.0))

Additionally, if you want to scrub relative pathing, use resolve() rather than absolute().

In this case, however, you know the files exist. You set up an absolute filepath, so the full path is guaranteed and there is no need for resolve() (or absolute()).

If instead your filepath was relative, you could resolve it once, to avoid the overhead of resolving each file that comes out of iterdir():

filepath = Path("Desktop/img")
# filepath = Path(r"Desktop\img") # alternatively, gives same results
document = docx.Document()
full_filepath = filepath.resolve() # to Path("C:/Users/Admin/Desktop/img")
# filepath = filepath.resolve() # is also safe

for file in full_filepath.iterdir():
#    paragraph = document.add_paragraph(Path(file).resolve().stem)
    document.add_picture(file.as_posix(), width=Cm(15.0))

But when it's not certain, resolve() will remove any '..' that may enter into your paths. Behavior on Windows can be unpredictable when the location doesn't exist, but as long as the file (including its directories) already exists, resolve() will give you a full, absolute path. If the file doesn't exist, then on Windows it will, bizarrely, only add a full path if there are relative steps like '..' in the path. absolute(), on the other hand, never scrubs '..' but will always add a root path on Windows. So if you need to be sure, you could call absolute() first, then resolve(), and lastly as_posix() for a string: file.absolute().resolve().as_posix()
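The difference in how the two methods treat '..' can be checked with a short sketch. The relative path here is hypothetical and does not need to exist; the printed values depend on your current working directory, so none are shown.

```python
from pathlib import Path

# Hypothetical relative path containing a '..' step.
relative = Path("img") / ".." / "img" / "photo.png"

absolute_only = relative.absolute()  # prepends the working directory, keeps '..'
resolved = relative.resolve()        # makes the path absolute and collapses '..'

print(absolute_only)
print(resolved)
print(".." in absolute_only.parts)
print(".." in resolved.parts)
```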

But be warned: at the time of writing, absolute() is not documented, so its behavior could change, or it could be removed, without warning.

As others have written, you can also use str(file). On Windows the two strings are not identical, since str(file) uses backslashes while file.as_posix() uses forward slashes, but both name the same file and both work with add_picture.
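How the two string forms compare on Windows is easy to verify on any operating system, because PureWindowsPath applies Windows path rules without touching the file system. A small sketch using the folder from the question:

```python
from pathlib import PureWindowsPath

folder = PureWindowsPath("C:/Users/Admin/Desktop/img")

as_str = str(folder)          # native Windows separators
as_posix = folder.as_posix()  # forward slashes

print(as_str)              # C:\Users\Admin\Desktop\img
print(as_posix)            # C:/Users/Admin/Desktop/img
print(as_str == as_posix)  # False

# Both strings still describe the same Windows path.
print(PureWindowsPath(as_str) == PureWindowsPath(as_posix))  # True
```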

Zim
1

In my case, replacing '\' with '/' in the path did the trick, e.g. "C:/Users/Admin/Desktop/img". (I believe this is probably what wrapping it in FileIO does, but in my case wrapping it in FileIO didn't work.)

You can also achieve that using

os.path.join(mydir, myfile)

as explained here https://stackoverflow.com/a/2953843/11126742

Book Of Zeus
gb7
1

Simply cast the path object to string:

for file in Path(filepath).iterdir():
    path_str = str(Path(file).absolute())
    document.add_picture(path_str, width=Cm(15.0))

The problem with passing a WindowsPath object seems to be that document.add_picture does not know how to open a file from it. Because the argument is not a string, it is treated as a file object, and seek is a file-object method that WindowsPath lacks.
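Putting this together with the original goal (each image with its filename), a minimal sketch might look like the following. The helper names list_images and build_document and the set of suffixes are assumptions made for this example. It relies only on the python-docx calls already used in the question (Document, add_paragraph, add_picture, save).

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the files you actually have.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".bmp"}


def list_images(folder):
    """Return the image files in `folder`, sorted by name (hypothetical helper)."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def build_document(folder, output="test.docx"):
    """Write each image, preceded by its filename, into a .docx file."""
    # Imported here so list_images() can be used without python-docx installed.
    import docx
    from docx.shared import Cm

    document = docx.Document()
    for image in list_images(folder):
        document.add_paragraph(image.stem)  # filename without extension
        document.add_picture(str(image), width=Cm(15.0))  # pass a str, not a Path
    document.save(output)


# Example call, using the folder from the question:
# build_document(r"C:\Users\Admin\Desktop\img")
```

Filtering by suffix keeps stray non-image files in the folder from being handed to add_picture, and sorting makes the order of the pictures predictable.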

Jman