I am using PHP's fopen on a .doc Word document.

I am getting a lot of gibberish at the top and bottom of the page when the output is opened in the browser. I'm assuming this is metadata of some sort.

This appears at the top, before the document text:

"CB (),-./0123456789@ADRoot Entry

FMicrosoft Word-Dokument MSWordDocWord.Document.89q hhDefault1753A3BOJQJCJsH sH KHPJnHJaJ_H9tH9FFHeading xOJQJCJPJJaJ.B. Text body x / ListJ@@Caption xxCJ6JaJ2IndexJ1tj6/i j78911PGTimes New Roman5Symbol3Arial5SimSun3ArialGMicrosoft YaHei3ArialBhgn033n033 0 0Oh08 @ LXd p58@@@@oW6.,D.,M 0 jCaolan80 t1674/b"

I don't want this. It is obviously not data I want displayed; I only want the text within the document. Is there a way to remove it? I know a .doc file is a sort of blob, but is there some structural aspect I'm not aware of that would let me strip this junk before outputting? Thanks so much!

macropod
  • Word docs are binary files; you can't just read plain text content out of them. – Alex Howansky Feb 21 '22 at 22:46
  • Look at https://github.com/PHPOffice/PHPWord. The same thing would happen if you opened the doc file in a text editor. – user3783243 Feb 21 '22 at 22:47
  • Note that PHPOffice/PHPWord won't work with older .doc files. You'd have to use Word or LibreOffice or something to convert them to docx/odf/rtf first. – Alex Howansky Feb 21 '22 at 22:55
  • Does this answer your question? [Reading DOC file in php](https://stackoverflow.com/questions/7358637/reading-doc-file-in-php) – user3783243 Feb 21 '22 at 23:07
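
To make the comments above concrete, here is a minimal sketch, not a definitive implementation. It checks whether a file starts with the 8-byte signature of the OLE2 compound-file container that legacy Office formats such as .doc use (which is why "Root Entry" shows up in the raw dump), and it builds a LibreOffice command line for converting such a file to .docx first. The function names `isOle2CompoundFile` and `buildDocxConversionCommand` are hypothetical, and the command assumes a `soffice` binary is installed and on the PATH.

```php
<?php
// Signature found at the start of every OLE2 compound file
// (the binary container behind legacy .doc, .xls and .ppt files).
const OLE2_MAGIC = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";

/**
 * Return true if $path is readable and begins with the OLE2 signature.
 * Hypothetical helper: it detects the container, not Word specifically.
 */
function isOle2CompoundFile(string $path): bool
{
    if (!is_file($path) || !is_readable($path)) {
        return false;
    }
    $handle = fopen($path, 'rb'); // binary-safe read
    if ($handle === false) {
        return false;
    }
    $header = fread($handle, 8);
    fclose($handle);

    return $header === OLE2_MAGIC;
}

/**
 * Build (but do not run) a LibreOffice command that converts $path to .docx.
 * Assumption: `soffice` is installed and available on the PATH.
 */
function buildDocxConversionCommand(string $path, string $outDir): string
{
    return sprintf(
        'soffice --headless --convert-to docx --outdir %s %s',
        escapeshellarg($outDir),
        escapeshellarg($path)
    );
}
```

If `isOle2CompoundFile()` returns true, echoing the raw bytes will always include container structures and style tables like the ones quoted in the question, so the file needs to be parsed or converted rather than trimmed. The generated command could be passed to `exec()`, and the resulting .docx handed to a library such as PHPWord.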

0 Answers