3

Let's say we have .doc and .docx files. I want to use LiveDocx in PHP to load the files, read their content, and extract the text from them, then save it to an HTML string.

Can this be done?

I've searched the documentation, and it seems that LiveDocx only loads .doc and .docx files as templates!

hakre
Mohammed J. Razem
  • What about other platforms? You can probably find ones that have more features than LiveDocx. – Kevin Apr 12 '11 at 12:10

4 Answers

1

You can skip external libraries and simply grab the text from the XML inside the files (this works for .docx, which is a ZIP archive of XML parts, but not for the binary .doc format): http://www.webcheatsheet.com/PHP/reading_the_clean_text_from_docx_odt.php
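
As a rough illustration of that idea, here is a minimal sketch. It assumes PHP's zip extension (`ZipArchive`) is installed and that the main text lives in `word/document.xml`, which is the usual location in a .docx file. The function name `docxToPlainText` is made up for this example and is not taken from the linked article.

```php
<?php
// Hypothetical helper: pull the plain text out of a .docx file.
// Assumes the zip extension is available; returns null on failure.
function docxToPlainText(string $path): ?string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return null; // not a readable ZIP archive
    }

    // The main document body is normally stored in this entry.
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        return null; // entry missing, so probably not a .docx
    }

    // Turn paragraph ends into newlines so paragraphs stay separated
    // once the tags are removed.
    $xml = str_replace('</w:p>', "\n", $xml);

    // Remove the XML tags, then decode entities such as &amp;.
    $text = strip_tags($xml);
    $text = html_entity_decode($text, ENT_QUOTES | ENT_XML1, 'UTF-8');

    return trim($text);
}
```

To get an HTML string out of the result, something like `nl2br(htmlspecialchars($text))` should be enough. Headers, footers, and footnotes are stored in other entries of the archive and are ignored by this sketch.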

Ashy
0

I think you can find what you need in this example.

I might be wrong, but I think they call them "template" files because they act like a template but are still normal .doc/.docx documents. I suggest you simply try to run that example.

Udo G
0

I think you can use TextControl, which improves on phpLiveDocx (TextControl link).

With it you can also import PDF, .doc, and .docx files.

Kevin
0

When you do document conversion with LiveDocx, you need to do a mail merge and then retrieve the document. Even though you aren't inserting any new content, the mail merge still has to replace a dummy placeholder with dummy content.

So, the process I'd suggest is:

1) Set your source document as local template
2) Merge a dummy field with dummy content
3) Retrieve your document as HTML
4) Use a server-side script to remove the HTML and leave only the content (something like: remove everything between the HEAD tags, then strip_tags on the rest)
5) You should be left with your content as a simple string - I'm not sure it'll be too meaningful, but it might be useful for building something like search indices.
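
Step 4 could look roughly like the sketch below. It assumes you already hold the HTML from step 3 in a string; the function name `htmlToPlainText` is made up for this example. Besides the HEAD section it also drops `<script>` and `<style>` blocks, which goes a little beyond what the step describes.

```php
<?php
// Hypothetical helper for step 4: reduce an HTML document to plain text.
function htmlToPlainText(string $html): string
{
    // Remove <head>, <script> and <style> elements together with their content.
    $body = preg_replace('#<(head|script|style)\b[^>]*>.*?</\1>#is', '', $html);

    // Put a space in front of every tag so that text from adjacent
    // elements does not run together once the tags are gone.
    $body = str_replace('<', ' <', $body);

    // Strip the remaining tags and decode entities such as &amp;.
    $text = strip_tags($body);
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/u', ' ', $text));
}
```

Because a space is inserted before every tag, a word split by inline markup (for example `wo<b>r</b>d`) comes out in pieces. That is usually acceptable for a search index but not for display.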

Ross Little