
I am trying to get a word count from an uploaded Word document (.doc, .docx, .rtf), but the count always includes the Word formatting markup instead of just the visible text.

Has anybody tackled this issue before and found a way to solve it? Thanks :)

LittleBobbyTables - Au Revoir
Scott Bowers
  • Possible duplicate: http://stackoverflow.com/questions/7330660/count-number-of-words-from-doc-txt-docx-files – Schlaus Jul 31 '13 at 10:29

1 Answer


You will need to:

  1. Distinguish the file type

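    // 'image' is the name of the upload field in the form
    $file_name = $_FILES['image']['name'];
    $file_tmp  = $_FILES['image']['tmp_name'];
    // pathinfo() avoids passing explode()'s result to end() by reference
    $file_extn = strtolower(pathinfo($file_name, PATHINFO_EXTENSION));
    
    // docx2text() and rtf2text() stand for the converters linked in step 2;
    // passing the temp path is an assumption about their signature
    if ($file_extn == "doc" || $file_extn == "docx") {
        $text = docx2text($file_tmp);
    } elseif ($file_extn == "rtf") {
        $text = rtf2text($file_tmp);
    }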
    
  2. Convert the document to text

    For doc or docx: https://stackoverflow.com/a/7371315/2512934
    For rtf: http://webcheatsheet.com/php/reading_the_clean_text_from_rtf.php

  3. Count the words in the resulting text with str_word_count(): http://php.net/manual/en/function.str-word-count.php
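
To tie the three steps together, here is a minimal sketch. The helper names `converter_for()` and `count_words()` are made up for illustration, and `docx2text` / `rtf2text` are only placeholders for whichever conversion functions you take from the links in step 2. It shows the extension check and the final count; the conversion itself is left out.

```php
<?php
// Hypothetical helper: map an uploaded file name to the name of the
// converter that should handle it, or null for unsupported types.
function converter_for(string $file_name): ?string
{
    $extn = strtolower(pathinfo($file_name, PATHINFO_EXTENSION));
    switch ($extn) {
        case 'doc':
        case 'docx':
            return 'docx2text'; // placeholder name for the doc/docx converter
        case 'rtf':
            return 'rtf2text';  // placeholder name for the rtf converter
        default:
            return null;
    }
}

// Hypothetical helper: count words once the document is plain text.
function count_words(string $plain_text): int
{
    return str_word_count($plain_text);
}

echo converter_for('Report.DOCX'), "\n";          // docx2text
echo count_words("The quick brown fox"), "\n";    // 4
```

Keep in mind that str_word_count() is locale dependent and by default only treats letters, apostrophes and hyphens as word characters, so the result may differ slightly from the count Word itself reports.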

Community
Ashneil Roy
  • Thanks for your reply :) This works perfectly for .docx but unfortunately doesn't for .rtf. Don't suppose you could help me with that too? :) – Scott Bowers Jul 31 '13 at 13:34
  • I've edited the answer. If I have answered your question, please mark mine as the answer to the question. – Ashneil Roy Jul 31 '13 at 14:12