What you want is an IFilter for .doc(x) files. IFilters were designed for Windows to use in its indexing service, but other applications frequently press them into service to read text out of binary files that contain text. IFilters are often released for free. I believe this contains the correct IFilters for doc/docx files (and other Office files).
That said, I've never used the IFilter interface from .NET, only from unmanaged C++, but it should be possible. A quick search turned up this as a likely place to start: it has some recommendations of things to avoid, and some code. I make no guarantee that the code works, so you might have to find something else. The IFilter technology itself does work, though; I've used it in projects before. The one exception is the PDF IFilter that ships with Reader, which only barely worked last I checked. The Office IFilters work fine.