I am trying to use the Open XML library to read a docx file whose content looks like this:

White Noise
Rain Sounds
1

Hot N*gga
Bobby Shmurda
2

Ric Flair Drip (& Metro Boomin)
21 Savage , Offset , Metro Boomin
3

Plastic
Jaden
4

My code is:

public static void OpenWordprocessingDocumentReadonly(string filepath)
{
    // Open a WordprocessingDocument based on a filepath.
    using (WordprocessingDocument wordDocument = WordprocessingDocument.Open(filepath, false))
    {
        // Assign a reference to the existing document body.
        Body body = wordDocument.MainDocumentPart.Document.Body;

        Console.Write(body.InnerText);
        Console.ReadKey();
    }
}

But the string that is read is this:

White NoiseRain Sounds1Hot N*ggaBobby Shmurda2Ric Flair Drip (& Metro Boomin)21 Savage , Offset , Metro Boomin3PlasticJaden

How can I read it line by line?

Ege
  • That is text, not DOCX content. – Mark Schultheiss May 12 '19 at 21:11
  • What does your docx file *actually* look like? It's not just plain text like you've written. XML is an element-based doc. You will likely have to read element by element. – Joe May 12 '19 at 21:11
  • LMGTFY https://learn.microsoft.com/en-us/dotnet/api/documentformat.openxml.packaging.wordprocessingdocument?view=openxml-2.8.1 – Mark Schultheiss May 12 '19 at 21:18
  • This text is in a docx file, but I found the problem. I made this docx file by copying and pasting text from the Google Chrome browser, and the browser's line-break character (ENTER or \n) is not the same as my keyboard's line-break character (or the Open XML library's). I get the text element by element when I put an ENTER character between the strings in the file, so I think I must put an ENTER between all the strings. Do you have another opinion? @MarkSchultheiss – Ege May 12 '19 at 21:29
  • It is not really my practice to conduct consulting via comments on questions. – Mark Schultheiss May 12 '19 at 21:31
  • If you have manual line breaks instead of paragraphs, this answer seems to handle them: https://stackoverflow.com/a/24535929/1383168 – Slai May 12 '19 at 21:37
  • Thanks, I will use it @Slai – Ege May 12 '19 at 22:10

1 Answer

To loop over the paragraphs:

using (WordprocessingDocument wordDocument = WordprocessingDocument.Open(filepath, false))
{
    var paragraphs = wordDocument.MainDocumentPart.RootElement.Descendants<Paragraph>();
    foreach (var paragraph in paragraphs)
    {
        Console.WriteLine(paragraph.InnerText);
    }
    Console.ReadKey();
}
Slai
  • With a docx file that was created normally, your code will run correctly. But there is a problem: I made this docx file by copying and pasting text from the Google Chrome browser, and the browser's line-break character (ENTER or \n) is not the same as my keyboard's line-break character (or the Open XML library's). I get the text element by element when I put an ENTER character between the strings in the file, so I put an ENTER between all the strings. – Ege May 12 '19 at 22:04
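
For a file like the one described in the comments, where pasted text is separated by manual line breaks inside a single paragraph rather than by paragraph marks, a per-paragraph InnerText still runs the lines together. The following is only a minimal sketch, not the code from the linked answer: it assumes the DocumentFormat.OpenXml package is referenced, and the class name DocxLineReader and method name PrintLines are made up for illustration. It walks each paragraph's descendants and starts a new line whenever it meets a Break or CarriageReturn element. Note that Break also covers page and column breaks, which this sketch treats as ordinary line breaks.

```csharp
using System;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class DocxLineReader
{
    // Hypothetical helper: prints the document text, one output line per
    // paragraph and per manual line break inside a paragraph.
    public static void PrintLines(string filepath)
    {
        // Open the document read-only, as in the question.
        using (WordprocessingDocument wordDocument = WordprocessingDocument.Open(filepath, false))
        {
            Body body = wordDocument.MainDocumentPart.Document.Body;

            foreach (Paragraph paragraph in body.Descendants<Paragraph>())
            {
                var line = new System.Text.StringBuilder();

                // Visit every element in the paragraph in document order.
                foreach (OpenXmlElement element in paragraph.Descendants())
                {
                    if (element is Text text)
                    {
                        // Ordinary run text.
                        line.Append(text.Text);
                    }
                    else if (element is Break || element is CarriageReturn)
                    {
                        // Manual break: continue on a new line.
                        line.AppendLine();
                    }
                }

                Console.WriteLine(line.ToString());
            }
        }
    }
}
```

Calling DocxLineReader.PrintLines(filepath) in place of the Console.Write(body.InnerText) call should then print each title, artist and number on its own line, provided the pasted text really is stored as runs separated by break elements. If it is stored some other way, inspecting the document's XML (for example word/document.xml inside the .docx archive) would show which elements to handle.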