For a given Paragraph object, how can I determine which page it is located on using the Open XML SDK 2.0 for Microsoft Office?

Short answer: Not possible at the OOXML data level alone. [See here for a detailed explanation.](https://stackoverflow.com/a/40139811/290085) – kjhughes Mar 09 '22 at 03:27
3 Answers
It is not possible to get page numbers for a Word document using the Open XML SDK, as pagination is handled by the client (like MS Word).
However, if the document you are working with was previously opened by a Word client and saved back, the client will have added `LastRenderedPageBreak` elements to identify the page breaks. Refer to my answer here for more info about `LastRenderedPageBreak`s. This enables you to count the number of `LastRenderedPageBreak` elements before your paragraph to work out the page it is on.
If this is not the case, then the noddy option to work around your requirement is to add footers with page numbers (maybe in the same colour as your document, to virtually hide them!). This is only an option if you are automating the Word document generation using the Open XML SDK.

@Flowerking : thanks for the information.
Because I need to loop all the paragraphs anyway to search for a certain string, I can use the following code to find the page number:

    // Assumes: class ParagraphInfo { public Paragraph Paragraph { get; set; } public int PageNumber { get; set; } }
    using (var document = WordprocessingDocument.Open(@"c:\test.docx", false))
    {
        var paragraphInfos = new List<ParagraphInfo>();
        var paragraphs = document.MainDocumentPart.Document.Descendants<Paragraph>();
        int pageIdx = 1;

        foreach (var paragraph in paragraphs)
        {
            // Only the first run of each paragraph is inspected.
            var run = paragraph.GetFirstChild<Run>();
            if (run != null)
            {
                var lastRenderedPageBreak = run.GetFirstChild<LastRenderedPageBreak>();
                // Note: this matches any <w:br>, not only page breaks.
                var pageBreak = run.GetFirstChild<Break>();
                if (lastRenderedPageBreak != null || pageBreak != null)
                {
                    pageIdx++;
                }
            }

            var info = new ParagraphInfo
            {
                Paragraph = paragraph,
                PageNumber = pageIdx
            };
            paragraphInfos.Add(info);
        }

        // After the loop, pageIdx holds the total number of pages counted.
        foreach (var info in paragraphInfos)
        {
            Console.WriteLine("Page {0}/{1} : '{2}'", info.PageNumber, pageIdx, info.Paragraph.InnerText);
        }
    }

Nice. I would have delivered similar code in my answer if you had included some code in your question. One catch: `var pageBreak = run.GetFirstChild<Break>();` in Open XML, not all `Break`s are page breaks! – Flowerking Feb 18 '13 at 20:55
**To all future visitors: the OP thinks this answers his question, but it fails in a lot of cases. It fails when you are using a multi-column layout. Also, `run.GetFirstChild<Break>();` is going to give you all kinds of breaks, which might include breaks other than page breaks. So keep these points in mind when using the above code.** – Flowerking Feb 24 '13 at 22:07
A doc where you have LastRenderedPageBreak will have Break as well, so just using the Break check will be fine. But there are scenarios where there won't be any breaks but the content extends to multiple pages. How do you identify and separate the content by page? – HaBo Oct 13 '16 at 11:23
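
Following up on the catch raised in the comments above, here is a minimal sketch of how the `Break` check could be narrowed to page breaks only. The class and method names (`PageBreakHelper`, `IsPageBreak`, `ContainsPageBoundary`) are hypothetical and not part of the SDK. It is only a sketch: it does nothing about the multi-column case, and a paragraph that spans more than two pages still counts as one boundary.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper class, not part of the Open XML SDK.
public static class PageBreakHelper
{
    // True only for explicit page breaks, i.e. <w:br w:type="page"/>.
    // A <w:br/> without a type attribute is a plain line break.
    public static bool IsPageBreak(Break br)
    {
        return br.Type != null && br.Type.Value == BreakValues.Page;
    }

    // True if the paragraph contains a rendered or an explicit page break
    // in any of its runs, not only in the first one.
    public static bool ContainsPageBoundary(Paragraph paragraph)
    {
        bool rendered = paragraph.Descendants<LastRenderedPageBreak>().Any();
        bool explicitBreak = paragraph.Descendants<Break>().Any(IsPageBreak);
        return rendered || explicitBreak;
    }
}
```

In the loop from the answer, the condition would then become `if (PageBreakHelper.ContainsPageBoundary(paragraph)) pageIdx++;`.
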
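Here's an extension method I made for that:

    using System.Linq;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Wordprocessing;

    public static class OpenXmlElementExtensions
    {
        public static int GetPageNumber(this OpenXmlElement elem, OpenXmlElement root)
        {
            int pageNbr = 1;
            var tmpElem = elem;
            // Walk up towards root; the null check stops the loop if elem is not inside root.
            while (tmpElem != null && tmpElem != root)
            {
                // Count the rendered page breaks in everything that precedes tmpElem at this level.
                var sibling = tmpElem.PreviousSibling();
                while (sibling != null)
                {
                    pageNbr += sibling.Descendants<LastRenderedPageBreak>().Count();
                    sibling = sibling.PreviousSibling();
                }
                tmpElem = tmpElem.Parent;
            }
            return pageNbr;
        }
    }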

This will only count the page breaks that have been inserted into an existing document, e.g. once it has been opened in Word, which inserts the breaks. If you are generating the document yourself using the SDK, the only page breaks in the document will be the ones you have inserted yourself, which you would not need to count. – IanGSY Apr 05 '17 at 12:51
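
Given that caveat, it may be worth checking whether a document carries any `LastRenderedPageBreak` markers at all before trusting a count. Below is a rough sketch; the class and method names (`RenderedBreakCheck`, `HasRenderedPageBreaks`) are made up for illustration. Note that a `false` result does not prove the document was never opened in Word: a document that fits on a single page would have no markers either.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper class, not part of the Open XML SDK.
public static class RenderedBreakCheck
{
    // True if the body contains at least one marker written by a rendering client.
    public static bool HasRenderedPageBreaks(Body body)
    {
        return body.Descendants<LastRenderedPageBreak>().Any();
    }

    // Convenience overload that opens the file read-only.
    public static bool HasRenderedPageBreaks(string path)
    {
        using (var document = WordprocessingDocument.Open(path, false))
        {
            var body = document.MainDocumentPart.Document.Body;
            return body != null && HasRenderedPageBreaks(body);
        }
    }
}
```

If this returns `false`, the counting approaches above will report page 1 for every paragraph, so the result should be treated as "unknown" rather than as a real page number.
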