I am working on a project that manipulates .docx files with Apache POI. I have used the API to remove text from a run inside a text box, but I cannot figure out how to remove a whole paragraph inside a text box. I assume I need to use the class CTP to obtain the paragraph object to remove. Any examples or suggestions would be greatly appreciated.

    This is not answerable in such general terms. It depends on the kind of text box and on the condition that decides which paragraph should be removed. Please ask a more concrete question and provide the code you used to remove text from a run inside a text box. – Axel Richter Dec 23 '19 at 06:35

1 Answer

In "Replace text in text box of docx by using Apache POI" I showed how to replace text in the contents of Word text boxes. The approach is to get a list of XML text run elements for the XPath .//*/w:txbxContent/w:p/w:r, using an XmlCursor that selects that path in /word/document.xml.

The same can of course be done with the path .//*/w:txbxContent/w:p, which selects the paragraphs in text box contents. Given that low-level paragraph XML, we can convert each element into an XWPFParagraph to get its plain text. Then, if the plain text meets some criterion (in the code below: it contains the word "remove"), we can simply remove the paragraph's XML.
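
To make the select-then-remove idea concrete without needing a .docx file, here is a minimal sketch that applies it to a small hand-written XML string. Note the assumptions: it uses the JDK's DOM and XPath classes instead of XMLBeans' XmlCursor, and the class name TextBoxParagraphSketch, the SAMPLE XML and the helper methods are made up for illustration. Real text boxes are nested more deeply than this fragment.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TextBoxParagraphSketch {

    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical, heavily simplified stand-in for /word/document.xml.
    // The first paragraph contains "remove" but is NOT inside a text box.
    static final String SAMPLE =
        "<w:document xmlns:w='" + W_NS + "'><w:body>"
        + "<w:p><w:r><w:t>body text, please remove nothing here</w:t></w:r></w:p>"
        + "<w:p><w:r><w:pict><w:txbxContent>"
        + "<w:p><w:r><w:t>keep this paragraph</w:t></w:r></w:p>"
        + "<w:p><w:r><w:t>remove this paragraph</w:t></w:r></w:p>"
        + "</w:txbxContent></w:pict></w:r></w:p>"
        + "</w:body></w:document>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed so the w: prefix can be resolved
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Collects all w:p elements that are direct children of a w:txbxContent.
    static List<Node> textBoxParagraphs(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "w".equals(prefix) ? W_NS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) { return null; }
            @Override public Iterator<String> getPrefixes(String uri) {
                return Collections.emptyIterator();
            }
        });
        NodeList hits = (NodeList) xpath.evaluate(
            "//w:txbxContent/w:p", doc, XPathConstants.NODESET);
        // Copy into a list first, so removal never happens while walking the result.
        List<Node> result = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            result.add(hits.item(i));
        }
        return result;
    }

    // Removes every text box paragraph whose text contains the marker.
    static int removeMatching(Document doc, String marker) throws Exception {
        int removed = 0;
        for (Node paragraph : textBoxParagraphs(doc)) {
            if (paragraph.getTextContent().contains(marker)) {
                paragraph.getParentNode().removeChild(paragraph);
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println("before: " + textBoxParagraphs(doc).size());
        System.out.println("removed: " + removeMatching(doc, "remove"));
        System.out.println("after: " + textBoxParagraphs(doc).size());
    }
}
```

For the sample string this prints before: 2, removed: 1, after: 1. The body paragraph that also mentions "remove" is left alone because it is not a child of w:txbxContent.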

Source: [screenshot of the input document; image not available]

Code:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

import org.apache.xmlbeans.XmlCursor;
import org.apache.xmlbeans.XmlObject;

import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;

public class WordRemoveParagraphInTextBox {

 public static void main(String[] args) throws Exception {

  // try-with-resources closes the document even if an exception is thrown
  try (XWPFDocument document = new XWPFDocument(new FileInputStream("WordRemoveParagraphInTextBox.docx"))) {

   // text boxes are anchored in body paragraphs, so search below each of them
   for (XWPFParagraph paragraph : document.getParagraphs()) {
    XmlCursor cursor = paragraph.getCTP().newCursor();
    cursor.selectPath("declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' .//*/w:txbxContent/w:p");

    // collect all selected text box paragraphs first, then modify the XML
    List<XmlObject> ctpsintxtbx = new ArrayList<>();
    while (cursor.hasNextSelection()) {
     cursor.toNextSelection();
     XmlObject obj = cursor.getObject();
     ctpsintxtbx.add(obj);
    }

    for (XmlObject obj : ctpsintxtbx) {
     // parse a copy of the paragraph XML, only to read its plain text
     CTP ctp = CTP.Factory.parse(obj.xmlText());
     //alternative: CTP ctp = CTP.Factory.parse(obj.newInputStream());
     XWPFParagraph bufferparagraph = new XWPFParagraph(ctp, document);
     String text = bufferparagraph.getText();
     if (text != null && text.contains("remove")) {
      // remove the original paragraph XML from the document
      obj.newCursor().removeXml();
     }
    }
   }

   try (FileOutputStream out = new FileOutputStream("WordRemoveParagraphInTextBoxNew.docx")) {
    document.write(out);
   }
  }
 }
}

Result: [screenshot of the output document; image not available]

Axel Richter