I need to delete line breaks in a .docx document. This is my code, but it only works with text, not with line breaks:

XWPFParagraph toDelete = doc.getParagraphs().stream()
            .filter(p-> StringUtils.equalsAnyIgnoreCase("\n", p.getParagraphText()))
            .findFirst().orElse(null);
    if(toDelete!=null){

        doc.removeBodyElement(doc.getPosOfParagraph(toDelete));
    }

1 Answer

The text in an XWPFParagraph is composed of one or more XWPFRun objects. XWPFRun has getText() and setText() methods. Apache StringUtils.remove(str, remove) removes all occurrences of "remove" from the "str" string.

I'm still old-school, so here is an imperative (and untested) solution:

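import org.apache.commons.lang3.StringUtils;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

for (final XWPFParagraph currentParagraph : doc.getParagraphs())
{
    for (final XWPFRun currentRun : currentParagraph.getRuns())
    {
        // The argument is the index of the text element inside the run, not a character offset.
        final String text = currentRun.getText(0);

        if (text != null)
        {
            // Write the stripped text back into the same text element.
            currentRun.setText(StringUtils.remove(text, "\n"), 0);
        }
    }
}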

Note: Apache StringUtils.remove is null safe.
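
To make the null-safety point concrete, here is a minimal plain-Java sketch of the behaviour I'm relying on. NullSafeRemove and its remove method are hypothetical names made up for illustration; this is not the Commons Lang source, just an approximation of what StringUtils.remove does with null and empty arguments:

```java
// Plain-Java sketch that mirrors the null-safe behaviour of StringUtils.remove(str, remove).
// NullSafeRemove is a hypothetical helper for illustration, not part of Apache Commons Lang.
final class NullSafeRemove {

    private NullSafeRemove() {
        // Static helper only, no instances.
    }

    static String remove(final String str, final String remove) {
        // A null or empty source, or a null or empty search string, returns the source unchanged.
        if (str == null || str.isEmpty() || remove == null || remove.isEmpty()) {
            return str;
        }
        // Otherwise drop every literal occurrence of the search string.
        return str.replace(remove, "");
    }
}
```

So a run whose text is null just gives back null instead of throwing, which is why the null case is harmless at the point where the text is stripped.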

This answer is a simplification of the accepted answer to this question: Replacing a text in Apache POI XWPF

DwB
  • Shouldn't you keep calling getText(n) with increasing values of n until you get a null response? Seems like POI should have a better way to iterate over the text runs of the XWPFRun - but XWPFRun doesn't expose how many text runs are represented under the hood by the XWPFRun. I raised https://bz.apache.org/bugzilla/show_bug.cgi?id=66260 – PJ Fanning Sep 14 '22 at 01:57
  • getText(0) tells POI: retrieve the text for this element starting at index 0 – DwB Sep 14 '22 at 14:46