8

I want to find and replace text in doc and docx files using Java.

What I tried: I tried reading those files as text files, but that didn't work.

I have no idea how to proceed or what else to try. Can anyone point me in the right direction?

Elemental
ROHITHKUMAR A

5 Answers

10

I hope this solves your problem, my friend. I wrote this search and replace for docx files using Apache POI. I recommend reading the complete Apache POI documentation for more.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
import org.apache.poi.xwpf.usermodel.XWPFTable;
import org.apache.poi.xwpf.usermodel.XWPFTableCell;
import org.apache.poi.xwpf.usermodel.XWPFTableRow;

public class Find_Replace_DOCX {

    public static void main(String[] args) throws IOException {
        // Use HWPFDocument for an uploaded .doc file and XWPFDocument for .docx.
        // Reading from a stream leaves the source file untouched when the document is closed.
        try (FileInputStream in = new FileInputStream("d:\\1\\rpt.docx");
             XWPFDocument doc = new XWPFDocument(in);
             FileOutputStream out = new FileOutputStream("d:\\1\\output.docx")) {

            // Body paragraphs
            replaceInParagraphs(doc.getParagraphs(), "$$key$$", "ABCD"); // your content

            // Paragraphs inside table cells
            for (XWPFTable tbl : doc.getTables()) {
                for (XWPFTableRow row : tbl.getRows()) {
                    for (XWPFTableCell cell : row.getTableCells()) {
                        replaceInParagraphs(cell.getParagraphs(), "$$key$$", "abcd");
                    }
                }
            }

            doc.write(out);
        }
    }

    // Replaces key run by run; a key that Word has split across several runs is not matched.
    private static void replaceInParagraphs(List<XWPFParagraph> paragraphs, String key, String value) {
        for (XWPFParagraph p : paragraphs) {
            for (XWPFRun r : p.getRuns()) {
                String text = r.getText(0);
                if (text != null && text.contains(key)) {
                    r.setText(text.replace(key, value), 0);
                }
            }
        }
    }
}
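
One caveat with matching run by run, as the code above does: Word may store a placeholder such as $$key$$ in several adjacent runs, and then no single run contains it. The following is a minimal sketch of one way to handle that, not part of the original answer. It models the run texts as plain strings (so it runs without POI), and the class and method names are hypothetical. The run that holds the first matched character receives the replacement, and the matched parts of the following runs are cut out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: run texts are modeled as plain, non-null strings, not XWPFRun objects. */
final class RunSpanningReplace {

    /** Returns a copy of runTexts in which every occurrence of key, even one spanning runs, is replaced. */
    static List<String> replaceAcrossRuns(List<String> runTexts, String key, String value) {
        List<String> runs = new ArrayList<>(runTexts);
        if (key.isEmpty()) {
            return runs;
        }
        int from = 0; // search position in the concatenated text
        while (true) {
            String joined = String.join("", runs);
            int start = joined.indexOf(key, from);
            if (start < 0) {
                return runs;
            }
            int end = start + key.length(); // exclusive
            int offset = 0;
            for (int i = 0; i < runs.size(); i++) {
                String text = runs.get(i);
                int runStart = offset;
                int runEnd = offset + text.length();
                offset = runEnd;
                if (runEnd <= start || runStart >= end) {
                    continue; // this run does not overlap the match
                }
                String before = runStart < start ? text.substring(0, start - runStart) : "";
                String after = runEnd > end ? text.substring(end - runStart) : "";
                boolean holdsStart = runStart <= start; // run containing the first matched character
                runs.set(i, before + (holdsStart ? value : "") + after);
            }
            from = start + value.length(); // continue after the inserted value
        }
    }

    public static void main(String[] args) {
        System.out.println(replaceAcrossRuns(List.of("Hello $$", "key", "$$ world"), "$$key$$", "ABCD"));
    }
}
```

With POI, the same idea would presumably be applied per paragraph: collect r.getText(0) for each run (treating null as an empty string), call the helper, and write each result back with r.setText(text, 0). The replacement then takes on the formatting of the run where the placeholder started.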
KishanCS
4

These document formats are complex, and you almost certainly don't want to try to parse them yourself. I would strongly suggest that you take a look at the Apache POI libraries: they have functions to load and save the doc and docx formats, and the means to access and modify the content of the files.

They are well documented, open source, currently maintained and freely available.

In summary, use these libraries to: a) load the file, b) go through the content of the file programmatically and modify it as you need (i.e. do the search and replace), and c) save it back to disk.
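
Step a) depends on which of the two formats the file really has, since POI uses HWPF classes for doc and XWPF classes for docx. Neither format is plain text: a doc file is an OLE2 compound file and a docx file is a ZIP package, which is why reading them as text fails. As a rough sketch (the class and method names are hypothetical, and only the standard library is used), the first bytes of the file are enough to tell the two containers apart:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper: guesses the Word container format from the file header. */
final class WordFormatSniffer {

    enum Format { DOC_OLE2, DOCX_ZIP, UNKNOWN }

    // OLE2 compound file signature, used by legacy .doc files.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature ("PK" 03 04), used by .docx packages.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    /** Classifies a header; a ZIP signature only proves the file is a ZIP, not that it is a valid docx. */
    static Format sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.DOC_OLE2;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Format.DOCX_ZIP;
        }
        return Format.UNKNOWN;
    }

    /** Reads up to 8 bytes from the file and classifies them. */
    static Format sniff(Path file) throws IOException {
        byte[] buffer = new byte[8];
        try (InputStream in = Files.newInputStream(file)) {
            int read = in.readNBytes(buffer, 0, buffer.length);
            return sniff(Arrays.copyOf(buffer, read));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

A caller could then open the file with HWPFDocument for DOC_OLE2 and with XWPFDocument for DOCX_ZIP, and reject anything else. Checking the content rather than the extension also catches files that were merely renamed.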

Elemental
1

If you want to use Docx4J as the library to parse .docx files, I created a utility library for it that does search and replace: https://github.com/phip1611/docx4j-search-and-replace-util

WordprocessingMLPackage template = WordprocessingMLPackage.load(new FileInputStream(new File("document.docx")));

Docx4JSRUtil.searchAndReplace(template, Map.of(
    "${NAME}", "Philipp",
    "${SURNAME}", "Schuster",
    "${PLACE_OF_BIRTH}", "GERMANY"
));
// that's it; you can now save `template`, export it as PDF or whatever you want to do
phip1611
  • Additional comments on this can be found at another question/answer: https://stackoverflow.com/a/60384502/2891595 – phip1611 Feb 27 '20 at 17:55
1

For Android Kotlin users:

private fun modifyDocFile(
    toReplace: String,
    newText: String,
    fileName: String,
    output: String
) {
    try {
        val document = XWPFDocument(OPCPackage.open(fileName))

        // Runs in body paragraphs, plus runs in paragraphs inside table cells.
        val bodyRuns = document.paragraphs.flatMap { it.runs }
        val tableRuns = document.tables
            .flatMap { it.rows }
            .flatMap { it.tableCells }
            .flatMap { it.paragraphs }
            .flatMap { it.runs }

        (bodyRuns + tableRuns).forEach { run ->
            // getText(0) returns null for runs without text, hence the safe call.
            run.getText(0)?.let { text ->
                if (text.contains(toReplace)) {
                    run.setText(text.replace(toReplace, newText), 0)
                }
            }
        }

        // use {} closes the stream even if write() throws.
        FileOutputStream(output).use { document.write(it) }
    } catch (e: IOException) {
        e.printStackTrace()
    }
}
Nrb
0

I'm using the code like this, it looks really good, thanks.

import java.io.FileOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.Map;

import org.apache.commons.collections4.map.HashedMap;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class TesteMain {

public static void main(String[] args) {

    Map<String, String> change = new HashedMap<>();
    change.put("nomeContratante", "Jo Luis Pinto"); // word to be replaced
    change.put("customer1", "Maikon");
    String pathOriginal = "C:\\testeDocx\\"; // path template
    String templateDoc = "doc1.docx"; // original document will not be changed
    changeDocx(change, pathOriginal, templateDoc);

}

private static void changeDocx(Map<String, String> change, String pathOriginal, String templateDoc) {
    try {
        // Copy the template to the operating system temp folder so the original is never modified.
        // Note: the TEMP environment variable and the "\\" separator are Windows-specific.
        String tempPath = System.getenv("TEMP") + "\\temp.docx";
        Path dirOrigem = Paths.get(pathOriginal + templateDoc);
        Path dirDestino = Paths.get(tempPath);
        Files.copy(dirOrigem, dirDestino, StandardCopyOption.REPLACE_EXISTING);

        // The output is a new file with a modified name in the original directory.
        try (XWPFDocument doc = new XWPFDocument(OPCPackage.open(tempPath));
             FileOutputStream out = new FileOutputStream(pathOriginal + "changed_" + templateDoc)) {
            for (XWPFParagraph p : doc.getParagraphs()) {
                List<XWPFRun> runs = p.getRuns();
                if (runs != null) {
                    for (XWPFRun r : runs) {
                        String text = r.getText(0);
                        if (text == null) {
                            continue; // run without text
                        }
                        for (Map.Entry<String, String> entry : change.entrySet()) { // iterate over the map
                            if (text.contains(entry.getKey())) {
                                text = text.replace(entry.getKey(), entry.getValue()); // replace the value
                                r.setText(text, 0);
                            }
                        }
                    }
                }
            }

            /*
             * Table change:
             * for (XWPFTable tbl : doc.getTables()) {
             *     for (XWPFTableRow row : tbl.getRows()) {
             *         for (XWPFTableCell cell : row.getTableCells()) {
             *             for (XWPFParagraph p : cell.getParagraphs()) {
             *                 for (XWPFRun r : p.getRuns()) {
             *                     String text = r.getText(0);
             *                     if (text != null && text.contains("$$key$$")) {
             *                         text = text.replace("$$key$$", "abcd");
             *                         r.setText(text, 0);
             *                     }
             *                 }
             *             }
             *         }
             *     }
             * }
             */

            doc.write(out);
        }
    } catch (Exception e) {
        System.out.println(e.getMessage());
    }
}

}
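
The copy step above builds the temporary path from the TEMP environment variable and a fixed file name, which ties it to Windows and lets two concurrent runs overwrite each other's copy. A possible alternative, sketched here with only the standard library (the class and method names are hypothetical), is to let Files.createTempFile pick a unique file in the platform's temp directory:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: copies a template to a uniquely named temporary file. */
final class TemplateCopy {

    /** Returns the path of a fresh temp file holding a copy of the template. */
    static Path copyToTemp(Path template) throws IOException {
        // createTempFile creates an empty file in the default temp directory of the platform.
        Path temp = Files.createTempFile("template-", ".docx");
        // REPLACE_EXISTING is needed because the empty temp file already exists.
        Files.copy(template, temp, StandardCopyOption.REPLACE_EXISTING);
        return temp;
    }
}
```

The returned path could then be passed to OPCPackage.open(temp.toString()) in place of tempPath, and removed with Files.deleteIfExists(temp) once the changed document has been written.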