I need to open a .dotx template, replace parts of its content with my own data, and then return the generated .docx document.

For example, the string "name" in the .dotx file should be replaced by "John" in the generated .docx file.

public static void main( String[] args ) throws IOException
{
    String inputFile="D:/Copies 2.dotx";
//  String outputeFile="D:/test.txt";
    String outputeFile="D:/test.docx";
    File inFile=new File(inputFile);
    File ouFile=new File(outputeFile);
    Map<String,String> hm = new HashMap<String,String>();
    hm.put("Namur","Youssef");
    App a = new App();
    a.changeData(inFile,ouFile, hm);    
}
private void changeData(File targetFile,File out, Map<String,String> substitutionData) throws IOException{
    BufferedReader br = null;
    String docxTemplate = "";
    try {
        br = new BufferedReader(new InputStreamReader(new FileInputStream(targetFile)));
        String temp;
        while( (temp = br.readLine()) != null) {
            docxTemplate = docxTemplate + temp;   
        }
        br.close();
    } 
    catch (IOException e) {
        br.close();
        throw e;
    }

    Iterator<Entry<String, String>> substitutionDataIterator = substitutionData.entrySet().iterator();
    while(substitutionDataIterator.hasNext()){
        Map.Entry<String,String> pair = (Map.Entry<String,String>)substitutionDataIterator.next();
        if(docxTemplate.contains(pair.getKey())){
            if(pair.getValue() != null)
                docxTemplate = docxTemplate.replace(pair.getKey(), pair.getValue());
            else
                docxTemplate = docxTemplate.replace(pair.getKey(), "NEDOSTAJE");
        }
    }

    FileOutputStream fos = null;
    try{
        fos = new FileOutputStream(out);
        fos.write(docxTemplate.getBytes());
        fos.close();
    }
    catch (IOException e) {
        fos.close();
        throw e;
    }

}

Can someone give me some advice on this?

PS: I'm using Apache POI 3.16.

youssef elhayani
  • Where do you think you are using any `apache poi` classes in your current code? – Axel Richter Feb 08 '19 at 11:26
  • Yes, indeed, this code does not use Apache POI. It compiles and runs correctly, but when I try to open the output file I get this error: "we are sorry we couldn't open your test1 because we found a problem with content". I'm wondering whether there is a way to replace content in a Word template document with Java using POI. – youssef elhayani Feb 08 '19 at 11:58
  • First thing to know: What is a `*.dotx` file? Is it simply a text file? No, it is a file in Office Open XML format, which is a ZIP archive containing a special directory structure with other files (mainly XML files) stored in it. So you cannot simply handle it like a text file. That's what `apache poi` is made for. – Axel Richter Feb 08 '19 at 12:09
  • Second thing to know: What is the difference between a `*.dotx` and a `*.docx` file? It is mainly the content type. So when you read a `*.dotx` and then save it as `*.docx`, you also need to change the content-type settings within the file. See https://stackoverflow.com/questions/54377200/converting-a-file-with-dotx-extension-template-to-docx-word-file/54377500#54377500. – Axel Richter Feb 08 '19 at 12:09
  • @AxelRichter Thanks a lot, I think you're right: I should convert it first! – youssef elhayani Feb 08 '19 at 12:18
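
To make the two comments above concrete, here is a minimal sketch of the content-type switch using only `java.util.zip` instead of POI. The class `DotxToDocx` and its `convert` method are hypothetical names made up for this illustration, and the sketch assumes `[Content_Types].xml` is UTF-8 encoded. With POI itself, `OPCPackage.replaceContentType` should serve the same purpose. The sketch copies every entry of the package unchanged except `[Content_Types].xml`, where the template content type is swapped for the document one.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of Apache POI.
class DotxToDocx {
    static final String TEMPLATE_TYPE =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.template.main+xml";
    static final String DOCUMENT_TYPE =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";

    // Copies the ZIP package entry by entry, rewriting only [Content_Types].xml.
    static void convert(InputStream dotx, OutputStream docx) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(dotx);
             ZipOutputStream zout = new ZipOutputStream(docx)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                byte[] data = readAll(zin);
                if (entry.getName().equals("[Content_Types].xml")) {
                    // Assumption: the part is UTF-8 encoded XML.
                    String xml = new String(data, StandardCharsets.UTF_8);
                    data = xml.replace(TEMPLATE_TYPE, DOCUMENT_TYPE)
                              .getBytes(StandardCharsets.UTF_8);
                }
                zout.putNextEntry(new ZipEntry(entry.getName()));
                zout.write(data);
                zout.closeEntry();
            }
        }
    }

    // Reads the current ZIP entry to its end.
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }
}
```

This only handles the `*.dotx` to `*.docx` conversion. The text itself lives in parts such as `word/document.xml`, and replacing placeholders there is still easier through POI's `XWPFDocument` than by editing the XML as a string.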

1 Answer

Parsing a dotx/docx file is not that simple, but Apache POI can do it with some effort. Open the file like this:

XWPFDocument doc = new XWPFDocument(OPCPackage.open("-you docx/dotx file-path-"));

This loads an existing file. To walk through its content, you work with these two element types:

XWPFParagraph
XWPFTable

You can iterate over the paragraphs like this:

for (XWPFParagraph p : doc.getParagraphs()) {
    List<XWPFRun> runs = p.getRuns();
    if (runs != null) {
        for (XWPFRun r : runs) {
            // getText(0) can return null when the run has no text
            String text = r.getText(0);
            if (text != null && text.contains("$$key$$")) {
                // replace the same placeholder that was just tested for
                text = text.replace("$$key$$", "ABCD"); // your content
                r.setText(text, 0);
            }
        }
    }
}

To iterate over the tables:

for (XWPFTable tbl : doc.getTables()) {
    for (XWPFTableRow row : tbl.getRows()) {
        for (XWPFTableCell cell : row.getTableCells()) {
            // each cell holds its own paragraphs and runs
            for (XWPFParagraph p : cell.getParagraphs()) {
                for (XWPFRun r : p.getRuns()) {
                    String text = r.getText(0);
                    if (text != null && text.contains("$$key$$")) {
                        // replace the same placeholder that was just tested for
                        text = text.replace("$$key$$", "abcd");
                        r.setText(text, 0);
                    }
                }
            }
        }
    }
}
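
The loops above hardcode one placeholder. Since the question passes a `Map<String,String>` of substitutions, a small helper that applies the whole map to one run's text may be handy. This is only a sketch: `Placeholders`, `apply` and `MISSING` are hypothetical names, and the `"NEDOSTAJE"` fallback for null values is taken from the question's code. The helper has no POI dependency, so it can be tested on its own.

```java
import java.util.Map;

// Hypothetical helper, not part of Apache POI.
class Placeholders {
    // Fallback used when a key maps to null, as in the question's code.
    static final String MISSING = "NEDOSTAJE";

    // Returns text with every map key replaced by its value (literal match, no regex).
    static String apply(String text, Map<String, String> substitutions) {
        if (text == null) {
            return null; // a run without text stays untouched
        }
        String result = text;
        for (Map.Entry<String, String> e : substitutions.entrySet()) {
            String value = e.getValue() != null ? e.getValue() : MISSING;
            result = result.replace(e.getKey(), value);
        }
        return result;
    }
}
```

Inside either loop, the body would then shrink to reading `r.getText(0)`, and, if it is not null, calling `r.setText(Placeholders.apply(text, hm), 0)`. Note that Word may split a placeholder over several runs, in which case a per-run replacement like this will not find it.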

Finally, write the modified document to the target file:

doc.write(new FileOutputStream("-target-path-"));

This requires the Apache POI dependencies, for example:

        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi</artifactId>
            <version>3.17</version>
        </dependency>

        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>3.17</version>
        </dependency>

You may need some more libraries on your build path; check the exceptions you get and add what is missing.

You can explore further with this link:

http://poi.apache.org/apidocs/dev/org/apache/poi/xwpf/usermodel/XWPFRun.html#setText%28java.lang.String%29

KishanCS
  • I just tried this code, but when I open the output file I get this error: "we are sorry we couldn't open your test1 because we found a problem with content" – youssef elhayani Feb 08 '19 at 11:52
  • Check the file content first, then check what you are writing. You should read the docx/dotx and write it as docx. – KishanCS Feb 08 '19 at 12:14
  • @Kishan C S: You cannot read `*.dotx` and write that as `*.docx` without changing `[Content_Types].xml`. `Libreoffice` or `OpenOffice` `Writer` will tolerate wrong content-types but `Microsoft Word` will not. – Axel Richter Feb 08 '19 at 12:19
  • Then he has to update `[Content_Types].xml` by opening the file. Thank you, Axel Richter. – KishanCS Feb 08 '19 at 12:23
  • Yes indeed, I should check the file content first, thank you @KishanCS. – youssef elhayani Feb 10 '19 at 10:04