7

I want to use the Selenium Chrome driver to generate a PDF from an HTML file.

I tried it from the command line:

chrome.exe --headless --disable-gpu --print-to-pdf   file:///C:invoiceTemplate2.html

and it works perfectly, so I wanted to do the same thing in Java. Here is my code:

System.setProperty("webdriver.chrome.driver", "C:/work/chromedriver.exe");
ChromeOptions options = new ChromeOptions();
options.addArguments("--headless", "--disable-gpu", "--print-to-pdf",
            "file:///C:/invoiceTemplate2.html");
WebDriver driver = new ChromeDriver(options);
driver.quit();

The server starts with no problem, but Chrome opens with multiple tabs, one for each argument I specified in the options.

Is there a solution to this? Thanks.

OddDev

4 Answers

4

UPDATE 31-03-2023: One of Chrome's recent updates added extra security measures, and the solution below stopped working because the WebSocket connection could not be established. To fix this, we added a new argument to the Chrome options:

options.addArguments("--remote-allow-origins=*");

UPDATE 31-05-2021: We noticed that the original workaround did not always work properly, so we switched to Selenium + ChromeDriver:

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

public void generatePdf(Path inputPath, Path outputPath) throws Exception
{
    ChromeOptions options = new ChromeOptions();
    options.addArguments("--headless", "--disable-gpu", "--run-all-compositor-stages-before-draw");
    ChromeDriver chromeDriver = new ChromeDriver(options);
    try
    {
        // Chrome expects a URL, so convert the local path to a file:// URI
        chromeDriver.get(inputPath.toUri().toString());

        // An empty parameter map means Chrome's default print settings are used
        Map<String, Object> params = new HashMap<>();
        String command = "Page.printToPDF";
        Map<String, Object> output = chromeDriver.executeCdpCommand(command, params);

        // The PDF is returned base64-encoded under the "data" key
        byte[] byteArray = Base64.getDecoder().decode((String) output.get("data"));
        Files.write(outputPath, byteArray);
    }
    catch (Exception e)
    {
        e.printStackTrace(System.err);
        throw e;
    }
    finally
    {
        // Quit the driver so no zombie Chrome processes are left behind
        chromeDriver.quit();
    }
}

If this method will be called frequently, I suggest reusing the driver object, because it takes a while to initialize.

Remember to close or quit the driver to avoid leaving zombie Chrome processes behind, and remember to install ChromeDriver on your machine.


Original Solution:

Since I could not get the desired outcome using ChromeDriver, my workaround was to call headless Chrome on the command line from my Java program.

This works on Windows, but changing the paths used to build the command should make it work on Linux too.

public void generatePdf(Path inputPath, Path outputPath) throws Exception {

    try {

        String chromePath = "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe";

        // Pass each argument separately so that paths containing spaces are not split
        ProcessBuilder builder = new ProcessBuilder(
                chromePath,
                "--headless",
                "--disable-gpu",
                "--run-all-compositor-stages-before-draw",
                "--print-to-pdf=" + outputPath.toString(),
                inputPath.toString());

        // Runs the "chrome" Windows command
        Process process = builder.start();
        process.waitFor(); // Waits for the command's execution to finish

    } catch (Exception e) {

        e.printStackTrace(System.err);
        throw e;

    } finally {

        // Marks both temporary files for deletion when the JVM exits
        inputPath.toFile().deleteOnExit();
        outputPath.toFile().deleteOnExit();

    }
}

Note: both input and output paths are temporary files created with NIO.
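As a rough illustration of that note, the two temporary files could be created with `java.nio.file.Files` along these lines. The class name `TempPdfFiles`, the `invoice-` prefix, and the sample HTML are assumptions made for this sketch and do not come from the original answer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: creates the temporary input and output files
// that a generatePdf(inputPath, outputPath) method could work with.
final class TempPdfFiles {

    /** Writes the HTML to a new temporary .html file and returns its path. */
    static Path createInput(String html) throws IOException {
        Path input = Files.createTempFile("invoice-", ".html");
        Files.write(input, html.getBytes(StandardCharsets.UTF_8));
        return input;
    }

    /** Creates an empty temporary .pdf file for Chrome to overwrite. */
    static Path createOutput() throws IOException {
        return Files.createTempFile("invoice-", ".pdf");
    }

    public static void main(String[] args) throws IOException {
        Path input = createInput("<html><body><h1>Invoice</h1></body></html>");
        Path output = createOutput();
        // Same cleanup strategy as the answer: delete both files when the JVM exits
        input.toFile().deleteOnExit();
        output.toFile().deleteOnExit();
        // These two paths would then be handed to generatePdf(input, output)
        System.out.println(input + " -> " + output);
    }
}
```

Creating the output file up front reserves a unique name; Chrome then overwrites it when it writes the PDF.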

Caponte
  • Thanks - the "executeCdpCommand" version works great for me. If you want to set additional parameters, here is the list of available ones: https://vanilla.aslushnikov.com/?Page.printToPDF – max Feb 03 '23 at 13:01
  • @KJ I looked into it, in my initial tests it seems to still work without that option. However, I did not find information about the deprecation. Could you share some references to have a better idea about it? – Caponte Apr 03 '23 at 12:53
3

This can indeed be done with Selenium and ChromeDriver (tested with Chrome version 85), but using the "print-to-pdf" option when starting Chrome from the webdriver is not the solution.

The thing to do is to use the command execution functionality of ChromeDriver:

https://www.selenium.dev/selenium/docs/api/java/org/openqa/selenium/remote/RemoteWebDriver.html#execute-java.lang.String-java.util.Map-

There is a command called Page.printToPDF that provides PDF output functionality. A dictionary containing the item "data", with the resulting PDF in base-64-encoded format, is returned.

Unfortunately, I do not have a full Java example, but in this answer, there is a C# example (Selenium methods are named differently in C# compared to Java, but the principle should be the same):

https://stackoverflow.com/a/63970792/2416627

The Page.printToPDF command in Chrome is documented here:

https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-printToPDF
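For Java, the two pieces around the command (building the parameter map and decoding the returned "data" item) might look roughly like the sketch below. It uses only the JDK; the class name `PrintToPdfSupport` is hypothetical, and the result map in `main` is a stand-in for what `chromeDriver.executeCdpCommand("Page.printToPDF", params)` would return. The parameter names (`landscape`, `printBackground`, `paperWidth`, `paperHeight`) are taken from the protocol documentation linked above; the values are only examples.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper around the Page.printToPDF command.
final class PrintToPdfSupport {

    /** Builds an example parameter map for Page.printToPDF. */
    static Map<String, Object> printParams() {
        Map<String, Object> params = new HashMap<>();
        params.put("landscape", false);
        params.put("printBackground", true); // include CSS backgrounds
        params.put("paperWidth", 8.27);      // A4 width in inches
        params.put("paperHeight", 11.69);    // A4 height in inches
        return params;
    }

    /** Decodes the base64 "data" item of the command result and writes it to a file. */
    static void savePdf(Map<String, Object> result, Path target) throws IOException {
        String base64 = (String) result.get("data");
        Files.write(target, Base64.getDecoder().decode(base64));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the map returned by executeCdpCommand("Page.printToPDF", printParams())
        Map<String, Object> result = new HashMap<>();
        byte[] fakePdf = "%PDF-1.4 demo".getBytes(StandardCharsets.UTF_8);
        result.put("data", Base64.getEncoder().encodeToString(fakePdf));

        Path target = Files.createTempFile("demo-", ".pdf");
        savePdf(result, target);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

With a real driver, only the stand-in map in `main` would change; the decoding step stays the same.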

Otto G
  • This is not efficient when you want to convert a very large HTML document to PDF. It would be nice to pass only the source and destination files and let Chrome handle the streaming, instead of holding a base64 string (a 30% size increase over the original data) in memory – TheRealChx101 Mar 18 '23 at 21:27
  • Actually, the Page.printToPDF documentation at https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-printToPDF now lists an experimental option `transferMode="ReturnAsStream"` that should allow for precisely that, as far as output is concerned. (Not that Chrome would literally handle the streaming to the output file, but it should be possible to stream directly to a PDF file from the Java or C# code.) I have not tried that option, though. – Otto G Aug 28 '23 at 14:34
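
The streaming idea from the comment above could be sketched as follows: request the PDF with `transferMode` set to `ReturnAsStream`, then read the returned stream handle chunk by chunk with the protocol's `IO.read` command and release it with `IO.close`. This has not been run against a real browser. The class name `PdfStreamReader` is hypothetical, and the `cdp` parameter is an assumed stand-in for a CDP executor such as `chromeDriver::executeCdpCommand`; the `main` method uses a fake executor so the sketch runs with the JDK alone.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical helper: copies a PDF from a CDP stream handle to an OutputStream.
final class PdfStreamReader {

    static void streamPdf(BiFunction<String, Map<String, Object>, Map<String, Object>> cdp,
                          OutputStream out) throws IOException {
        // Ask Chrome for a stream handle instead of one large base64 string
        Map<String, Object> printParams = new HashMap<>();
        printParams.put("transferMode", "ReturnAsStream");
        String handle = (String) cdp.apply("Page.printToPDF", printParams).get("stream");

        Map<String, Object> handleParams = new HashMap<>();
        handleParams.put("handle", handle);
        try {
            boolean eof = false;
            while (!eof) {
                // Each IO.read call returns the next chunk plus an end-of-file flag
                Map<String, Object> chunk = cdp.apply("IO.read", handleParams);
                String data = (String) chunk.get("data");
                if (data != null) {
                    byte[] bytes = Boolean.TRUE.equals(chunk.get("base64Encoded"))
                            ? Base64.getDecoder().decode(data)
                            : data.getBytes(StandardCharsets.UTF_8);
                    out.write(bytes);
                }
                eof = Boolean.TRUE.equals(chunk.get("eof"));
            }
        } finally {
            // Release the handle on the browser side
            cdp.apply("IO.close", handleParams);
        }
    }

    public static void main(String[] args) throws IOException {
        // Fake executor serving two chunks, standing in for a real browser session
        int[] reads = {0};
        BiFunction<String, Map<String, Object>, Map<String, Object>> fake = (cmd, params) -> {
            Map<String, Object> r = new HashMap<>();
            if (cmd.equals("Page.printToPDF")) {
                r.put("stream", "1");
            } else if (cmd.equals("IO.read")) {
                String part = reads[0]++ == 0 ? "%PDF-" : "1.4";
                r.put("base64Encoded", true);
                r.put("data", Base64.getEncoder().encodeToString(part.getBytes(StandardCharsets.UTF_8)));
                r.put("eof", reads[0] == 2);
            }
            return r;
        };
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        streamPdf(fake, out);
        System.out.println(out.toString("UTF-8"));
    }
}
```

Each chunk is still base64-decoded in memory, but only one chunk at a time, so passing a `FileOutputStream` as `out` avoids holding the whole document at once.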
-2

This code saves the page as a PDF using Selenium in C#:

using System;
using System.Linq;
using System.Threading;
using System.Windows.Forms;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

    protected void PDFconversion(ChromeDriver driver, string root, string rootTemp)
    {
        //Grid.Rows.Add(TxtBxName.Text, TxtBxAddress.Text);
        try
        {
            IJavaScriptExecutor js = (IJavaScriptExecutor)driver;
            Thread.Sleep(500);
            js.ExecuteScript("setTimeout(function() { window.print(); }, 0);");
            Thread.Sleep(500);
            driver.SwitchTo().Window(driver.WindowHandles.Last());
            Thread.Sleep(500);
            string JSPath = "document.querySelector('body>print-preview-app').shadowRoot.querySelector('#sidebar').shadowRoot.querySelector('#destinationSettings').shadowRoot.querySelector('#destinationSelect').shadowRoot.querySelector('print-preview-settings-section:nth-child(9)>div>select>option:nth-child(3)')";
            Thread.Sleep(500);
            IWebElement PrintBtn = (IWebElement)js.ExecuteScript($"return {JSPath}");
            Thread.Sleep(500);
            PrintBtn.Click();
            string JSPath1 = "document.querySelector('body>print-preview-app').shadowRoot.querySelector('#sidebar').shadowRoot.querySelector('print-preview-button-strip').shadowRoot.querySelector('cr-button.action-button')";
            Thread.Sleep(1000);
            IWebElement PrintBtn1 = (IWebElement)js.ExecuteScript($"return {JSPath1}");
            PrintBtn1.Click();
            Thread.Sleep(1000);
            SendKeys.Send("{HOME}");
            SendKeys.Send(rootTemp + "\\" + "result.pdf"); // Path
            SendKeys.Send("{TAB}");
            SendKeys.Send("{TAB}");
            SendKeys.Send("{TAB}");
            SendKeys.Send("{ENTER}");
            Thread.Sleep(1000);
       
        }
        catch (Exception ex){}
    }
SternK
-5

You have to do two things.

First: Take a screenshot using Selenium.

Second: Convert that screenshot to PDF using any PDF tool, such as iText. Here is a complete example of how to do this.

Step 1: Download the jar of itext from here and add the jar file to your build path.

Step 2: Add this code to your project.

ChromeOptions options = new ChromeOptions();
options.addArguments("disable-infobars");
options.addArguments("--print-to-pdf");

WebDriver driver = new ChromeDriver(options);
driver.get("file:///C:/invoiceTemplate2.html");

try {
    File screenshot = ((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE);
    FileUtils.copyFile(screenshot, new File("screenshot.png"));
    Document document = new Document(PageSize.A4, 20, 20, 20, 20);
    PdfWriter.getInstance(document, new FileOutputStream("webaspdf.pdf"));
    document.open();
    Image image = Image.getInstance("screenshot.png");
    document.add(image);
    document.close();
}
catch (Exception e2) {
    e2.printStackTrace();
}
finally {
    // Quit the driver so no Chrome process is left running
    driver.quit();
}

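Note: To use the iText package and the other classes mentioned above, add the required imports to your code.

import com.itextpdf.text.Document;
import com.itextpdf.text.Image;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.pdf.PdfWriter;
import java.io.File;
import java.io.FileOutputStream;
import org.apache.commons.io.FileUtils;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;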
Cellcon
Mahmud Riad