I am trying to convert .docx files to .pdf files using Unoconv. LibreOffice is installed on my server, and the same script works for another website on that server.

Adding the line use Unoconv\Unoconv; results in an HTTP ERROR 500.

Does anyone know why I get an HTTP ERROR 500?

Here is my script:

<?php
    require './Unoconv.php';
    use Unoconv\Unoconv;
        
    $originFilePath = './uf/invoice/17/word/202100021.docx';
    $outputDirPath  = './uf/invoice/17/pdf/202100021.pdf';
    
    Unoconv::convertToPdf($originFilePath, $outputDirPath);

    header("Content-type:application/pdf");
    header("Content-Disposition:attachment;filename=202100021.pdf");
?>

Here is my Unoconv.php script:

<?php

namespace Unoconv;

class Unoconv {

    public static function convert($originFilePath, $outputDirPath, $toFormat)
    {
        $command = 'unoconv --format %s --output %s %s';
        $command = sprintf($command, $toFormat, $outputDirPath, $originFilePath);
        system($command, $output);

        return $output;
    }

    public static function convertToPdf($originFilePath, $outputDirPath)
    {
        return self::convert($originFilePath, $outputDirPath, 'pdf');
    }

    public static function convertToTxt($originFilePath, $outputDirPath)
    {
        return self::convert($originFilePath, $outputDirPath, 'txt');
    }

}
?>
John

3 Answers

Start by wrapping your code in try...catch to get the error message first:

<?php
// `use` is only valid at the top level of a file;
// placing it inside the try block is itself a parse error.
use Unoconv\Unoconv;

try {
    require 'Unoconv.php';
    
    $map1 = $_SESSION['companyid'];
    $filename = $result1['filename'];
    
    $originFilePath = './uf/doc/'.$map1.'/word/'.$filename.'.docx';
    $outputDirPath  = './uf/doc/'.$map1.'/pdf/'.$filename.'.pdf';
    
    Unoconv::convertToPdf($originFilePath, $outputDirPath);
    
    header("Content-type:application/pdf");
    header("Content-Disposition:attachment;filename=".$filename.".pdf");
    // Send the converted file itself, not just the headers
    readfile($outputDirPath);
} catch (\Exception $e) {
    die($e->getMessage());
}
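
A try/catch cannot report a parse error in the file that contains it, because that file never starts executing. One possible workaround is to load the script from a second file that is known to parse. The sketch below assumes PHP 7 or later, where a parse error in a required file is thrown as \ParseError. The function name loadAndReport and the file name convert.php are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: require a script and return a description of any
// error thrown while loading or running it (null when nothing was thrown).
function loadAndReport(string $path): ?string
{
    try {
        require $path;
        return null;
    } catch (\Throwable $e) {
        // \Throwable covers \Exception and \Error, including \ParseError
        return sprintf(
            '%s: %s in %s line %d',
            get_class($e),
            $e->getMessage(),
            $e->getFile(),
            $e->getLine()
        );
    }
}

// Assumed name of the script that returns the HTTP 500
$target = __DIR__ . '/convert.php';
if (is_file($target)) {
    $report = loadAndReport($target);
    if ($report !== null) {
        header('Content-Type: text/plain');
        echo $report, "\n";
    }
}
```

Requesting this wrapper instead of the failing script should show the class, message, file and line of the error rather than a bare 500 page.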
Alex
  • @John that is kind of impossible. Try `die('test')` on the very first line of your script. If we can't `catch` it, the problem is probably outside your script. – Alex Sep 13 '21 at 17:14
  • `die('test')` did not change anything. I still get an Error 500. The script in my question is exactly what I am using, nothing more, nothing less. – John Sep 13 '21 at 20:16
  • @John that is impossible. Delete all the content and keep only `die('test');`. It seems the request never reaches this script at all. – Alex Sep 13 '21 at 20:20
  • But it is happening :). With only `die('test');` I get the text `test` in my browser :) – John Sep 13 '21 at 21:31
  • @John following Noelpotnic's comment, I've adjusted my code. Try it again to find the real error message that breaks your page. – Alex Sep 14 '21 at 13:14
  • I still get an Error 500 when I run the script with try & catch. But I have a small update: when I run the script without try & catch, my browser downloads a file named 202100021.pdf with no content. I can also see that the output folder is empty; no files were converted. – John Sep 14 '21 at 22:27
  • @John it is either a 500 error or the file is sent. If the file is sent to the browser, I doubt that you have a 500 error. It seems you've missed the part that sends the real content: `readfile(...)` – Alex Sep 15 '21 at 01:34
  • I see an Error 500 page in Chrome. In Firefox I see a blank page, but the console shows an Error 500. The problem is related to the `use Unoconv\Unoconv;` line: with this line I get an Error 500 and no PDF file is generated. – John Sep 15 '21 at 13:52
  • `try... catch` should output some error message – Alex Sep 15 '21 at 13:58
  • @John join zoom meeting and share your screen I can try to help if you are online now https://us02web.zoom.us/j/89783185489?pwd=SnRXZFpUUk5tcHhESVFHUldCbHNYdz09 – Alex Sep 15 '21 at 14:02

@Alex is correct about wrapping the code in try/catch first, but shouldn't the syntax be:

...
} catch(\Exception $e){
...
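
The leading backslash matters once the code sits inside a namespace: an unqualified Exception is then resolved relative to that namespace and never matches the built-in class. A minimal sketch of the difference, using a hypothetical namespace Demo and function name tryCatch (in a file with no namespace declaration, both spellings behave the same):

```php
<?php
namespace Demo;

// Hypothetical demo: shows which catch clause matches inside a namespace.
function tryCatch(): string
{
    try {
        throw new \RuntimeException('boom');
    } catch (Exception $e) {
        // Resolves to Demo\Exception, which does not exist, so it never matches
        return 'caught unqualified';
    } catch (\Exception $e) {
        // Fully qualified: matches the built-in \Exception hierarchy
        return 'caught qualified';
    }
}

echo tryCatch(), "\n"; // prints "caught qualified"
```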
Noelpotnic

I've observed that LibreOffice can be a little quirky when doing conversions, especially when running in headless mode from a webserver account.

The simplest thing to try is to modify unoconv to use the same Python binary that is shipped with LibreOffice:

#!/usr/bin/env python

should become (after checking where LibreOffice is installed on your system)

#!/opt/libreoffice7.1/program/python

Otherwise, I have worked around the problem by invoking libreoffice directly (without Unoconv):

    $dir    = dirname($docfile);
    // Libreoffice saves here
    $pdf    = $dir . DIRECTORY_SEPARATOR . basename($docfile, '.docx').'.pdf';
    $ret = shell_exec("export HOME={$dir} && /usr/bin/libreoffice --headless --convert-to pdf --outdir '{$dir}' '{$docfile}' 2>&1");
    if (file_exists($pdf)) {
        rename($pdf, $realPDFName);
    } else {
        return false;
    }
    return true;

Note the export HOME={$dir} directive: it ensures that temporary lock files are saved in the current directory where, presumably, the web server has full permissions. If this requirement isn't met, LibreOffice fails silently. At least, I observed it failing and could not find an error message anywhere; I only found out what was going on by using strace.
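
The fragment above interpolates the paths directly into the shell command, which breaks on file names containing spaces or quotes. A minimal sketch of building the same command with escapeshellarg(), assuming the /usr/bin/libreoffice path used in this answer; the function name buildConvertCommand is hypothetical:

```php
<?php
// Hypothetical helper: builds the LibreOffice command line from this answer,
// quoting every path with escapeshellarg() before it reaches the shell.
function buildConvertCommand(string $docfile, string $binary = '/usr/bin/libreoffice'): string
{
    // LibreOffice writes the PDF and its lock files into this directory
    $dir = dirname($docfile);

    return sprintf(
        'export HOME=%s && %s --headless --convert-to pdf --outdir %s %s 2>&1',
        escapeshellarg($dir),
        escapeshellarg($binary),
        escapeshellarg($dir),
        escapeshellarg($docfile)
    );
}

// Print the command for inspection; pass it to shell_exec() to run it
echo buildConvertCommand('./uf/invoice/17/word/202100021.docx'), "\n";
```

Printing the command first makes it easy to paste it into a terminal under the web server account and see whether the conversion itself works.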

So your code would become:

$originFilePath = './uf/invoice/17/word/202100021.docx';
$outputDirPath  = './uf/invoice/17/pdf/202100021.pdf';

$dir    = dirname($originFilePath);
$pdf    = $dir . DIRECTORY_SEPARATOR . basename($originFilePath, '.docx').'.pdf';
$ret = shell_exec("export HOME={$dir} && /usr/bin/libreoffice --headless --convert-to pdf --outdir '{$dir}' '{$originFilePath}' 2>&1");
// $ret will contain any errors
if (!file_exists($pdf)) {
    die("Conversion error: " . htmlentities($ret));
}
rename($pdf, $outputDirPath);

header("Content-type:application/pdf");
header("Content-Disposition:attachment;filename=202100021.pdf");
readfile($outputDirPath);

I assume that libreoffice is available at the usual alternatives link, "/usr/bin/libreoffice"; otherwise you need to find its path with the terminal command "which libreoffice". Or, from a PHP script:

<?php
header('Content-Type: text/plain');
print "If this works:\n";
system('which libreoffice 2>&1');
print "\n-- otherwise a different attempt, returning too much information --\n";
system('locate libreoffice');
LSerni