
I'm using pdfkit with Jupyter Notebook on Windows 10 without any problems.

Now I need to use it on Google Colab, and I don't know where to start.

I tried installing the Linux package under "/users/contents":

! wget "https://github.com/wkhtmltopdf/packaging/releases/download/0.12.6-3/wkhtmltox-0.12.6-3.archlinux-x86_64.pkg.tar.xz" && \
tar vxfJ "wkhtmltox-0.12.6-3.archlinux-x86_64.pkg.tar.xz" && \
mv wkhtmltox/bin/wkhtmltopdf /usr/bin/wkhtmltopdf

It installed correctly.

I tried to point pdfkit to the path "/content/usr/bin/wkhtmltopdf", but it didn't work.

!pip install pdfkit
import pdfkit
path_wkhtmltopdf = "/content/usr/bin/wkhtmltopdf"
config = pdfkit.configuration(wkhtmltopdf=path_wkhtmltopdf)
pdfkit.from_url("test.html", "out.pdf", configuration=config)

The error returned:

Collecting pdfkit
  Downloading https://files.pythonhosted.org/packages/57/da/48fdd627794cde49f4ee7854d219f3a65714069b722b8d0e3599cd066185/pdfkit-0.6.1-py3-none-any.whl
Installing collected packages: pdfkit
Successfully installed pdfkit-0.6.1
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-9-b691bc3e9060> in <module>()
      3 path_wkhtmltopdf = "/content/usr/bin/wkhtmltopdf"
      4 config = pdfkit.configuration(wkhtmltopdf=path_wkhtmltopdf)
----> 5 pdfkit.from_url("yggtorrent1.html", "out.pdf", configuration=config)

1 frames
/usr/local/lib/python3.7/dist-packages/pdfkit/pdfkit.py in to_pdf(self, path)
    157 
    158         if exit_code != 0:
--> 159             raise IOError("wkhtmltopdf exited with non-zero code {0}. error:\n{1}".format(exit_code, stderr))
    160 
    161         # Since wkhtmltopdf sends its output to stderr we will capture it

OSError: wkhtmltopdf exited with non-zero code 1. error:
/content/usr/bin/wkhtmltopdf: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /content/usr/bin/wkhtmltopdf)
/content/usr/bin/wkhtmltopdf: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.28' not found (required by /content/usr/bin/wkhtmltopdf)

2 Answers


#This worked for me

#First you check your operating system

!cat /etc/os-release

#Then check the processor architecture

!uname -m

#Mine reports Ubuntu on x86_64, so I use the Ubuntu amd64 .deb below
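
The same two checks can also be done from Python, which is handy if you want the notebook to decide which package to download. This is only a minimal sketch: the helper names `parse_os_release` and `describe_platform` are made up for illustration, and it assumes the distribution ships a standard `/etc/os-release` file. The codename (for example `bionic`) and the machine type (`x86_64`, which Debian-style packages call `amd64`) are what you match against the wkhtmltopdf release file names.

```python
import platform


def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes
        info[key] = value.strip().strip('"').strip("'")
    return info


def describe_platform(os_release_path="/etc/os-release"):
    """Return distro id, codename, and CPU architecture (hypothetical helper)."""
    try:
        with open(os_release_path, encoding="utf-8") as fh:
            info = parse_os_release(fh.read())
    except OSError:
        # File missing or unreadable: fall back to "unknown" values
        info = {}
    return {
        "id": info.get("ID", "unknown"),
        "codename": info.get("VERSION_CODENAME", "unknown"),
        "machine": platform.machine(),
    }


print(describe_platform())
```

If the reported codename has no matching .deb on the releases page, pick the closest older Ubuntu build rather than a package for a different distribution, since a binary built against a newer glibc fails with the `GLIBC_2.xx not found` error shown in the question.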

#Now install the dependencies

!pip install pdfkit

!wget https://github.com/wkhtmltopdf/packaging/releases/download/0.12.6-1/wkhtmltox_0.12.6-1.bionic_amd64.deb

!cp wkhtmltox_0.12.6-1.bionic_amd64.deb /usr/bin

!sudo apt install /usr/bin/wkhtmltox_0.12.6-1.bionic_amd64.deb

#pdfkit should work now

import pdfkit

pdfkit.from_url("test.html", "out.pdf")
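
If pdfkit still cannot find the binary, it may help to locate it yourself and pass the path explicitly, as the question attempted. The sketch below is an assumption-laden example, not the only way to do it: `find_wkhtmltopdf` and `html_file_to_pdf` are hypothetical helper names, and the fallback paths are guesses about where the package places the executable. It uses `pdfkit.from_file` because the input is a local HTML file.

```python
import os
import shutil


def find_wkhtmltopdf(candidates=("/usr/local/bin/wkhtmltopdf", "/usr/bin/wkhtmltopdf")):
    """Return the path of the wkhtmltopdf executable, or None if not found."""
    # First ask the PATH, the same lookup a shell would do
    found = shutil.which("wkhtmltopdf")
    if found:
        return found
    # Then try a few assumed install locations
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


def html_file_to_pdf(html_path, pdf_path):
    """Convert a local HTML file to PDF using an explicitly configured binary."""
    import pdfkit  # imported here so the lookup helper works without pdfkit

    binary = find_wkhtmltopdf()
    if binary is None:
        raise FileNotFoundError("wkhtmltopdf not found; install the .deb first")
    config = pdfkit.configuration(wkhtmltopdf=binary)
    pdfkit.from_file(html_path, pdf_path, configuration=config)
    return pdf_path


print("wkhtmltopdf found at:", find_wkhtmltopdf())
# Example usage once the binary is installed:
# html_file_to_pdf("test.html", "out.pdf")
```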

Source: How to set wkhtmltoimage path when using imgkit in google colab?

Enam Akli

The following example (pdfkit + requests) works fine in Google Colab, provided the wkhtmltopdf binary is already installed, since pdfkit is only a wrapper around it.

!pip install pdfkit
!pip install requests

import pdfkit
import requests

# Function to generate a PDF from a URL
def generate_pdf_from_url(url, output_file):
    try:
        response = requests.get(url)
        response.raise_for_status()  # Raise an exception if there's an HTTP error

        # Use pdfkit to generate PDF from HTML content
        pdfkit.from_string(response.text, output_file)
        print("PDF generated successfully!")
    except requests.exceptions.RequestException as e:
        print("Error fetching URL:", str(e))
    except Exception as ex:
        print("Error generating PDF:", str(ex))

# Example usage:
url = "https://www.example.com"  # Replace with your desired URL
output_file = "output.pdf"  # Output file name

generate_pdf_from_url(url, output_file)