Python PDFKit Module: Convert HTML, URL, and Text to PDFs


Generating PDFs in Python is straightforward with the PDFKit module. A PDF looks the same on any device, which makes it a widely used and reliable format for sharing information, so being able to convert other content to PDF is often useful. In this article, we explore how to convert HTML files, web pages (URLs), and text strings into PDFs.

Installing and Setting Up PDFKit in Python

Before using the module, we need to install it. We can do so with the following command:

pip install pdfkit


Along with the pdfkit module, we need to install wkhtmltopdf, a free tool that converts HTML to PDF and various image formats using the Qt WebKit rendering engine.

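To install wkhtmltopdf on Windows, download the installer from the wkhtmltopdf website and run it. Then add the folder that contains wkhtmltopdf.exe (for example, C:\Program Files\wkhtmltopdf\bin) to the PATH environment variable, or pass the full path to pdfkit.configuration as the code examples in this article do.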

To install wkhtmltopdf on Ubuntu, run the following command:

sudo apt-get install wkhtmltopdf

To install wkhtmltopdf on macOS, run the following command:

brew install homebrew/cask/wkhtmltopdf
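Before running any conversion, it can help to confirm that Python can actually find the wkhtmltopdf executable. The sketch below is one way to check this with the standard library only. The helper name find_wkhtmltopdf and the fallback list are illustrative assumptions, not part of pdfkit.

```python
import shutil
from pathlib import Path


def find_wkhtmltopdf(fallbacks=()):
    """Return a path to wkhtmltopdf, or None if it cannot be found.

    Hypothetical helper: checks PATH first, then any fallback paths given.
    """
    found = shutil.which("wkhtmltopdf")
    if found:
        return found
    for candidate in fallbacks:
        if Path(candidate).is_file():
            return str(candidate)
    return None


if __name__ == "__main__":
    # The Windows path matches the one used in this article's examples.
    path = find_wkhtmltopdf([r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe"])
    print(path or "wkhtmltopdf not found; install it or add it to PATH")
```

If a path is returned, it can be passed to pdfkit.configuration(wkhtmltopdf=path).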

Converting to PDF with PDFKit in Python

Now that the setup is done, here are the three kinds of conversion PDFKit supports: an HTML file to PDF (from_file), a URL to PDF (from_url), and a string to PDF (from_string).

Let’s discuss each of the conversions one by one with code examples.

1. Converting HTML to PDF

We use pdfkit.from_file to convert an HTML file to PDF. The resulting PDF renders consistently on any platform, which makes the document easy to share.

Example Code:

<!-- index.html -->
<html><head><title>Demo</title></head>
<body>
<h1>Python Programming Language</h1>
<p>Python is a high-level, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation. Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including structured, object-oriented and functional programming.</p>
</body></html>

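import pdfkit

# Point pdfkit at the wkhtmltopdf executable (adjust the path to your installation)
config = pdfkit.configuration(wkhtmltopdf='C:\\Program Files\\wkhtmltopdf\\bin\\wkhtmltopdf.exe')

# Convert index.html to converted.pdf
pdfkit.from_file('index.html', 'converted.pdf', configuration=config)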

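Working: pdfkit.configuration tells pdfkit where the wkhtmltopdf executable is located, and from_file reads index.html and writes the rendered result to converted.pdf.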

Output:

A new PDF file is created in the same directory once we run the program.
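The from_file function also accepts an options dictionary whose keys mirror wkhtmltopdf command-line flags. The following is a minimal sketch, assuming A4 pages and uniform margins are wanted. The helper names build_options and html_file_to_pdf are hypothetical and exist only for this illustration.

```python
from pathlib import Path


def build_options(page_size="A4", margin="0.75in"):
    """Build a wkhtmltopdf options dict (hypothetical helper, not part of pdfkit)."""
    return {
        "page-size": page_size,
        "margin-top": margin,
        "margin-right": margin,
        "margin-bottom": margin,
        "margin-left": margin,
        "encoding": "UTF-8",
    }


def html_file_to_pdf(src, dst, wkhtmltopdf_path=None):
    """Convert the HTML file `src` to the PDF `dst` and return the output path."""
    if not Path(src).is_file():
        raise FileNotFoundError(f"HTML file not found: {src}")

    import pdfkit  # imported lazily so build_options works without pdfkit installed

    # Build an explicit configuration only when a path is supplied;
    # otherwise pdfkit looks for wkhtmltopdf on PATH.
    config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path) if wkhtmltopdf_path else None
    pdfkit.from_file(str(src), str(dst), options=build_options(), configuration=config)
    return str(dst)
```

Calling html_file_to_pdf('index.html', 'converted.pdf') would then produce the same file as the example above, with the page size and margins applied.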


2. Converting URL to PDF

We use pdfkit.from_url to convert a web page to PDF. Saving a page as a PDF keeps its content available offline and makes it easy to share.

Example Code:

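import pdfkit

# Point pdfkit at the wkhtmltopdf executable (adjust the path to your installation)
config = pdfkit.configuration(wkhtmltopdf='C:\\Program Files\\wkhtmltopdf\\bin\\wkhtmltopdf.exe')

# Fetch the page and save it as converted2.pdf
pdfkit.from_url('https://www.visitgreece.gr/islands/cyclades/santorini/', 'converted2.pdf', configuration=config)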

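Working: from_url makes wkhtmltopdf load the page at the given URL and write the rendered result to converted2.pdf.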

Output:

The program creates a new PDF file in the current directory when it runs.
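When saving several pages, it can be convenient to derive the output file name from the URL and to handle a failed conversion gracefully. The sketch below shows one possible approach. The names pdf_name_for_url and url_to_pdf are hypothetical helpers, and the naming rule (last path segment, else the host name) is an assumption made for this example.

```python
from urllib.parse import urlparse


def pdf_name_for_url(url):
    """Derive an output file name from a URL (hypothetical helper)."""
    parsed = urlparse(url)
    segments = [part for part in parsed.path.split("/") if part]
    # Use the last path segment, or the host name when the path is empty.
    stem = segments[-1] if segments else parsed.netloc
    return f"{stem}.pdf"


def url_to_pdf(url, wkhtmltopdf_path=None):
    """Save the page at `url` as a PDF in the current directory."""
    import pdfkit  # imported lazily so the naming helper works on its own

    output = pdf_name_for_url(url)
    try:
        config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path) if wkhtmltopdf_path else None
        pdfkit.from_url(url, output, configuration=config)
    except OSError as err:
        # pdfkit raises OSError when wkhtmltopdf is missing or reports a failure,
        # for example when the page cannot be loaded.
        print(f"Conversion failed: {err}")
        return None
    return output
```

For the URL used above, pdf_name_for_url returns 'santorini.pdf'.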


3. Converting Text to PDF

We use pdfkit.from_string to convert a string to PDF. The string is treated as HTML, so it can hold plain text or HTML markup. This is handy when the content is generated in the program rather than stored in a file.

Example Code:

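import pdfkit

# Point pdfkit at the wkhtmltopdf executable (adjust the path to your installation)
config = pdfkit.configuration(wkhtmltopdf='C:\\Program Files\\wkhtmltopdf\\bin\\wkhtmltopdf.exe')

# Use from_string (not from_file) because the first argument is the content itself
pdfkit.from_string('Hello and welcome to CodeForGeek', 'converted3.pdf', configuration=config)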

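Working: from_string takes the content directly as a string instead of a file name or URL and writes the rendered result to converted3.pdf.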

Output:

When the program runs, a new PDF file is generated in the same directory.
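Because from_string interprets its input as HTML, characters such as < and & in plain text may not render as intended, and line breaks collapse. One way to handle this, sketched below, is to escape the text and wrap each line in a paragraph before converting. The helpers text_to_html and text_to_pdf_bytes are hypothetical names chosen for this example. Passing False as the output path asks pdfkit to return the PDF as bytes instead of writing a file.

```python
import html


def text_to_html(text, title="Document"):
    """Wrap plain text in a minimal HTML page, escaping special characters."""
    paragraphs = "".join(
        f"<p>{html.escape(line)}</p>" for line in text.splitlines() if line.strip()
    )
    return (
        '<html><head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title></head>"
        f"<body>{paragraphs}</body></html>"
    )


def text_to_pdf_bytes(text, wkhtmltopdf_path=None):
    """Render plain text to a PDF and return it as bytes."""
    import pdfkit  # imported lazily so text_to_html works without pdfkit installed

    config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path) if wkhtmltopdf_path else None
    # An output path of False makes pdfkit return the PDF content directly.
    return pdfkit.from_string(text_to_html(text), False, configuration=config)
```

The returned bytes can be written to disk, attached to an email, or sent as an HTTP response.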


Conclusion

That's it for this article. We covered the three conversions the Python PDFKit module offers: from_file turns an HTML file into a PDF, from_url saves a web page as a PDF for offline reading, archiving, or reference, and from_string turns text or HTML held in a string into a PDF that is easy to share and print. All three rely on wkhtmltopdf, so make sure it is installed and that pdfkit can find it.
