How to Create PDF Files with Python

Articles From: TheAutomatic.net
Website: TheAutomatic.net

By:

Blogger, TheAutomatic.net, and Senior Data Scientist

In a previous article, we talked about several ways to read PDF files with Python. This post covers two packages for creating PDF files with Python: pdfkit and ReportLab.

Create PDF files with Python and pdfkit

pdfkit was the first library I learned for creating PDF files. A nice feature of pdfkit is that it can create PDF files directly from URLs. pdfkit is a Python wrapper around a command-line utility called wkhtmltopdf, so you’ll need to install both. wkhtmltopdf is not a Python package and has to be installed separately. Use pip to install pdfkit from PyPI:

pip install pdfkit

Once you’re set up, you can start using pdfkit. In the example below, we download Wikipedia’s main page as a PDF file. To get pdfkit working, you’ll need to either add wkhtmltopdf to your PATH, or configure pdfkit to point to where the executable is stored (the latter option is used below).

Download a webpage as a PDF

# import package
import pdfkit
 
# configure pdfkit to point to our installation of wkhtmltopdf
config = pdfkit.configuration(wkhtmltopdf = r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe")
 
# download Wikipedia main page as a PDF file
pdfkit.from_url("https://en.wikipedia.org/wiki/Main_Page", "sample_url_pdf.pdf", configuration = config)
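The hard-coded Windows path above only works where wkhtmltopdf is installed in that exact location. Here is a minimal sketch of a more portable lookup that checks the PATH first and falls back to a known location. Note that find_wkhtmltopdf is a hypothetical helper written for this post, not part of pdfkit.

```python
import shutil
from pathlib import Path


def find_wkhtmltopdf(fallback=None):
    """Return a path to the wkhtmltopdf executable, or None if not found.

    Checks the PATH first, then an optional fallback location.
    """
    found = shutil.which("wkhtmltopdf")
    if found:
        return found
    if fallback and Path(fallback).is_file():
        return str(fallback)
    return None


# the fallback is the install location used in the example above
path = find_wkhtmltopdf(r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe")
print(path if path else "wkhtmltopdf not found; install it or pass its location")
```

If a path comes back, it can be handed to pdfkit.configuration through the wkhtmltopdf argument, exactly as in the example above.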

You can also set the output path to False, which returns the PDF to Python as binary data rather than writing the webpage to an external file. Assign the result to a variable so you can use it afterward.

pdf_bytes = pdfkit.from_url("https://en.wikipedia.org/wiki/Main_Page", output_path = False, configuration = config)
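Once the PDF is in memory, we might want to check it and write it to disk ourselves. The sketch below assumes the value returned by pdfkit is a bytes object. The helpers looks_like_pdf and write_pdf are hypothetical, and the sample bytes are a stand-in so the example runs without pdfkit or a network connection.

```python
import tempfile
from pathlib import Path


def looks_like_pdf(data):
    """PDF files begin with the marker %PDF-."""
    return isinstance(data, (bytes, bytearray)) and bytes(data[:5]) == b"%PDF-"


def write_pdf(data, path):
    """Write PDF bytes to path after a basic sanity check; return bytes written."""
    if not looks_like_pdf(data):
        raise ValueError("data does not start with the %PDF- marker")
    return Path(path).write_bytes(data)


# stand-in for the bytes returned by pdfkit.from_url(..., output_path = False, ...)
pdf_bytes = b"%PDF-1.4\n% placeholder content, not a complete PDF\n"

out_file = Path(tempfile.gettempdir()) / "from_memory.pdf"
print(write_pdf(pdf_bytes, out_file), "bytes written to", out_file)
```

The same bytes could just as well be attached to an email or returned from a web request instead of being written to a file.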

How to create a PDF from HTML

One of the nicest features of pdfkit is that you can use it to create PDF files from HTML, including from HTML strings that you pass it directly in Python.

s = """<h1><strong>Sample PDF file from HTML</strong></h1>
       <br></br>
       <p>First line...</p>
       <p>Second line...</p>
       <p>Third line...</p>"""
 
pdfkit.from_string(s, output_path = "new_file.pdf", configuration = config)

Additionally, pdfkit can create PDF files by reading HTML files.

	
pdfkit.from_file("sample_html_file.html", output_path = "new_file2.pdf", configuration = config)

You can also create PDF files from more complex HTML and CSS. You simply need to pass the HTML as a string or store it in a file that can be passed to pdfkit. Let’s do another example, but this time, we’ll create a table using HTML and CSS.

Creating tables in a PDF file

table_html = """<!DOCTYPE html>
<html>
<head>
<style>
table, th, td {
  border: 1px solid black;
}
 
table {
  width: 100%;
}
</style>
</head>
<body>
 
<h2>Sample Table</h2>
 
<table>
  <tr>
    <th>Field 1</th>
    <th>Field 2</th>
  </tr>
  <tr>
    <td>x1</td>
    <td>x2</td>
  </tr>
  <tr>
    <td>x3</td>
    <td>x4</td>
  </tr>
</table>
 
</body>
</html>
 """
 
pdfkit.from_string(table_html, output_path = "sample_table.pdf", configuration = config)
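The table above is written by hand. When the rows come from Python data, a small helper could build the same markup instead. This is only a sketch: rows_to_html_table is a hypothetical function, and it uses html.escape from the standard library so that characters like & and < in the data don't break the HTML.

```python
from html import escape


def rows_to_html_table(header, rows, title="Sample Table"):
    """Build a small HTML document containing one bordered table."""
    head_cells = "".join(f"<th>{escape(str(h))}</th>" for h in header)
    body_rows = "\n".join(
        "  <tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"""<!DOCTYPE html>
<html>
<head>
<style>
table, th, td {{ border: 1px solid black; }}
table {{ width: 100%; }}
</style>
</head>
<body>
<h2>{escape(title)}</h2>
<table>
  <tr>{head_cells}</tr>
{body_rows}
</table>
</body>
</html>
"""


table_html = rows_to_html_table(["Field 1", "Field 2"], [["x1", "x2"], ["x3", "x4"]])
print(table_html)
```

The resulting string can be passed to pdfkit.from_string in the same way as the hand-written table_html above.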

Creating PDF files with Python and ReportLab

The next package we’ll discuss is ReportLab. ReportLab is one of the most popular libraries for creating PDF files.

You can install ReportLab using pip:

pip install reportlab

Here’s an initial example to create a simple PDF with one line of text. First, we import the canvas module from ReportLab. Then, we create an instance of the Canvas class (note the capital “C” this time) with the name of the file we want to create. Third, we use drawString to write out a line of text. The values (50, 800) are the x and y coordinates of where the text starts, measured in points (1/72 of an inch) from the bottom-left corner of the page, so a y value of 800 lands near the top of the default A4 page. Lastly, we save the file.

from reportlab.pdfgen import canvas
 
report = canvas.Canvas("first_test.pdf")
 
report.drawString(50, 800, "**First PDF with ReportLab**")
report.save()
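Finding the right coordinates for drawString is easier if we think in inches from the top-left corner and convert. The sketch below assumes the default A4 page size and uses plain arithmetic, so it doesn't need ReportLab installed. The helper from_top_left is hypothetical.

```python
POINTS_PER_INCH = 72  # PDF coordinates use 72 points per inch

# A4 is 210 mm x 297 mm; convert to points
A4_WIDTH = 210 / 25.4 * POINTS_PER_INCH   # about 595.28 points
A4_HEIGHT = 297 / 25.4 * POINTS_PER_INCH  # about 841.89 points


def from_top_left(x_inches, y_inches_from_top, page_height=A4_HEIGHT):
    """Convert inches measured from the top-left corner to canvas points.

    The canvas origin is the bottom-left corner, so y is flipped.
    """
    x = x_inches * POINTS_PER_INCH
    y = page_height - y_inches_from_top * POINTS_PER_INCH
    return x, y


# one inch in from the left edge, one inch down from the top edge
x, y = from_top_left(1, 1)
print(round(x, 2), round(y, 2))
```

The resulting pair can be used in place of (50, 800) in the drawString call above.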

Adding images to a PDF file

Next, let’s create a sample PDF file containing an image. Here, we’re going to use ReportLab’s Image class (which relies on the pillow library to read most image formats) to create an Image object. In this example, we need to create a list of elements that we will use to construct the PDF file (we refer to this list as info below). For this instance, the list will contain just one element, the Image object representing the image that we will put into the PDF file. As we’ll see in the next example, we can also use this list to store other elements for placing into the PDF file.

Also, note here we are using the SimpleDocTemplate class, which basically does what it sounds like – creates a simple document template that we can use to fill in information. This provides more structure than using canvas, like above.

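# import SimpleDocTemplate and the Image class from ReportLab
from reportlab.platypus import SimpleDocTemplate, Image
# inch is the number of points in one inch, used for sizing
from reportlab.lib.units import inch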
 
# create document object
doc = SimpleDocTemplate("sample_image.pdf")
info = []
 
# path to the image file we want to use
image_file = "sample_plot.png"
 
# create Image object with size specifications
im = Image(image_file, 3*inch, 3*inch)
 
# append Image object to our info list
info.append(im)
 
# build / save PDF document
doc.build(info)
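Passing 3*inch for both the width and the height stretches any image that isn't square. Here is a sketch of a helper that scales an image to fit inside a box while keeping its aspect ratio. The name fit_within is hypothetical and the pixel dimensions are example values; for a real file they could be read with pillow.

```python
POINTS_PER_INCH = 72  # same value as reportlab.lib.units.inch


def fit_within(width_px, height_px, max_width, max_height):
    """Scale (width_px, height_px) to fit inside the box, keeping aspect ratio."""
    if width_px <= 0 or height_px <= 0:
        raise ValueError("image dimensions must be positive")
    scale = min(max_width / width_px, max_height / height_px)
    return width_px * scale, height_px * scale


# example: a 1200 x 800 pixel plot placed in a 3 x 3 inch box
width, height = fit_within(1200, 800, 3 * POINTS_PER_INCH, 3 * POINTS_PER_INCH)
print(width, height)
```

The two values can then be passed as the width and height arguments when creating the Image object.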

Creating paragraphs of text

Building on the code above, we can add a few paragraphs of text, followed by a sample image.

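# imports repeated here so this example runs on its own
from reportlab.platypus import SimpleDocTemplate, Paragraph, Image
from reportlab.lib.units import inch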
 
doc = SimpleDocTemplate("more_text.pdf")
 
p1 = "<font size = '12'><strong>This is the first paragraph...</strong></font>"
p2 = "<font size = '12'><strong>This is the second paragraph...</strong></font>"
p3 = "<font size = '12'><strong>This is the third paragraph...</strong></font>"
p4 = "<br></br><br></br><br></br>"
 
image_file = "sample_plot.png"
 
im = Image(image_file, 3*inch, 3*inch)
 
info = []
 
info.append(Paragraph(p1))
info.append(Paragraph(p2))
info.append(Paragraph(p3))
info.append(Paragraph(p4))
info.append(im)
 
doc.build(info)
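Paragraph text is parsed as markup, so characters such as & and < in ordinary text need to be escaped before they are wrapped in tags. Below is a small sketch using the standard library. The helper bold_markup is hypothetical and simply reproduces the font and strong tags used above.

```python
from xml.sax.saxutils import escape


def bold_markup(text, size=12):
    """Wrap plain text in the font/strong markup used above, escaping &, < and >."""
    return f"<font size='{size}'><strong>{escape(text)}</strong></font>"


lines = ["Revenue & costs", "Growth < 5%", "This is the third paragraph..."]
for line in lines:
    print(bold_markup(line))
```

Each returned string can be passed to Paragraph in place of p1, p2 and p3.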

How to adjust fonts

To adjust font types, we can tweak our first ReportLab example above to use the setFont method.

from reportlab.pdfgen import canvas
 
report = canvas.Canvas("test_with_font.pdf")
 
report.setFont("Courier", 12)
 
report.drawString(50, 800, "**Test PDF with Different Font**")
report.save()
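One thing to keep in mind is that drawString writes a single line and does not wrap long text. Because Courier is monospaced (every character is 0.6 times the font size wide), we can estimate how many characters fit on a line with simple arithmetic. This is a sketch under assumptions: the usable width of 495 points corresponds roughly to an A4 page with 50-point margins, and wrap_for_courier and line_positions are hypothetical helpers.

```python
import textwrap

COURIER_WIDTH_FACTOR = 0.6  # every Courier glyph is 600/1000 of the font size


def wrap_for_courier(text, font_size=12, usable_width=495):
    """Split text into lines that fit usable_width points when set in Courier."""
    chars_per_line = int(usable_width / (COURIER_WIDTH_FACTOR * font_size))
    return textwrap.wrap(text, width=chars_per_line)


def line_positions(lines, top=800, leading=14):
    """Pair each line with a y coordinate, moving down the page by leading points."""
    return [(top - i * leading, line) for i, line in enumerate(lines)]


text = "Test PDF with a different font. " * 6
for y, line in line_positions(wrap_for_courier(text)):
    print(y, line)
```

Each (y, line) pair can be drawn with drawString at x = 50 after calling setFont with Courier, as in the example above.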

Creating a PDF with multiple pages

Next, let’s create a PDF with multiple pages by modifying the above example to produce three separate pages. One way to tell ReportLab that the content of a page is finished is to call the showPage method, like below. Any content you draw afterward is added to the next page. Calling showPage again starts a third page.

from reportlab.pdfgen import canvas
 
report = canvas.Canvas("multiple_pages.pdf")
report.setFont("Courier", 12)
 
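report.drawString(50, 800, "**This is the first page...**")
report.showPage()
 
# showPage resets the canvas state, including the font, so set it again
report.setFont("Courier", 12)
report.drawString(50, 800, "**This is the second page...**")
report.showPage()
 
report.setFont("Courier", 12)
report.drawString(50, 800, "**This is the third page...**")
report.showPage()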
 
report.save()
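When the number of lines isn't known in advance, we need to work out where each page ends. Here is a sketch of that bookkeeping in plain Python. The helper paginate is hypothetical, and the top, bottom and leading values are assumptions chosen to match the coordinates used above.

```python
def paginate(lines, top=800, bottom=50, leading=14):
    """Yield (page_number, y, text) so that no line falls below the bottom margin."""
    lines_per_page = (top - bottom) // leading + 1
    for index, text in enumerate(lines):
        page, row = divmod(index, lines_per_page)
        yield page + 1, top - row * leading, text


sample = [f"Line {n}" for n in range(1, 121)]
placed = list(paginate(sample))

# first and last placements
print(placed[0], placed[-1])
```

With a canvas, we would call showPage whenever the page number changes (setting the font again afterward) and pass each y and text to drawString.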

Another way to create page breaks is to use the SimpleDocTemplate class from earlier in the post together with PageBreak, like this:

# import PageBreak, along with SimpleDocTemplate, Paragraph and Image
from reportlab.platypus import SimpleDocTemplate, PageBreak, Paragraph, Image
from reportlab.lib.units import inch
 
# create new file with image and multiple pages
doc = SimpleDocTemplate("sample_image_multiple_pages.pdf")
info = []
 
image_file = "sample_plot.png"
 
im = Image(image_file, 3*inch, 3*inch)
info.append(im)
 
# add page break 
info.append(PageBreak())
info.append(Paragraph("Second page..."))
 
# add third page
info.append(PageBreak())
info.append(Paragraph("Third page..."))
 
# build PDF
doc.build(info)
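If the content is already grouped by page, the breaks can be inserted in a loop rather than by hand. The sketch below uses strings as stand-ins for the ReportLab objects so it runs on its own. The helper with_breaks is hypothetical.

```python
def with_breaks(sections, make_break):
    """Flatten a list of sections, inserting make_break() between them."""
    flat = []
    for index, section in enumerate(sections):
        if index > 0:
            flat.append(make_break())
        flat.extend(section)
    return flat


# strings stand in for the Image, Paragraph and PageBreak objects here
pages = [["image"], ["Second page..."], ["Third page..."]]
print(with_breaks(pages, lambda: "<page break>"))
```

With ReportLab, each section would be a list of elements such as im or Paragraph objects, and PageBreak itself could be passed as make_break before handing the result to doc.build.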

Visit TheAutomatic.net to learn more about this topic.
