HTML to PDF Conversion: The Complete Developer Guide (2026)

Everything you need to know about converting HTML to PDF in 2026 - from browser-based rendering to API services, with working Node.js and Python examples.

Author: Alex Morgan

HTML-to-PDF conversion is one of those tasks that seems simple until you actually do it. Generating invoices, reports, certificates, or export documents from web content is a common requirement - and there are many ways to do it, each with different trade-offs.

This guide covers four common approaches in 2026, with code you can copy today.

Why HTML to PDF Is Harder Than It Looks

PDF is a fixed-layout format. HTML is a fluid, dynamic format. Bridging the two means dealing with:

  • CSS support - not all PDF renderers support modern CSS (flexbox, grid, custom properties).
  • Page breaks - controlling where content splits across pages.
  • Fonts - embedding custom fonts without bloating the file size or dropping glyphs.
  • JavaScript - data, charts, or tables rendered by JS need a real browser engine to execute the scripts.
  • Headers and footers - page numbers, logos, and document metadata.
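
To make the fonts point concrete, here is a minimal sketch of one common workaround: inlining the font into the HTML as a base64 data URI, so the renderer never has to fetch it over the network. The helper name font_face_css is hypothetical, and the sketch assumes you already have the font file's bytes (for example a subsetted WOFF2 file).

```python
import base64


def font_face_css(family: str, font_bytes: bytes, fmt: str = "woff2") -> str:
    """Return an @font-face rule with the font inlined as a base64 data URI."""
    encoded = base64.b64encode(font_bytes).decode("ascii")
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        f'  src: url("data:font/{fmt};base64,{encoded}") format("{fmt}");\n'
        "}\n"
    )
```

Base64 adds roughly a third to the font's size, so this works best with subsetted fonts that contain only the glyphs the document needs.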

Approach 1: wkhtmltopdf (The Old Standard)

wkhtmltopdf renders with an old Qt WebKit build. It's fast and simple, but it predates most modern CSS, and the project is no longer actively maintained. If your HTML relies on flexbox or CSS Grid, expect broken layouts.

# Install on Ubuntu
apt-get install wkhtmltopdf

# Convert a URL to PDF
wkhtmltopdf https://example.com output.pdf

# Convert local HTML
wkhtmltopdf input.html output.pdf

Verdict: Only suitable for simple documents. Avoid for anything styled with modern CSS.
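
If you do call wkhtmltopdf from application code, a thin wrapper keeps the flags in one place. The sketch below assumes the wkhtmltopdf binary is on your PATH; the function names and the margin values are illustrative choices, not part of the tool.

```python
import subprocess


def build_wkhtmltopdf_cmd(source: str, output_path: str, page_size: str = "A4") -> list[str]:
    """Build the argument list for one wkhtmltopdf run (source is a URL or a file path)."""
    return [
        "wkhtmltopdf",
        "--quiet",                  # suppress progress output
        "--page-size", page_size,
        "--margin-top", "20mm",
        "--margin-bottom", "20mm",
        source,
        output_path,
    ]


def convert(source: str, output_path: str, timeout_s: int = 60) -> None:
    """Run wkhtmltopdf; raise if it exits non-zero or exceeds the timeout."""
    subprocess.run(build_wkhtmltopdf_cmd(source, output_path), check=True, timeout=timeout_s)
```

Passing the arguments as a list (rather than a shell string) avoids quoting problems when URLs contain characters such as & or ?.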

Approach 2: Puppeteer (Node.js)

Puppeteer drives a real Chromium browser, which means it handles all modern CSS, web fonts, and JavaScript. It's the most popular open-source option for Node.js.

const puppeteer = require('puppeteer');

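async function htmlToPdf(htmlContent, outputPath) {
  // --no-sandbox is often needed in containers that run as root. Drop it if
  // your environment can run Chromium's sandbox, especially for untrusted HTML.
  const browser = await puppeteer.launch({ args: ['--no-sandbox'] });

  try {
    const page = await browser.newPage();

    // Wait until the network is quiet so fonts and images finish loading.
    await page.setContent(htmlContent, { waitUntil: 'networkidle0' });

    await page.pdf({
      path:   outputPath,
      format: 'A4',
      margin: { top: '20mm', right: '15mm', bottom: '20mm', left: '15mm' },
      printBackground: true, // include background colors and images
    });
  } finally {
    // Always close the browser, even if rendering throws, to avoid leaked processes.
    await browser.close();
  }
}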

const html = `
  <!DOCTYPE html>
  <html>
  <body style="font-family: sans-serif; padding: 40px;">
    <h1>Invoice #1042</h1>
    <p>Amount due: $250.00</p>
  </body>
  </html>
`;

htmlToPdf(html, 'invoice.pdf').catch((err) => {
  console.error(err);
  process.exitCode = 1;
});

Pros: Full modern CSS support, executes JavaScript. Cons: Requires a Chromium binary - a heavy dependency that is hard to scale on serverless platforms.

Approach 3: Playwright (Node.js / Python)

Playwright is similar to Puppeteer and can drive Chromium, Firefox, and WebKit, but page.pdf() only works in Chromium. The PDF API is nearly identical to Puppeteer's; here it is in Python:

from playwright.sync_api import sync_playwright

def html_to_pdf(html_content: str, output_path: str) -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.set_content(html_content, wait_until="networkidle")
        page.pdf(
            path=output_path,
            format="A4",
            margin={"top": "20mm", "right": "15mm", "bottom": "20mm", "left": "15mm"},
            print_background=True,
        )
        browser.close()

html = """
<!DOCTYPE html>
<html>
<body style="font-family: sans-serif; padding: 40px;">
  <h1>Invoice #1042</h1>
  <p>Amount due: $250.00</p>
</body>
</html>
"""

html_to_pdf(html, "invoice.pdf")
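
The Puppeteer and Playwright examples above launch a full browser per call, so memory use grows with every concurrent request. One way to keep that bounded is to cap how many renders run at once. The sketch below is a generic pattern, not a Playwright feature: render_all and its render argument are hypothetical names, and render stands in for whatever async function produces your PDF bytes.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def render_all(
    jobs: Iterable[str],
    render: Callable[[str], Awaitable[bytes]],
    max_concurrent: int = 2,
) -> list[bytes]:
    """Run render(job) for every job, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(job: str) -> bytes:
        # Each job waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await render(job)

    # gather() returns results in the same order as the input jobs.
    return await asyncio.gather(*(bounded(job) for job in jobs))
```

A reasonable starting value for max_concurrent depends on how much memory each browser instance needs on your host, so treat the default of 2 as a placeholder to tune.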

Approach 4: Screenshot API (The Scalable Option)

Running Chromium in your own infrastructure has real costs: binary management, memory spikes, concurrency limits, and cold starts on serverless. A screenshot API offloads all of that.

// Node.js - convert a URL to PDF
const axios = require('axios');
const fs    = require('fs');

async function urlToPdf(url, outputPath) {
  const response = await axios.get('https://api.screenshotcore.com/v1/screenshot', {
    headers: { Authorization: 'Bearer YOUR_API_KEY' },
    params:  { url, format: 'pdf', full_page: true },
    responseType: 'arraybuffer',
  });
  fs.writeFileSync(outputPath, response.data);
}

urlToPdf('https://yourapp.com/invoices/1042', 'invoice-1042.pdf');

# Python - convert a URL to PDF
import requests

def url_to_pdf(url: str, output_path: str) -> None:
    response = requests.get(
        "https://api.screenshotcore.com/v1/screenshot",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        params={"url": url, "format": "pdf", "full_page": True},
        timeout=60,  # requests has no default timeout; avoid hanging forever
    )
    response.raise_for_status()
    with open(output_path, "wb") as f:
        f.write(response.content)

url_to_pdf("https://yourapp.com/invoices/1042", "invoice-1042.pdf")

Pros: No server maintenance, instant scale, pay per use. Cons: The page must be publicly accessible, unless you pass private content through the HTML body parameter.
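
Whichever remote service you call, it is cheap to confirm that the bytes you received are actually a PDF before writing them to disk. The sketch below relies only on the fact that PDF files begin with the marker %PDF-. The helper name save_pdf is hypothetical, and whether a given API can return a non-PDF body alongside a success status is an assumption, not something documented here.

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"  # every PDF file starts with this marker


def save_pdf(data: bytes, output_path: str) -> int:
    """Write data to output_path if it looks like a PDF; return the bytes written."""
    if not data.startswith(PDF_MAGIC):
        # Show the start of the payload, which is often a JSON or HTML error message.
        preview = data[:80].decode("utf-8", errors="replace")
        raise ValueError(f"Response is not a PDF, starts with: {preview!r}")
    return Path(output_path).write_bytes(data)
```

In the Python example above, this would replace the open()/write() pair: save_pdf(response.content, output_path).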

Controlling Page Breaks with CSS

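Whichever approach you use, CSS page-break properties control how the document splits across pages. The page-break-* properties shown here are the legacy names; current browsers treat them as aliases of break-before, break-inside, and break-after: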

/* Force a new page before this element */
.page-break { page-break-before: always; }

/* Prevent a table row from splitting across pages */
tr { page-break-inside: avoid; }

/* Keep heading with the paragraph below it */
h2, h3 { page-break-after: avoid; }
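
As an illustration of how these rules are typically applied, the sketch below assembles several sections into one HTML document and starts each section after the first on a new page. The function name build_document and the (title, body) input shape are hypothetical; the CSS declares both the modern break-* properties and the legacy page-break-* names for older renderers.

```python
from html import escape

PRINT_CSS = """
.page-break { break-before: page; page-break-before: always; }
tr { break-inside: avoid; page-break-inside: avoid; }
h2, h3 { break-after: avoid; page-break-after: avoid; }
"""


def build_document(sections: list[tuple[str, str]]) -> str:
    """Build one HTML document where every section after the first starts a new page.

    Each section is (title, body_html). Titles are escaped; bodies are trusted HTML.
    """
    parts = []
    for index, (title, body_html) in enumerate(sections):
        # The first section stays on page one; later ones force a page break.
        css_class = ' class="page-break"' if index > 0 else ""
        parts.append(f"<section{css_class}><h2>{escape(title)}</h2>{body_html}</section>")
    return (
        "<!DOCTYPE html><html><head><style>"
        + PRINT_CSS
        + "</style></head><body>"
        + "".join(parts)
        + "</body></html>"
    )
```

The resulting string can be passed to any of the renderers above, for example as the html_content argument of the Playwright function.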

Headers and Footers with Page Numbers

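In Puppeteer and Playwright, you can inject header and footer HTML through the page.pdf() options. The templates render inside the page margins, so leave enough top and bottom margin for them to show: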

await page.pdf({
  path:            'report.pdf',
  format:          'A4',
  displayHeaderFooter: true,
  headerTemplate: '<div style="font-size:10px;text-align:center;width:100%;">Monthly Report</div>',
  footerTemplate: '<div style="font-size:10px;text-align:center;width:100%;">Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>',
  margin: { top: '30mm', bottom: '20mm' },
});
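
For the Python Playwright function shown earlier, the same options use snake_case names. The sketch below only builds the keyword arguments, so it can be checked without launching a browser; header_footer_options is a hypothetical helper name.

```python
from html import escape


def header_footer_options(title: str) -> dict:
    """Return keyword arguments for Playwright's page.pdf() that add a header and a numbered footer."""
    style = "font-size:10px;text-align:center;width:100%;"
    return {
        "format": "A4",
        "display_header_footer": True,
        "header_template": f'<div style="{style}">{escape(title)}</div>',
        # Chromium fills in elements with the pageNumber and totalPages classes.
        "footer_template": (
            f'<div style="{style}">Page <span class="pageNumber"></span>'
            ' of <span class="totalPages"></span></div>'
        ),
        "margin": {"top": "30mm", "bottom": "20mm"},
    }


# Usage with an open Playwright page (not executed here):
# page.pdf(path="report.pdf", **header_footer_options("Monthly Report"))
```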

Which Approach Should You Use?

Approach       | CSS Support | Scalability | Setup Cost | Best For
wkhtmltopdf    | Poor        | Medium      | Low        | Simple docs, legacy projects
Puppeteer      | Excellent   | Low         | Medium     | Full control, own infra
Playwright     | Excellent   | Low         | Medium     | Python or Node.js, own infra
Screenshot API | Excellent   | High        | None       | Serverless, SaaS, scale

Summary

For most SaaS applications generating invoices, reports, or exports, a screenshot API is the fastest path to production-quality PDFs without managing browser infrastructure. For teams that need full control - custom headers, complex JS, or offline generation - Puppeteer or Playwright are the right tools.

Start generating PDFs free with ScreenshotCore - no credit card, 100 requests/month included.
