HTML-to-PDF conversion is one of those tasks that seems simple until you actually do it. Generating invoices, reports, certificates, or export documents from web content is a common requirement - and there are many ways to do it, each with different trade-offs.
This guide covers four common approaches in 2026, each with code you can copy and adapt.
Why HTML to PDF Is Harder Than It Looks
PDF is a fixed-layout format. HTML is a fluid, dynamic format. Bridging the two means dealing with:
- CSS support - not all PDF renderers support modern CSS (flexbox, grid, custom properties).
- Page breaks - controlling where content splits across pages.
- Fonts - embedding custom fonts without bloating the file or dropping glyphs.
- JavaScript - data, charts, or tables rendered with JS need a real browser engine to run the scripts.
- Headers and footers - page numbers, logos, and document metadata.
Approach 1: wkhtmltopdf (The Old Standard)
wkhtmltopdf renders pages with an old Qt WebKit engine and is no longer actively maintained. It's fast and simple, but its support for modern CSS is poor: layouts built with flexbox or CSS Grid are likely to render incorrectly.
# Install on Ubuntu
apt-get install wkhtmltopdf
# Convert a URL to PDF
wkhtmltopdf https://example.com output.pdf
# Convert local HTML
wkhtmltopdf input.html output.pdf
Verdict: Only suitable for simple documents. Avoid for anything styled with modern CSS.
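If you need to call wkhtmltopdf from application code rather than a shell, a thin wrapper around the command line is usually enough. The sketch below is one way to do it in Python; the function names are hypothetical, and it assumes the wkhtmltopdf binary is on PATH. The --page-size and --margin-* flags are standard wkhtmltopdf options.

```python
import shutil
import subprocess


def build_wkhtmltopdf_command(source: str, output_path: str) -> list[str]:
    """Build the argument list for an A4 render with 20mm/15mm margins."""
    return [
        "wkhtmltopdf",
        "--page-size", "A4",
        "--margin-top", "20mm",
        "--margin-bottom", "20mm",
        "--margin-left", "15mm",
        "--margin-right", "15mm",
        source,       # a URL or a path to a local HTML file
        output_path,
    ]


def convert_with_wkhtmltopdf(source: str, output_path: str, timeout_s: int = 60) -> None:
    """Run wkhtmltopdf and raise if it is missing, fails, or hangs."""
    if shutil.which("wkhtmltopdf") is None:
        raise RuntimeError("wkhtmltopdf is not installed or not on PATH")
    # check=True raises CalledProcessError on a non-zero exit code;
    # timeout raises TimeoutExpired if the render never finishes.
    subprocess.run(
        build_wkhtmltopdf_command(source, output_path),
        check=True,
        timeout=timeout_s,
    )
```

Calling convert_with_wkhtmltopdf("input.html", "output.pdf") mirrors the shell command above, with the added benefit that a failed or stuck render raises an exception instead of silently leaving a missing file.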
Approach 2: Puppeteer (Node.js)
Puppeteer drives a real Chromium browser, which means it handles modern CSS, web fonts, and JavaScript. It's one of the most popular open-source options for Node.js.
const puppeteer = require('puppeteer');
async function htmlToPdf(htmlContent, outputPath) {
  // --no-sandbox is often required in containers running as root; omit it when rendering untrusted HTML.
  const browser = await puppeteer.launch({ args: ['--no-sandbox'] });
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so fonts and images have loaded.
    await page.setContent(htmlContent, { waitUntil: 'networkidle0' });
    await page.pdf({
      path: outputPath,
      format: 'A4',
      margin: { top: '20mm', right: '15mm', bottom: '20mm', left: '15mm' },
      printBackground: true, // include background colors and images
    });
  } finally {
    // Always close the browser, even if rendering throws.
    await browser.close();
  }
}
const html = `
<!DOCTYPE html>
<html>
  <body style="font-family: sans-serif; padding: 40px;">
    <h1>Invoice #1042</h1>
    <p>Amount due: $250.00</p>
  </body>
</html>
`;
htmlToPdf(html, 'invoice.pdf').catch(console.error);
Pros: Full CSS support, handles JS. Cons: Requires running a Chromium binary - heavy dependency, hard to scale on serverless.
Approach 3: Playwright (Node.js / Python)
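Playwright is similar to Puppeteer and can drive Chromium, Firefox, and WebKit, although page.pdf() itself works only in headless Chromium. It also ships official Python bindings, and the PDF API is nearly identical: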
from playwright.sync_api import sync_playwright
def html_to_pdf(html_content: str, output_path: str) -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.set_content(html_content, wait_until="networkidle")
        page.pdf(
            path=output_path,
            format="A4",
            margin={"top": "20mm", "right": "15mm", "bottom": "20mm", "left": "15mm"},
            print_background=True,
        )
        browser.close()
html = """
<!DOCTYPE html>
<html>
  <body style="font-family: sans-serif; padding: 40px;">
    <h1>Invoice #1042</h1>
    <p>Amount due: $250.00</p>
  </body>
</html>
"""
html_to_pdf(html, "invoice.pdf")
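Each headless browser page holds a noticeable amount of memory, so batch jobs on your own infrastructure usually cap how many renders run at once. Here is a minimal sketch of such a cap using asyncio.Semaphore. The render_all helper and its parameters are hypothetical names for illustration, and render_one stands in for whatever coroutine actually produces the PDF (for example, one built on Playwright's async API).

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def render_all(
    jobs: Iterable[tuple[str, str]],
    render_one: Callable[[str, str], Awaitable[None]],
    max_concurrent: int = 2,
) -> None:
    """Run render_one(html, output_path) for every job, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(html: str, output_path: str) -> None:
        # Only max_concurrent coroutines can be inside this block at once.
        async with semaphore:
            await render_one(html, output_path)

    await asyncio.gather(*(guarded(html, path) for html, path in jobs))
```

You would call it with asyncio.run(render_all(jobs, my_render_coroutine, max_concurrent=2)). The right limit depends on available memory and document size, so treat the default of 2 as a placeholder to tune.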
Approach 4: Screenshot API (The Scalable Option)
Running Chromium in your own infrastructure has real costs: binary management, memory spikes, concurrency limits, and cold starts on serverless. A screenshot API offloads all of that.
// Node.js - convert a URL to PDF
const axios = require('axios');
const fs = require('fs');
async function urlToPdf(url, outputPath) {
  const response = await axios.get('https://api.screenshotcore.com/v1/screenshot', {
    headers: { Authorization: 'Bearer YOUR_API_KEY' },
    params: { url, format: 'pdf', full_page: true },
    responseType: 'arraybuffer', // receive the PDF as raw bytes, not a decoded string
  });
  fs.writeFileSync(outputPath, response.data);
}
urlToPdf('https://yourapp.com/invoices/1042', 'invoice-1042.pdf').catch(console.error);
# Python - convert a URL to PDF
import requests
def url_to_pdf(url: str, output_path: str) -> None:
    response = requests.get(
        "https://api.screenshotcore.com/v1/screenshot",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        params={"url": url, "format": "pdf", "full_page": True},
        timeout=60,  # requests has no default timeout; fail instead of hanging
    )
    response.raise_for_status()  # raise on 4xx/5xx instead of writing an error body to disk
    with open(output_path, "wb") as f:
        f.write(response.content)
url_to_pdf("https://yourapp.com/invoices/1042", "invoice-1042.pdf")
Pros: No server maintenance, instant scale, pay per use. Cons: The page must be publicly accessible, unless you pass the markup through the API's HTML body parameter for private content.
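Because the rendering now happens over the network, transient failures (timeouts, dropped connections, 5xx responses) become part of normal operation. The sketch below shows one way to wrap the call in retries with exponential backoff, using only the standard library. The endpoint and query parameters are the ones shown above; fetch_pdf and with_retries are hypothetical helper names, and sending full_page as the string "true" is an assumption about how the API parses booleans.

```python
import time
import urllib.parse
import urllib.request
from typing import Callable

API_URL = "https://api.screenshotcore.com/v1/screenshot"  # endpoint as given in this guide


def fetch_pdf(page_url: str, api_key: str, timeout_s: float = 60.0) -> bytes:
    """Request a PDF rendering of page_url and return the raw bytes."""
    # Assumption: the API accepts the boolean as the string "true".
    query = urllib.parse.urlencode({"url": page_url, "format": "pdf", "full_page": "true"})
    request = urllib.request.Request(
        f"{API_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.read()


def with_retries(
    fetch: Callable[[], bytes],
    attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch(), retrying network errors with delays of base, 2x base, 4x base, ..."""
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError as err:  # URLError, HTTPError, and socket timeouts are all OSError subclasses
            status = getattr(err, "code", None)  # set on HTTPError only
            client_error = isinstance(status, int) and 400 <= status < 500
            # A 4xx (bad key, bad URL) will not fix itself, so do not retry it.
            if client_error or attempt == attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Typical usage would be pdf_bytes = with_retries(lambda: fetch_pdf("https://yourapp.com/invoices/1042", "YOUR_API_KEY")), followed by writing pdf_bytes to disk. Passing sleep as a parameter keeps the backoff logic easy to test without real delays.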
Controlling Page Breaks with CSS
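Regardless of which approach you use, CSS page-break properties control how your document splits across pages. The page-break-* properties shown here are the legacy names, which remain widely supported; break-before, break-inside, and break-after are the modern equivalents: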
/* Force a new page before this element */
.page-break { page-break-before: always; }
/* Prevent a table row from splitting across pages */
tr { page-break-inside: avoid; }
/* Keep heading with the paragraph below it */
h2, h3 { page-break-after: avoid; }
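To see these rules at work, consider a batch export where several invoices go into one PDF and each should start on its own page. The sketch below builds such a document in Python; build_batch_html and the invoice dictionary keys are hypothetical names chosen for illustration, and the resulting string can be passed to any of the renderers in this guide.

```python
from html import escape

PRINT_CSS = """
.page-break { page-break-before: always; }
tr { page-break-inside: avoid; }
h2, h3 { page-break-after: avoid; }
"""


def build_batch_html(invoices: list[dict]) -> str:
    """Return one HTML document in which every invoice after the first starts on a new page."""
    sections = []
    for index, invoice in enumerate(invoices):
        # The first invoice already starts at the top of page one, so it needs no forced break.
        css_class = "invoice page-break" if index > 0 else "invoice"
        sections.append(
            f'<section class="{css_class}">'
            f"<h2>Invoice #{escape(str(invoice['number']))}</h2>"
            f"<p>Amount due: {escape(str(invoice['amount']))}</p>"
            "</section>"
        )
    return (
        "<!DOCTYPE html><html><head><style>" + PRINT_CSS + "</style></head>"
        '<body style="font-family: sans-serif;">' + "".join(sections) + "</body></html>"
    )
```

Escaping the values with html.escape matters here because invoice data often comes from user input, and unescaped markup could break the layout of the generated document.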
Headers and Footers with Page Numbers
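In Puppeteer and Playwright, you can inject header/footer HTML with the page.pdf() call. The templates do not inherit the page's stylesheet, so style them inline, and leave enough top and bottom margin for them to be visible: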
await page.pdf({
  path: 'report.pdf',
  format: 'A4',
  displayHeaderFooter: true,
  headerTemplate: '<div style="font-size:10px;text-align:center;width:100%;">Monthly Report</div>',
  footerTemplate: '<div style="font-size:10px;text-align:center;width:100%;">Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>',
  margin: { top: '30mm', bottom: '20mm' },
});
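The same options exist in Playwright's Python API under snake_case names (display_header_footer, header_template, footer_template). A minimal sketch, assuming Playwright is installed via pip; the two function names are hypothetical, and the options are built separately so they can be inspected without launching a browser.

```python
from html import escape


def pdf_options_with_page_numbers(output_path: str, title: str) -> dict:
    """Build keyword arguments for page.pdf() with a title header and a numbered footer."""
    style = "font-size:10px;text-align:center;width:100%;"
    return {
        "path": output_path,
        "format": "A4",
        "display_header_footer": True,
        "header_template": f'<div style="{style}">{escape(title)}</div>',
        # Chromium fills in elements with the classes pageNumber and totalPages.
        "footer_template": (
            f'<div style="{style}">Page <span class="pageNumber"></span>'
            ' of <span class="totalPages"></span></div>'
        ),
        "margin": {"top": "30mm", "bottom": "20mm"},
    }


def render_report(html_content: str, output_path: str, title: str) -> None:
    """Render html_content to a PDF with the header and footer above."""
    # Imported lazily because Playwright is a third-party dependency.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.set_content(html_content, wait_until="networkidle")
            page.pdf(**pdf_options_with_page_numbers(output_path, title))
        finally:
            browser.close()
```

A call such as render_report(html, "report.pdf", "Monthly Report") would produce the same layout as the JavaScript example.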
Which Approach Should You Use?
| Approach | CSS Support | Scalability | Setup Cost | Best For |
|---|---|---|---|---|
| wkhtmltopdf | Poor | Medium | Low | Simple docs, legacy projects |
| Puppeteer | Excellent | Low | Medium | Full control, own infra |
| Playwright | Excellent | Low | Medium | Python or multi-language teams, own infra |
| Screenshot API | Excellent | High | None | Serverless, SaaS, scale |
Summary
For most SaaS applications generating invoices, reports, or exports, a screenshot API is the fastest path to production-quality PDFs without managing browser infrastructure. For teams that need full control - custom headers, complex JS, or offline generation - Puppeteer or Playwright is the better fit.
Start generating PDFs free with ScreenshotCore - no credit card, 100 requests/month included.
