Launching a headless browser just to generate some PDFs. Turns out, if you want ...

doix · on Dec 23, 2023

I did the same. We had a tool that let you export to PDF, and that PDF would be sent to our customers. Initially we just used the print functionality in the user's browser, but that caused the output to vary based on the browser/OS used.

People complained that the generated PDFs were slightly different. So instead I had the client send over the entire HTML in a POST request, opened it in headless Chrome with --print-to-pdf, and sent the result back to the client.
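
A minimal sketch of the server-side step described above, assuming a Chrome/Chromium binary is on the PATH under the name `chromium` (adjust for your system). The function names `build_print_command` and `html_to_pdf` are made up for illustration, and the POST handling itself is left out:

```python
import subprocess
import tempfile
from pathlib import Path


def build_print_command(chrome_binary: str, html_path: Path, pdf_path: Path) -> list[str]:
    """Build the argv for a one-shot headless print-to-PDF run."""
    return [
        chrome_binary,
        "--headless",
        f"--print-to-pdf={pdf_path}",
        # Chrome takes the page to print as a URL, so pass a file:// URI.
        html_path.resolve().as_uri(),
    ]


def html_to_pdf(html: str, chrome_binary: str = "chromium") -> bytes:
    """Write the posted HTML to a temp file, print it, and return the PDF bytes.

    The default binary name "chromium" is an assumption; it may be
    "google-chrome", "chrome", or a full path on your machine.
    """
    with tempfile.TemporaryDirectory() as tmp:
        html_path = Path(tmp) / "input.html"
        pdf_path = Path(tmp) / "output.pdf"
        html_path.write_text(html, encoding="utf-8")
        subprocess.run(
            build_print_command(chrome_binary, html_path, pdf_path),
            check=True,   # raise if the browser exits with an error
            timeout=60,   # don't let a stuck browser hang the request
        )
        return pdf_path.read_bytes()
```

Since every export goes through the same browser build on the server, the output no longer depends on what the customer happens to be running.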

quyse · on Dec 22, 2023

I recently implemented just the same thing, but for SVG -> PNG conversion. I found that SVG rendering support is crap in every conversion tool and library I've tried. Apparently even Chrome has some basic features missing, for example when doing text on a path. So far Selenium + headless Firefox performs the best ¯\_(ツ)_/¯
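
Roughly what the Selenium + headless Firefox route can look like in Python. This is a sketch, not the setup from the comment above: it assumes the `selenium` package (version 4) and Firefox are installed, and `to_file_uri` and `svg_to_png` are hypothetical helper names:

```python
from pathlib import Path


def to_file_uri(path: str) -> str:
    """Turn a local path into the file:// URI the browser will load."""
    return Path(path).resolve().as_uri()


def svg_to_png(svg_path: str, png_path: str) -> None:
    """Open the SVG in headless Firefox and screenshot its root element."""
    # Imported lazily so the helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(to_file_uri(svg_path))
        # Capture only the <svg> element rather than the whole window.
        driver.find_element(By.CSS_SELECTOR, "svg").screenshot(png_path)
    finally:
        driver.quit()
```

Usage would be something like `svg_to_png("drawing.svg", "drawing.png")`. The output size follows whatever size the browser lays the SVG out at, so a file without explicit width/height may need a wrapper page or a fixed window size.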

unnouinceput · on Dec 23, 2023

I had a Chromium component added to a project just to show the users the help file, which was a giant PDF document. The PDF file was from a 3rd-party vendor who didn't know better/refused to change the system, so we had to show it "as is" to the users. Every PDF reader component we tried failed because the PDF file had some crappy features in it that none of those components knew how to parse. The Chromium engine, for all the hate it gets nowadays, had no problem with any of those PDF files.

vgalin · on Dec 22, 2023

I wrote a Python package [1] that does something similar! It allows the generation of images from HTML+CSS strings or files (or even other files like SVGs) and could probably handle PDF generation too. It uses the headless version of Chrome/Chromium or Edge behind the scenes.

Writing this package made me realize that even big projects (such as Chromium) sometimes have features that just don't work. Edge headless wouldn't let you take screenshots up until recently, and I still encountered issues with Firefox last time I tried to add support for it in the package. I also stumbled upon weird behaviors of Chrome CDP when trying to implement an alternative to using the headless mode, and these issues eventually fixed themselves after some Chrome updates.

[1] https://github.com/vgalin/html2image

rudasn · on Dec 22, 2023

Yeah, it's the same concept: instead of .screenshot you do .pdf in Puppeteer.

But with PDFs the money is in getting those headers and footers consistent and on every page, so you do need some handcrafted HTML and print styling for that (hint: the answer is tables).
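
The comment doesn't spell out the table trick, so this is one common reading of it rather than rudasn's actual markup: wrap the whole document in a table, and rely on the browser repeating `<thead>` and `<tfoot>` on each printed page when the table breaks across pages. A small Python sketch that builds such a page (the `paged_html` name and the `page` class are made up for illustration):

```python
from html import escape


def paged_html(header: str, body_html: str, footer: str) -> str:
    """Wrap body_html in a table whose header/footer rows can repeat per page.

    header and footer are plain text and get escaped; body_html is inserted
    as-is, so it must already be trusted, well-formed HTML.
    """
    return f"""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
  table.page {{ width: 100%; border-collapse: collapse; }}
  /* These are the default display values for thead/tfoot; stated
     explicitly in case a CSS reset has overridden them. */
  thead {{ display: table-header-group; }}
  tfoot {{ display: table-footer-group; }}
</style>
</head>
<body>
<table class="page">
  <thead><tr><td>{escape(header)}</td></tr></thead>
  <tfoot><tr><td>{escape(footer)}</td></tr></tfoot>
  <tbody><tr><td>{body_html}</td></tr></tbody>
</table>
</body>
</html>"""
```

The resulting string can then be handed to whatever does the printing (Puppeteer's .pdf, or headless Chrome directly). How reliably the rows repeat varies between engines, so it is worth checking the output in the browser you actually ship with.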

i386 · on Dec 23, 2023

This is how we exported designs at Canva. It works!

foul · on Dec 22, 2023

I've seen a bit of SaaS and legacy websites-with-invoice-system doing that, with e.g. wkhtmltopdf. It isn't a lightweight solution, but it's a good hammer for a strange nail; a lot of off-the-shelf report systems suck.

phanimahesh · on Dec 23, 2023

This also happens to be the easiest path. There are other options, but no good ones.

polishdude20 · on Dec 23, 2023

We did that at the previous place I worked!

im3w1l · on Dec 23, 2023

I mean, browsers are built for, and the best at, displaying HTML+CSS. Given that those are "living standards", very few other programs can hope to keep up.