
Wednesday, 21 January 2015

Display PDF as Image in HTML


What is the best way to display a pdf in your browser using HTML?

The main goal is to display the PDF as if it were a normal image. The user of the website uploads a PDF file, the file is split into its individual pages using PyPDF, and each page should then be displayed on the webpage as a separate image.

This would be ideal:

<img src="test_page1.pdf">  <img src="test_page2.pdf">
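If each page is first converted to an image on the server, markup like the above could be generated along these lines. This is only a minimal sketch: the function name page_img_tags and the "<stem>_page<N>.<ext>" file naming are hypothetical, chosen to mirror the file names in the example.

```python
from html import escape


def page_img_tags(stem: str, page_count: int, ext: str = "png") -> str:
    """Return one <img> tag per page, e.g. test_page1.png, test_page2.png.

    The naming scheme is an assumption for illustration only.
    """
    tags = []
    for n in range(1, page_count + 1):
        src = f"{stem}_page{n}.{ext}"
        # Escape the file name so it is safe inside an HTML attribute.
        tags.append(f'<img src="{escape(src, quote=True)}" alt="Page {n}">')
    return "\n".join(tags)


print(page_img_tags("test", 2))
```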

I have been trying to do this by using ImageMagick (from Python) to convert my PDFs to images. However, this causes some difficulties:

  • Crashes sometimes
  • The quality of the image severely decreases
  • Size increases

Is there a better way to accomplish this? I would prefer a method that does not depend on the server, if one exists!

Answer

Not client-side, but...

exec("convert sample.pdf sample.jpeg")

Using ImageMagick, http://www.imagemagick.org/script/index.php / http://www.imagemagick.org/script/api.php#python

I think you would really struggle to do this client side. Server side is likely more powerful and can create better quality images faster and with more customisation.
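Since the question mentions Python, the same server-side call could be sketched with the standard subprocess module. This assumes ImageMagick's convert binary is on the PATH (newer ImageMagick releases may expose it as magick instead). The helper names and the default density and quality values are hypothetical, not part of the original answer.

```python
import subprocess


def build_convert_command(pdf_path, out_path, density=150, quality=85):
    """Build the argument list for ImageMagick's convert.

    -density is placed before the input so it controls the resolution
    at which the PDF is rasterised; -quality sets JPEG compression.
    """
    return [
        "convert",
        "-density", str(density),
        str(pdf_path),
        "-quality", str(quality),
        str(out_path),
    ]


def pdf_to_jpeg(pdf_path, out_path, density=150, quality=85):
    """Run convert and raise CalledProcessError if it exits non-zero."""
    cmd = build_convert_command(pdf_path, out_path, density, quality)
    # Passing a list (no shell=True) avoids shell-quoting problems
    # with user-uploaded file names.
    subprocess.run(cmd, check=True)


# Usage (requires ImageMagick to be installed):
# pdf_to_jpeg("sample.pdf", "sample.jpeg")
```

Raising on a non-zero exit code makes the occasional crash mentioned in the question visible to the caller, so it can be handled or retried rather than silently producing no image.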

Answer2

Try this out: http://view.samurajdata.se/. You can use this tool to display a PDF as an image in HTML. It's open source and simple to use.

Please note this method uses PHP.

Answer3

convert is OK, but I got the best results with evince-thumbnailer. It performs about 8x faster for equal quality (YMMV).

On the downside, while evince-thumbnailer does not have the Ghostscript dependency, it does introduce several other GNOME-related dependencies such as GLib and GTK.

You can also use one of the poppler tools like pdftops. These are much faster than convert as well and have almost no dependencies, but quality won't be the same.

Here are some command lines for posterity. I ran a full-page 300 dpi test and added the results. Note that I'm using very large output files so that the thumbnail can be magnified on mouse-over. You can just as well use smaller JPEGs for faster results.

# This produces perfect png at 2480 pixels width.
# Takes averagely long (45sec).
# 2480 is A4 pixel width at 300dpi
evince-thumbnailer -s 2480 -l input.pdf /tmp/output.png
convert /tmp/output.png -quality 85% output.jpg

# This renders perfect jpg at 2400 pixels width.
# Takes extremely long. (6m)
# Quality is comparable
convert -density 300x300 input.pdf -resize 2400x -units PixelsPerInch \
  -quality 85% output.jpg

# This uses poppler to convert to ps and then converts to jpg.
# Takes averagely long (45sec)
# Does not compete quality-wise.
pdftops -l 1 -r 2400 -paper A4 input.pdf /tmp/output.ps
convert /tmp/output.ps output.jpg
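The 2480-pixel figure in the comments follows from the A4 page width (210 mm) at 300 dpi. A small sketch of that arithmetic, in case a different paper size or resolution is needed; the function and constant names are made up for illustration:

```python
A4_WIDTH_MM = 210
A4_HEIGHT_MM = 297
MM_PER_INCH = 25.4


def pixels_at_dpi(length_mm: float, dpi: int) -> int:
    """Convert a physical length in millimetres to pixels at a given dpi."""
    return round(length_mm / MM_PER_INCH * dpi)


print(pixels_at_dpi(A4_WIDTH_MM, 300))   # 2480
print(pixels_at_dpi(A4_HEIGHT_MM, 300))  # 3508
```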
