Converting PDF Files to PNG Images with Python on Windows

2025-04-09 07:15:59

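This article shows, by example, how to convert PDF files to PNG images with Python on Windows. It is shared here for reference; the details are as follows:

I recently needed to convert PDF files to images at work and wanted to do it in Python. After a long search online, I did find some code.

1. The first piece of code I found turned out to do the reverse: it converts an image to a PDF, not a PDF to an image.

Reference: https://zhidao.baidu.com/question/745221795058982452.html

The code is as follows: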

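#!/usr/bin/env python
# Draws the image named on the command line onto one landscape A4 PDF page
# (image -> PDF, the opposite of what this article needs).
import os
import sys
from reportlab.lib.pagesizes import A4, landscape
from reportlab.pdfgen import canvas

f = sys.argv[1]
# Strip the directory part and the last four characters (e.g. ".jpg").
filename = ''.join(f.split('/')[-1:])[:-4]
f_jpg = filename + '.jpg'
print(f_jpg)

def conpdf(f_jpg):
    f_pdf = filename + '.pdf'
    (w, h) = landscape(A4)
    c = canvas.Canvas(f_pdf, pagesize=landscape(A4))
    # Stretch the image over the whole page.
    c.drawImage(f, 0, 0, w, h)
    c.save()
    print("okkkkkkkk.")

conpdf(f_jpg)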
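
The listing derives the output name by splitting on '/' and cutting the last four characters, which assumes forward slashes and a three-letter extension. A minimal sketch of a more tolerant variant follows; the helper name output_pdf_name and the sample paths are hypothetical and not part of the original script.

```python
import ntpath  # Windows path rules; importable on any operating system


def output_pdf_name(image_path):
    """Return '<stem>.pdf' for an image path written with either slash style."""
    base = ntpath.basename(image_path)   # drop the directory part
    stem, _ext = ntpath.splitext(base)   # drop the extension, whatever its length
    return stem + ".pdf"


# Hypothetical sample paths, used only for illustration.
print(output_pdf_name(r"C:\scans\page01.jpeg"))
print(output_pdf_name("C:/scans/page01.jpg"))
```

Using ntpath rather than os.path keeps the Windows splitting rules even if the snippet is tried on another system.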

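2. The second article was quite detailed, but its code targets Linux, so it was of no use here either.

3. A third article pointed out that the PythonMagick library can do this. It requires downloading the wheel PythonMagick-0.9.10-cp27-none-win_amd64.whl, which is the 64-bit build.

I have to admit I made another mistake here. I had downloaded Python 2.7 from the official Python site assuming it was the 64-bit build, but it was actually 32-bit, so the interpreter (32-bit) did not match the PythonMagick wheel (64-bit). It took me until past midnight to spot this.

4. Next, I kept searching and found many Stack Overflow questions, from which I picked up two code snippets. Both require the PyPDF2 and ghostscript modules to be installed first.

First install the PyPDF2, PythonMagick, and ghostscript modules with pip.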
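
A 32-bit/64-bit mismatch like this can be checked before installing anything. The sketch below reports the bitness of the running interpreter and compares it with the platform tag at the end of a wheel file name; the helper names are hypothetical, and the tag check is a simplification that only recognises the win32 and win_amd64 tags.

```python
import struct
import sys


def interpreter_bits():
    """Pointer size of the running interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def wheel_platform_bits(wheel_name):
    """Guess the bitness from a wheel file name; None if the tag is not recognised."""
    if wheel_name.endswith("win_amd64.whl"):
        return 64
    if wheel_name.endswith("win32.whl"):
        return 32
    return None


wheel = "PythonMagick-0.9.10-cp27-none-win_amd64.whl"
print("Python %s is %d-bit" % (sys.version.split()[0], interpreter_bits()))
print("Wheel targets %s-bit" % wheel_platform_bits(wheel))
if wheel_platform_bits(wheel) != interpreter_bits():
    print("Mismatch: pick the wheel that matches the interpreter.")
```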

C:\Users\Administrator>pip install PyPDF2
Collecting PyPDF2
 Using cached PyPDF2-1.25.1.tar.gz
Installing collected packages: PyPDF2
 Running setup.py install for PyPDF2
Successfully installed PyPDF2-1.25.1
You are using pip version 7.1.2, however version 8.1.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
C:\Users\Administrator>pip install C:\PythonMagick-0.9.10-cp27-none-win_amd64.whl
Processing c:\pythonmagick-0.9.10-cp27-none-win_amd64.whl
Installing collected packages: PythonMagick
Successfully installed PythonMagick-0.9.10
You are using pip version 7.1.2, however version 8.1.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
C:\Users\Administrator>pip install ghostscript
Collecting ghostscript
 Downloading ghostscript-0.4.1.tar.bz2
Requirement already satisfied (use --upgrade to upgrade): setuptools in c:\python27\lib\site-packages (from ghostscript)
Installing collected packages: ghostscript
 Running setup.py install for ghostscript
Successfully installed ghostscript-0.4.1
You are using pip version 7.1.2, however version 8.1.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
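
After the installs finish, a quick import check can confirm that the modules are visible to the interpreter that will run the scripts. This is only a sketch: the helper name check_modules is hypothetical, and a successful import does not prove that the Ghostscript program itself is installed.

```python
import importlib


def check_modules(names):
    """Try to import each module and map its name to True or False."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except Exception:
            # Usually ImportError; caught broadly in case a package raises
            # a different error when a native library it needs is missing.
            status[name] = False
    return status


result = check_modules(["PyPDF2", "PythonMagick", "ghostscript"])
for name in sorted(result):
    print("%-14s %s" % (name, "ok" if result[name] else "MISSING"))
```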

The code follows.

Code 1:

import os
import ghostscript  # imported by the original listing; not called directly here
from PyPDF2 import PdfFileReader, PdfFileWriter
from tempfile import NamedTemporaryFile
from PythonMagick import Image

reader = PdfFileReader(open("C:/deep.pdf", "rb"))
for page_num in range(reader.getNumPages()):
    # Write the current page to its own temporary single-page PDF.
    writer = PdfFileWriter()
    writer.addPage(reader.getPage(page_num))
    temp = NamedTemporaryFile(prefix=str(page_num), suffix=".pdf", delete=False)
    writer.write(temp)
    print(temp.name)
    tempname = temp.name
    temp.close()
    # Load the single-page PDF and save it as a PNG.
    im = Image(tempname)
    #im.density("3000") # DPI, for better quality
    #im.read(tempname)
    im.write("some_%d.png" % (page_num))
    os.remove(tempname)

Code 2:

import sys
import PyPDF2
import PythonMagick
import ghostscript

pdffilename = r"C:\deep.pdf"
# PyPDF2 is used only to count the pages.
pdf_im = PyPDF2.PdfFileReader(open(pdffilename, "rb"))
print('1')
npage = pdf_im.getNumPages()
print('Converting %d pages.' % npage)
for p in range(npage):
    im = PythonMagick.Image()
    im.density('300')
    # "file.pdf[p]" selects page p (zero-based) of the PDF.
    im.read(pdffilename + '[' + str(p) +']')
    im.write('file_out-' + str(p)+ '.png')
    #print pdffilename + '[' + str(p) +']','file_out-' + str(p)+ '.png'
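
Windows paths with backslashes are easy to get wrong in Python string literals: "C:\deep.pdf" happens to work only because \d is not an escape sequence. The short illustration below uses a hypothetical file name, temp.pdf, where the same style silently produces a different path.

```python
# Hypothetical file names, chosen only to show the escape behaviour.
plain = "C:\temp.pdf"     # "\t" is parsed as a TAB character
raw = r"C:\temp.pdf"      # raw string: the backslash is kept literally
forward = "C:/temp.pdf"   # forward slashes, the style Code 1 uses

print(repr(plain))
print(repr(raw))
print(repr(forward))
```

A raw string or forward slashes avoids the problem without having to remember which letters form escape sequences.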

Both raised an error when run. This is the error from Code 2:

Traceback (most recent call last):
 File "C:\c.py", line 15, in <module>
 im.read(pdffilename + '[' + str(p) +']')
RuntimeError: pythonw.exe: PostscriptDelegateFailed `C:\DEEP.pdf': No such file or directory @ error/pdf.c/ReadPDFImage/713

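The error was always raised on the line im.read(pdffilename + '[' + str(p) +']').

I searched online using the error message but found nothing useful. Still, it looked related to Ghostscript, so I went looking for an installer. I found a download link on GitHub, but the download would not start when I clicked through.

Finally I found this file in the CSDN download section: GhostScript_Windows_9.15_win32_win64. I installed the 64-bit version and ran the code above again, and it then worked.

Code 2, however, needs the following change; otherwise it still fails with the No such file or directory @ error/pdf.c/ReadPDFImage/713 error: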
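
Since the PostscriptDelegateFailed error went away once Ghostscript was installed, it can help to check for the Ghostscript program up front. The sketch below searches PATH for the usual console executables (gswin64c.exe and gswin32c.exe on Windows, gs elsewhere). The helper name find_ghostscript is hypothetical, and a miss is only a hint: the installer may not add Ghostscript to PATH, and ImageMagick on Windows may locate it by other means, such as the registry.

```python
import os

# Console executables shipped with Ghostscript (64-bit, 32-bit, Unix-like).
GS_NAMES = ("gswin64c.exe", "gswin32c.exe", "gs")


def find_ghostscript(search_path=None):
    """Return the first Ghostscript executable found on the search path, or None."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for folder in search_path.split(os.pathsep):
        if not folder:
            continue  # skip empty PATH entries
        for name in GS_NAMES:
            candidate = os.path.join(folder, name)
            if os.path.isfile(candidate):
                return candidate
    return None


print(find_ghostscript() or "Ghostscript executable not found on PATH")
```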

# Code 2
import sys
import PyPDF2
import PythonMagick
import ghostscript

pdffilename = r"C:\deep.pdf"
# PyPDF2 is used only to count the pages.
pdf_im = PyPDF2.PdfFileReader(open(pdffilename, "rb"))
print('1')
npage = pdf_im.getNumPages()
print('Converting %d pages.' % npage)
for p in range(npage):
    # Changed: pass "file.pdf[p]" to the constructor instead of calling im.read().
    im = PythonMagick.Image(pdffilename + '[' + str(p) +']')
    # Set after the page is loaded, so it likely does not change the render resolution.
    im.density('300')
    #im.read(pdffilename + '[' + str(p) +']')
    im.write('file_out-' + str(p)+ '.png')
    #print pdffilename + '[' + str(p) +']','file_out-' + str(p)+ '.png'
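
The [p] suffix appended to the file name is ImageMagick's way of selecting a single page of a multi-page file, counting from zero. The sketch below builds the input specifier and output name for each page without touching PythonMagick, which makes the naming easy to check on its own; the helper name page_jobs and its out_prefix parameter are hypothetical.

```python
def page_jobs(pdf_path, page_count, out_prefix="file_out-"):
    """Return (input_specifier, output_name) pairs, one per page."""
    jobs = []
    for p in range(page_count):
        source = "%s[%d]" % (pdf_path, p)     # e.g. C:\deep.pdf[0]
        target = "%s%d.png" % (out_prefix, p)  # e.g. file_out-0.png
        jobs.append((source, target))
    return jobs


for source, target in page_jobs(r"C:\deep.pdf", 3):
    print("%s -> %s" % (source, target))
```

Each pair could then be handed to the conversion loop above, with source going to PythonMagick.Image and target to im.write.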

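The main lesson from this exercise: most of the time went into looking up material and verifying whether it was actually useful. Being able to search effectively matters a lot.

While searching, I found very few Chinese-language articles about PythonMagick; most of the helpful results were from abroad. Even those posts did not solve my problem or give a useful lead, and in the end I solved it by working through it myself.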


I hope this article is helpful for your Python programming.
