The Apache PDFBox® library is an open source Java tool for working with PDF documents. This project allows creation of new PDF documents, manipulation of existing documents and the ability to extract content from documents. Apache PDFBox also includes several command-line utilities. Apache PDFBox is published under the Apache License v2.0.
Extract Text: Extract Unicode text from PDF files.
Split & Merge: Split a single PDF into many files or merge multiple PDF files.
Fill Forms: Extract data from PDF forms or fill a PDF form.
Preflight: Validate PDF files against the PDF/A-1b standard.
Print: Print a PDF file using the standard Java printing API.
Save as Image: Save PDFs as image files, such as PNG or JPEG.
Create PDFs: Create a PDF from scratch, with embedded fonts and images.
Signing: Digitally sign PDF files.
To use the latest release you'll need to add the following dependency:
PDFBox and Java 8
Because Java 8 switched its color management module to "LittleCMS", users may experience slow performance in color operations. A workaround is to disable LittleCMS in favor of the old KCMS (Kodak Color Management System) by setting the following system property:
System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider");
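The JDK typically reads this property once, when its color management code is first used, so the call is safest at the very start of the program. A minimal sketch, assuming a hypothetical helper class `CmmConfig` (not part of PDFBox) and a JDK that still ships the KCMS provider:

```java
// Hypothetical helper, not part of PDFBox: selects KCMS before any
// color-management class is touched.
final class CmmConfig {
    static final String CMM_PROPERTY = "sun.java2d.cmm";
    static final String KCMS_PROVIDER = "sun.java2d.cmm.kcms.KcmsServiceProvider";

    private CmmConfig() {}

    /**
     * Sets the KCMS provider unless a module was already chosen,
     * for example with -Dsun.java2d.cmm=... on the command line.
     * Returns the value that is in effect afterwards.
     */
    static String preferKcms() {
        if (System.getProperty(CMM_PROPERTY) == null) {
            System.setProperty(CMM_PROPERTY, KCMS_PROVIDER);
        }
        return System.getProperty(CMM_PROPERTY);
    }

    public static void main(String[] args) {
        // Call this first, before rendering or any other color operation.
        System.out.println("Color management module: " + preferKcms());
    }
}
```

Leaving an existing value untouched lets a command-line `-D` setting override the default chosen in code.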
PDFBox 2.0.4 introduced a new command line setting
which may improve the performance of rendering PDFs on some systems especially if there are a lot of images on a page.
The three PDFBox components are named pdfbox, fontbox and xmpbox. The Maven groupId of all PDFBox components is org.apache.pdfbox.
PDFBox has the following basic dependencies:
Commons Logging is a generic wrapper around different logging frameworks, so you'll either need to also use a logging library like log4j or let commons-logging fall back to the standard java.util.logging API included in the Java platform.
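If Commons Logging falls back to java.util.logging, log output can be tuned with the standard JDK API. The sketch below is an illustration only: the helper class `PdfBoxLogging` is hypothetical, and it assumes that PDFBox loggers are named after their classes and therefore sit under the `org.apache.pdfbox` logger.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of PDFBox.
final class PdfBoxLogging {
    // Keep a strong reference: java.util.logging holds loggers weakly, so an
    // unreferenced logger may be garbage collected and lose its level.
    private static final Logger PDFBOX_LOGGER = Logger.getLogger("org.apache.pdfbox");

    private PdfBoxLogging() {}

    /** Sets the threshold for the assumed PDFBox parent logger and returns it. */
    static Logger setLevel(Level level) {
        PDFBOX_LOGGER.setLevel(level);
        return PDFBOX_LOGGER;
    }

    public static void main(String[] args) {
        Logger logger = setLevel(Level.WARNING);
        // Child loggers without their own level inherit this threshold.
        System.out.println(logger.getName() + " -> " + logger.getLevel());
    }
}
```

Setting the level on the parent logger avoids configuring each class individually, because java.util.logging resolves levels through the dotted name hierarchy.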
For font handling the fontbox component is needed.
To support XMP metadata the xmpbox component is needed.
To add the pdfbox, fontbox, xmpbox and commons-logging jars to your application, the easiest thing is to declare the Maven dependency shown below. This gives you the main pdfbox library directly and the other required jars as transitive dependencies.
PDFBox does not ship with all features enabled. Third party components are necessary to get full support for certain functionality.
PDF supports embedded image files. However, support for some formats requires third-party libraries which are distributed under terms incompatible with the Apache 2.0 license:
These libraries are optional and will be loaded if present on the classpath. Otherwise, support for these image formats is disabled and a warning is logged when an unsupported image is encountered.
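One way to see whether an optional decoder was picked up is to ask the JDK which image readers are registered. This is a sketch under an assumption not stated above, namely that the optional libraries register themselves as Java ImageIO plugins. The class `ImageFormatSupport` and the format name "JBIG2" are illustrative.

```java
import javax.imageio.ImageIO;

// Hypothetical helper, not part of PDFBox.
final class ImageFormatSupport {
    private ImageFormatSupport() {}

    /** Returns true if some ImageIO reader is registered for the format name. */
    static boolean hasReader(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // "png" ships with the JDK; "JBIG2" is an assumed format name that
        // would only be found if a suitable plugin jar is on the classpath.
        for (String name : new String[] {"png", "JBIG2"}) {
            System.out.println(name + " reader available: " + hasReader(name));
        }
    }
}
```

Running such a check at startup surfaces a missing plugin early, instead of waiting for a warning when the first unsupported image is rendered.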
Maven dependencies for these components can be found in parent/pom.xml. Change the scope of the components if needed. Please make sure that any third party licenses are suitable for your project.
To include the JBIG2 library the following part can be included in your project pom.xml:
Encryption and Signing
Encrypting and signing PDFs requires the bcprov, bcmail and bcpkix libraries from the Legion of the Bouncy Castle. These can be included in your Maven project using the following dependencies:
Java Cryptography Extension (JCE)
256-bit AES encryption requires a JDK with "unlimited strength" cryptography, which requires extra files to be installed. For JDK 7, see Java Cryptography Extension (JCE). If these files are not installed, building PDFBox will throw an exception with the following message:
JCE unlimited strength jurisdiction policy files are not installed
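Whether the running JDK permits 256-bit AES can be checked up front with the standard `javax.crypto.Cipher.getMaxAllowedKeyLength` call. A minimal sketch; the class `JcePolicyCheck` is a hypothetical helper and not part of PDFBox:

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

// Hypothetical helper, not part of PDFBox.
final class JcePolicyCheck {
    private JcePolicyCheck() {}

    /** Largest AES key size, in bits, allowed by the installed policy files. */
    static int maxAesKeyBits() {
        try {
            return Cipher.getMaxAllowedKeyLength("AES");
        } catch (NoSuchAlgorithmException e) {
            // AES is a required algorithm on standard JDKs, so this is unexpected.
            throw new IllegalStateException("AES is not available", e);
        }
    }

    /** True if the policy allows the 256-bit keys needed for AES-256. */
    static boolean supportsAes256() {
        return maxAesKeyBits() >= 256;
    }

    public static void main(String[] args) {
        System.out.println("Max AES key length: " + maxAesKeyBits() + " bits");
        System.out.println("256-bit AES allowed: " + supportsAes256());
    }
}
```

With limited-strength policy files the reported maximum is typically 128 bits. With unlimited-strength policy it is `Integer.MAX_VALUE`.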