Question:

VB.Net: Merge multiple PDFs into one and export

焦博实
2023-03-14

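I have to merge multiple PDFs into a single PDF.

I am using the iTextSharp library. I took some existing code (from here) and tried to use it; the original code is in C#, and I converted it to VB.NET.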

' Requires Imports System.IO, iTextSharp.text and iTextSharp.text.pdf
Private Function MergeFiles(ByVal sourceFiles As List(Of Byte())) As Byte()
    Dim mergedPdf As Byte() = Nothing
    Using ms As New MemoryStream()
        Using document As New Document()
            Using copy As New PdfCopy(document, ms)
                document.Open()
                For i As Integer = 0 To sourceFiles.Count - 1
                    Dim reader As New PdfReader(sourceFiles(i))
                    ' loop over the pages in that document
                    Dim n As Integer = reader.NumberOfPages
                    Dim page As Integer = 0
                    While page < n
                        page = page + 1
                        copy.AddPage(copy.GetImportedPage(reader, page))
                    End While
                Next
            End Using
        End Using
        mergedPdf = ms.ToArray()
    End Using
    ' Return the merged document as a byte array
    Return mergedPdf
End Function

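I am now getting the following error:

An item with the same key has already been added.

I did some debugging and tracked the problem down to the following line: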

copy.AddPage(copy.GetImportedPage(reader, page))

Why is this error happening?

3 answers

姜鸿
2023-03-14

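I realize I am quite late to the party, but after reading @BrunoLowagie's comment, I wanted to see whether I could put something together myself using the examples from the sample chapter he linked. It may be overkill, but I put together some code that merges multiple PDFs into a single file and posted it on the Code Review SE site (the post, "VB.NET - Error Handling in Generic Class for PDF Merge", contains the complete class code). For now it only merges PDF files, but I plan to add methods for other functionality later.

The "master" method (at the end of the Class block in the linked post, and also posted below for reference) handles the actual merging of the PDF files, while the multiple overloads offer a number of options for defining the list of source files. So far, I have included the following features: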

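  • The methods return a System.IO.FileInfo object if the merge succeeds.
  • Provide a System.IO.DirectoryInfo object or a System.String identifying a path, and it collects all PDF files in that directory (including subdirectories, if specified) to merge.
  • Provide a List(Of System.String) or a List(Of System.IO.FileInfo) to specify the PDFs to merge.
  • Determine how the PDFs should be sorted before merging (especially useful if you use one of the MergeAll methods to pick up all the PDF files in a directory).
  • If the specified output PDF file already exists, you can specify whether to overwrite it. (I am considering adding the "ability" to automatically adjust the output PDF file's name if it already exists.)
  • The Warning and Error properties provide a way to get feedback in the calling method, whether or not the merge succeeds.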

Once the code is in place, it can be used like this:

Dim PDFDir As New IO.DirectoryInfo("C:\Test Data\PDF\")
Dim ResultFile As IO.FileInfo = Nothing
Dim Merger As New PDFManipulator

ResultFile = Merger.MergeAll(PDFDir, "C:\Test Data\PDF\Merged.pdf", True, PDFManipulator.PDFMergeSortOrder.FileName, True)
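
The MergeAll overload called above is not reproduced in this post. As a rough sketch only, here is one way such a method might gather and sort the files before handing them to the merge routine. CollectPdfs, SortOrderSketch and the parameter names are hypothetical stand-ins and are not taken from the linked class; only standard System.IO and LINQ calls are used.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module PdfCollectorSketch
    ' Hypothetical stand-in for the PDFMergeSortOrder enum referenced in the post
    Public Enum SortOrderSketch
        Original
        FileDate
        FileName
        FileNameWithDirectory
    End Enum

    ' Gathers the PDF files in a directory (optionally including subdirectories)
    ' and returns them in the requested order.
    Public Function CollectPdfs(ByVal sourceDir As DirectoryInfo,
                                ByVal includeSubdirectories As Boolean,
                                ByVal order As SortOrderSketch) As List(Of FileInfo)
        Dim scope As SearchOption = If(includeSubdirectories,
                                       SearchOption.AllDirectories,
                                       SearchOption.TopDirectoryOnly)
        Dim files As List(Of FileInfo) = sourceDir.GetFiles("*.pdf", scope).ToList()

        Select Case order
            Case SortOrderSketch.FileDate
                Return files.OrderBy(Function(f) f.LastWriteTime).ToList()
            Case SortOrderSketch.FileName
                Return files.OrderBy(Function(f) f.Name).ToList()
            Case SortOrderSketch.FileNameWithDirectory
                Return files.OrderBy(Function(f) f.FullName).ToList()
            Case Else
                ' Original: keep the order in which the file system returned the files
                Return files
        End Select
    End Function
End Module
```

If the output file is written into the same directory that is being scanned, a later run would pick it up as an input, so the caller may want to exclude it from the list.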

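Here is the "master" method. As I said, it may be overkill (and I am still tweaking it), but I wanted to do my best to make it work as effectively as possible. Obviously, it requires a reference to itextsharp.dll to access the library's functions.

In this post, I have commented out the references to the class's Error and Warning properties to help reduce any confusion.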

Public Function Merge(ByVal PDFFiles As List(Of System.IO.FileInfo), ByVal OutputFileName As String, ByVal OverwriteExistingPDF As Boolean, ByVal SortOrder As PDFMergeSortOrder) As System.IO.FileInfo
    Dim ResultFile As System.IO.FileInfo = Nothing
    Dim ContinueMerge As Boolean = True

    If OverwriteExistingPDF Then
        If System.IO.File.Exists(OutputFileName) Then
            Try
                System.IO.File.Delete(OutputFileName)
            Catch ex As Exception
                ContinueMerge = False

                'If Errors Is Nothing Then
                '    Errors = New List(Of String)
                'End If

                'Errors.Add("Could not delete existing output file.")

                Throw
            End Try
        End If
    End If

    If ContinueMerge Then
        Dim OutputPDF As iTextSharp.text.Document = Nothing
        Dim Copier As iTextSharp.text.pdf.PdfCopy = Nothing
        Dim PDFStream As System.IO.FileStream = Nothing
        Dim SortedList As New List(Of System.IO.FileInfo)

        Try
            Select Case SortOrder
                Case PDFMergeSortOrder.Original
                    SortedList = PDFFiles
                Case PDFMergeSortOrder.FileDate
                    SortedList = PDFFiles.OrderBy(Function(f As System.IO.FileInfo) f.LastWriteTime).ToList
                Case PDFMergeSortOrder.FileName
                    SortedList = PDFFiles.OrderBy(Function(f As System.IO.FileInfo) f.Name).ToList
                Case PDFMergeSortOrder.FileNameWithDirectory
                    SortedList = PDFFiles.OrderBy(Function(f As System.IO.FileInfo) f.FullName).ToList
            End Select

            If Not IO.Directory.Exists(New IO.FileInfo(OutputFileName).DirectoryName) Then
                Try
                    IO.Directory.CreateDirectory(New IO.FileInfo(OutputFileName).DirectoryName)
                Catch ex As Exception
                    ContinueMerge = False

                    'If Errors Is Nothing Then
                    '    Errors = New List(Of String)
                    'End If

                    'Errors.Add("Could not create output directory.")

                    Throw
                End Try
            End If

            If ContinueMerge Then
                OutputPDF = New iTextSharp.text.Document
                ' FileMode.Create truncates an existing file, so no bytes of an older, longer PDF are left behind
                PDFStream = New System.IO.FileStream(OutputFileName, System.IO.FileMode.Create)
                Copier = New iTextSharp.text.pdf.PdfCopy(OutputPDF, PDFStream)

                OutputPDF.Open()

                For Each PDF As System.IO.FileInfo In SortedList
                    If ContinueMerge Then
                        Dim InputReader As iTextSharp.text.pdf.PdfReader = Nothing

                        Try
                            InputReader = New iTextSharp.text.pdf.PdfReader(PDF.FullName)

                            For page As Integer = 1 To InputReader.NumberOfPages
                                Copier.AddPage(Copier.GetImportedPage(InputReader, page))
                            Next page

                            If InputReader.IsRebuilt Then
                                'If Warnings Is Nothing Then
                                '    Warnings = New List(Of String)
                                'End If

                                'Warnings.Add("Damaged PDF: " & PDF.FullName & " repaired and successfully merged into output file.")
                            End If
                        Catch InvalidEx As iTextSharp.text.exceptions.InvalidPdfException
                            'Skip this file
                            'If Errors Is Nothing Then
                            '    Errors = New List(Of String)
                            'End If

                            'Errors.Add("Invalid PDF: " & PDF.FullName & " not merged into output file.")
                        Catch FormatEx As iTextSharp.text.pdf.BadPdfFormatException
                            'Skip this file
                            'If Errors Is Nothing Then
                            '    Errors = New List(Of String)
                            'End If

                            'Errors.Add("Bad PDF Format: " & PDF.FullName & " not merged into output file.")
                        Catch PassworddEx As iTextSharp.text.exceptions.BadPasswordException
                            'Skip this file
                            'If Errors Is Nothing Then
                            '    Errors = New List(Of String)
                            'End If

                            'Errors.Add("Password-protected PDF: " & PDF.FullName & " not merged into output file.")
                        Catch OtherEx As Exception
                            ContinueMerge = False
                        Finally
                            If Not InputReader Is Nothing Then
                                InputReader.Close()
                                InputReader.Dispose()
                            End If
                        End Try
                    End If
                Next PDF
            End If
        Catch ex As iTextSharp.text.pdf.PdfException
            ResultFile = Nothing
            ContinueMerge = False

            'If Errors Is Nothing Then
            '    Errors = New List(Of String)
            'End If

            'Errors.Add("iTextSharp Error: " & ex.Message)

            If System.IO.File.Exists(OutputFileName) Then
                If Not OutputPDF Is Nothing Then
                    OutputPDF.Close()
                    OutputPDF.Dispose()
                End If

                If Not PDFStream Is Nothing Then
                    PDFStream.Close()
                    PDFStream.Dispose()
                End If

                If Not Copier Is Nothing Then
                    Copier.Close()
                    Copier.Dispose()
                End If

                System.IO.File.Delete(OutputFileName)
            End If

            Throw
        Catch other As Exception
            ResultFile = Nothing
            ContinueMerge = False

            'If Errors Is Nothing Then
            '    Errors = New List(Of String)
            'End If

            'Errors.Add("General Error: " & other.Message)

            If System.IO.File.Exists(OutputFileName) Then
                If Not OutputPDF Is Nothing Then
                    OutputPDF.Close()
                    OutputPDF.Dispose()
                End If

                If Not PDFStream Is Nothing Then
                    PDFStream.Close()
                    PDFStream.Dispose()
                End If

                If Not Copier Is Nothing Then
                    Copier.Close()
                    Copier.Dispose()
                End If

                System.IO.File.Delete(OutputFileName)
            End If

            Throw
        Finally
            If Not OutputPDF Is Nothing Then
                OutputPDF.Close()
                OutputPDF.Dispose()
            End If

            If Not PDFStream Is Nothing Then
                PDFStream.Close()
                PDFStream.Dispose()
            End If

            If Not Copier Is Nothing Then
                Copier.Close()
                Copier.Dispose()
            End If

            If System.IO.File.Exists(OutputFileName) Then
                If ContinueMerge Then
                    ResultFile = New System.IO.FileInfo(OutputFileName)

                    If ResultFile.Length <= 0 Then
                        ResultFile = Nothing

                        Try
                            System.IO.File.Delete(OutputFileName)
                        Catch ex As Exception
                            Throw
                        End Try
                    End If
                Else
                    ResultFile = Nothing

                    Try
                        System.IO.File.Delete(OutputFileName)
                    Catch ex As Exception
                        Throw
                    End Try
                End If
            Else
                ResultFile = Nothing
            End If
        End Try
    End If

    Return ResultFile
End Function
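
郎飞龙
2023-03-14

The code marked as correct does not close all of the file streams, so the files stay open in the application and you will not be able to delete the PDFs that are no longer in use.

Here is a better solution: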

' Requires Imports System.IO, iTextSharp.text and iTextSharp.text.pdf
Public Sub MergePDFFiles(ByVal outPutPDF As String)

    Dim StartPath As String = FileArray(0) ' FileArray is a List declared globally
    Dim document = New Document()
    Dim outFile = Path.Combine(outPutPDF) ' outPutPDF is the output path, passed in from another Sub
    Dim writer = New PdfCopy(document, New FileStream(outFile, FileMode.Create))

    Try

        document.Open()
        For Each fileName As String In FileArray

            Dim reader = New PdfReader(Path.Combine(StartPath, fileName))

            Try
                For i As Integer = 1 To reader.NumberOfPages

                    Dim page = writer.GetImportedPage(reader, i)
                    writer.AddPage(page)

                Next i
            Finally
                ' Release the input file even if a page fails to import
                reader.Close()
            End Try

        Next

    Catch ex As Exception
        ' Handle or log the exception here if needed

    Finally

        ' Closing here covers both the success and the failure path
        writer.Close()
        document.Close()

    End Try

End Sub
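
The same goal (no file handles left open) can also be approached with Using blocks, as the question's code already does for Document and PdfCopy. The following is a minimal sketch, assuming iTextSharp 5.x, where Document, PdfCopy and PdfReader implement IDisposable. MergeWithUsing, inputPaths and outputPath are hypothetical names chosen for illustration.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports iTextSharp.text
Imports iTextSharp.text.pdf

Module PdfMergeSketch
    ' Hypothetical helper: merges the PDFs at inputPaths into outputPath.
    ' Each Using block disposes its object even when an exception is thrown.
    Public Sub MergeWithUsing(ByVal inputPaths As IEnumerable(Of String), ByVal outputPath As String)
        Using output As New FileStream(outputPath, FileMode.Create)
            Using document As New Document()
                Using copy As New PdfCopy(document, output)
                    document.Open()
                    For Each inputPath As String In inputPaths
                        ' The reader is released as soon as its pages have been copied
                        Using reader As New PdfReader(inputPath)
                            For pageNumber As Integer = 1 To reader.NumberOfPages
                                copy.AddPage(copy.GetImportedPage(reader, pageNumber))
                            Next
                        End Using
                    Next
                End Using
            End Using
        End Using
    End Sub
End Module
```

Unlike the Sub above, this sketch does not swallow exceptions, so the caller decides how to report a failed merge. If inputPaths is empty, closing the document is likely to raise an exception because no pages were added, so the caller may want to check for that case first.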
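
金宣
2023-03-14

I have a console application that monitors individual folders inside a designated folder, and it then needs to merge all of the PDFs in a folder into a single PDF. I pass in an array of file paths as strings, along with the output file I want.

Here is the function I used.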

Public Shared Function MergePdfFiles(ByVal pdfFiles() As String, ByVal outputPath As String) As Boolean
    Dim result As Boolean = False
    Dim pdfCount As Integer = 0     'total input pdf file count
    Dim f As Integer = 0    'pointer to current input pdf file
    Dim fileName As String
    Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
    Dim pageCount As Integer = 0
    Dim pdfDoc As iTextSharp.text.Document = Nothing    'the output pdf document
    Dim writer As PdfWriter = Nothing
    Dim cb As PdfContentByte = Nothing

    Dim page As PdfImportedPage = Nothing
    Dim rotation As Integer = 0

    Try
        pdfCount = pdfFiles.Length
        If pdfCount > 1 Then
            'Open the 1st item in the array PDFFiles
            fileName = pdfFiles(f)
            reader = New iTextSharp.text.pdf.PdfReader(fileName)
            'Get page count
            pageCount = reader.NumberOfPages

            pdfDoc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1), 18, 18, 18, 18)

            ' FileMode.Create truncates an existing output file instead of writing over its start
            writer = PdfWriter.GetInstance(pdfDoc, New FileStream(outputPath, FileMode.Create))


            With pdfDoc
                .Open()
            End With
            'Instantiate a PdfContentByte object
            cb = writer.DirectContent
            'Now loop thru the input pdfs
            While f < pdfCount
                'Declare a page counter variable
                Dim i As Integer = 0
                'Loop thru the current input pdf's pages starting at page 1
                While i < pageCount
                    i += 1
                    'Get the input page size
                    pdfDoc.SetPageSize(reader.GetPageSizeWithRotation(i))
                    'Create a new page on the output document
                    pdfDoc.NewPage()
                    'If it is the 1st page, we add bookmarks to the page
                    'Now we get the imported page
                    page = writer.GetImportedPage(reader, i)
                    'Read the imported page's rotation
                    rotation = reader.GetPageRotation(i)
                    'Then add the imported page to the PdfContentByte object as a template based on the page's rotation
                    If rotation = 90 Then
                        cb.AddTemplate(page, 0, -1.0F, 1.0F, 0, 0, reader.GetPageSizeWithRotation(i).Height)
                    ElseIf rotation = 270 Then
                        cb.AddTemplate(page, 0, 1.0F, -1.0F, 0, reader.GetPageSizeWithRotation(i).Width + 60, -30)
                    Else
                        cb.AddTemplate(page, 1.0F, 0, 0, 1.0F, 0, 0)
                    End If
                End While
                'Increment f and read the next input pdf file
                f += 1
                If f < pdfCount Then
                    fileName = pdfFiles(f)
                    reader = New iTextSharp.text.pdf.PdfReader(fileName)
                    pageCount = reader.NumberOfPages
                End If
            End While
            'When all done, we close the document so that the pdfwriter object can write it to the output file
            pdfDoc.Close()
            result = True
        End If
    Catch ex As Exception
        Return False
    End Try
    Return result
End Function
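
As a rough illustration of the calling side described above, the sketch below collects the PDFs in one folder and hands them to a merge function. MergeFolder and its parameter names are hypothetical; the merge function is passed in as a delegate with the same shape as MergePdfFiles so that the snippet stands on its own. Note that MergePdfFiles only does any work when it receives more than one file (the pdfCount > 1 check), so the sketch guards for that before calling it.

```vbnet
Imports System
Imports System.IO

Module MergeFolderSketch
    ' Merges every PDF directly inside folderPath using the supplied merge function.
    ' The merge parameter is expected to have the shape of MergePdfFiles above.
    Public Function MergeFolder(ByVal folderPath As String,
                                ByVal outputPath As String,
                                ByVal merge As Func(Of String(), String, Boolean)) As Boolean
        Dim pdfFiles As String() = Directory.GetFiles(folderPath, "*.pdf")

        ' Sort by path so the page order of the result is predictable
        Array.Sort(pdfFiles, StringComparer.OrdinalIgnoreCase)

        ' With fewer than two inputs there is nothing to merge
        If pdfFiles.Length < 2 Then
            Return False
        End If

        Return merge(pdfFiles, outputPath)
    End Function
End Module
```

A call might look like MergeFolder(folder, output, AddressOf MergePdfFiles), where folder and output are paths supplied by the monitoring code. Writing the output outside the monitored folder avoids merging it back in on the next run.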