Question:

iTextSharp: getting a reference to a graphic markup

齐坚成
2023-03-14

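I have spent several hours researching how to do this but have hit a brick wall. I have a PDF file in which one of the objects is a north arrow. It is a simple line drawing (I believe these are called graphic markups in Acrobat) that indicates which way is "up". I want to read that line drawing and determine its rotation. As a first step, I tried to see whether I could enumerate the contents of the PDF with the following code: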

Imports it = iTextSharp.text
Imports ip = iTextSharp.text.pdf

Dim pdfRdr As New ip.PdfReader("C:\city.pdf")
Dim page As ip.PdfDictionary = pdfRdr.GetPageN(1)
Dim objectReference As ip.PdfIndirectReference = CType(page.Get(ip.PdfName.CONTENTS), ip.PdfIndirectReference)
Dim stream As ip.PRStream = CType(ip.PdfReader.GetPdfObject(objectReference), ip.PRStream)
Dim streamBytes() As Byte = ip.PdfReader.GetStreamBytes(stream)
Dim tokenizer As New ip.PRTokeniser(New ip.RandomAccessFileOrArray(streamBytes))

'Loop through each PDF token
While tokenizer.NextToken
     Debug.Print("token of type={0} and value={1}", tokenizer.TokenType.ToString, tokenizer.StringValue)
End While

Output:

token of type=OTHER and value=q
token of type=NUMBER and value=0.86275
token of type=NUMBER and value=0
token of type=NUMBER and value=0
token of type=NUMBER and value=0.86275
token of type=NUMBER and value=54
token of type=NUMBER and value=30
token of type=OTHER and value=cm
token of type=NAME and value=Fm0
token of type=OTHER and value=Do
token of type=OTHER and value=Q
token of type=OTHER and value=q
token of type=NUMBER and value=1
token of type=NUMBER and value=0
token of type=NUMBER and value=0
token of type=NUMBER and value=1
token of type=NUMBER and value=54
token of type=NUMBER and value=18
token of type=OTHER and value=cm
token of type=NAME and value=Fm1
token of type=OTHER and value=Do
token of type=OTHER and value=Q

Am I on the right track, or is there a different way to get a reference to the graphic markup?

1 answer

楚丰羽
2023-03-14

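Contrary to the initial impression, the north arrow is not in a PDF annotation but is part of the regular page content. (@乔恩 wrote his answer under that initial impression.)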

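In the PDF the OP shared, the arrow is part of the direct page content. In the Adobe Acrobat screenshot the OP shared, on the other hand, the arrow appears to be inside a form XObject (which in turn is referenced from the direct page content).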
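The token dump in the question shows how such a form XObject is placed: each `/Fm0 Do` or `/Fm1 Do` is preceded by a `cm` operator whose six operands `a b c d e f` form the transformation matrix applied to the XObject. As a minimal sketch (in Python, independent of iTextSharp; the function name `decompose_cm` is made up for illustration), rotation and scale can be read off such a matrix, assuming it contains no skew:

```python
import math

def decompose_cm(a, b, c, d, e, f):
    """Split a PDF `cm` matrix [a b c d e f] into rotation, scale and offset.

    Assumes the matrix combines only rotation, scaling and translation
    (no skew); with skew the returned angle is only approximate.
    """
    angle = math.degrees(math.atan2(b, a))  # counter-clockwise, in degrees
    scale_x = math.hypot(a, b)
    scale_y = math.hypot(c, d)
    return angle, scale_x, scale_y, (e, f)

# Operands of the first `cm` in the question's token dump
angle, sx, sy, offset = decompose_cm(0.86275, 0, 0, 0.86275, 54, 30)
print(angle, sx, sy, offset)
```

For that matrix the rotation is 0 degrees with a uniform scale of 0.86275, so this particular `cm` does not rotate the XObject it places.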

The following approach should retrieve the vector graphics instructions in both cases.

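Imports System.Collections.Generic
Imports System.Text
Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports iTextSharp.text.pdf.parser

Public Class VectorParser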
    Implements IExtRenderListener

    Public Sub ModifyPath(renderInfo As PathConstructionRenderInfo) Implements IExtRenderListener.ModifyPath
        pathInfos.Add(renderInfo)
    End Sub

    Public Function RenderPath(renderInfo As PathPaintingRenderInfo) As parser.Path Implements IExtRenderListener.RenderPath
        Dim GraphicsState As GraphicsState = getGraphicsState(renderInfo)
        Dim ctm As Matrix = GraphicsState.GetCtm()

        If (Not (renderInfo.Operation And PathPaintingRenderInfo.FILL) = 0) Then
            Console.Write("FILL ({0}) ", ToString(GraphicsState.FillColor))
            If (Not (renderInfo.Operation And PathPaintingRenderInfo.STROKE) = 0) Then
                Console.Write("and ")
            End If
        End If

        If (Not (renderInfo.Operation And PathPaintingRenderInfo.STROKE) = 0) Then
            Console.Write("STROKE ({0}) ", ToString(GraphicsState.StrokeColor))
        End If

        Console.Write("the path ")

        For Each pathConstructionRenderInfo In pathInfos
            Select Case pathConstructionRenderInfo.Operation
                Case PathConstructionRenderInfo.MOVETO
                    Console.Write("move to {0} ", ToString(transform(ctm, pathConstructionRenderInfo.SegmentData)))
                Case PathConstructionRenderInfo.CLOSE
                    Console.Write("close {0} ", ToString(transform(ctm, pathConstructionRenderInfo.SegmentData)))
                Case PathConstructionRenderInfo.CURVE_123
                    Console.Write("curve123 {0} ", ToString(transform(ctm, pathConstructionRenderInfo.SegmentData)))
                Case PathConstructionRenderInfo.CURVE_13
                    Console.Write("curve13 {0} ", ToString(transform(ctm, pathConstructionRenderInfo.SegmentData)))
                Case PathConstructionRenderInfo.CURVE_23
                    Console.Write("curve23 {0} ", ToString(transform(ctm, pathConstructionRenderInfo.SegmentData)))
                Case PathConstructionRenderInfo.LINETO
                    Console.Write("line to {0} ", ToString(transform(ctm, pathConstructionRenderInfo.SegmentData)))
                Case PathConstructionRenderInfo.RECT
                    Console.Write("rectangle {0} ", ToString(transform(ctm, expandRectangleCoordinates(pathConstructionRenderInfo.SegmentData))))
            End Select
        Next

        Console.WriteLine()

        pathInfos.Clear()
        Return Nothing
    End Function

    Public Sub ClipPath(rule As Integer) Implements IExtRenderListener.ClipPath
    End Sub

    Public Sub BeginTextBlock() Implements IRenderListener.BeginTextBlock
    End Sub

    Public Sub RenderText(renderInfo As TextRenderInfo) Implements IRenderListener.RenderText
    End Sub

    Public Sub EndTextBlock() Implements IRenderListener.EndTextBlock
    End Sub

    Public Sub RenderImage(renderInfo As ImageRenderInfo) Implements IRenderListener.RenderImage
    End Sub

    Function expandRectangleCoordinates(rectangle As IList(Of Single)) As List(Of Single)
        If rectangle.Count < 4 Then
            Return New List(Of Single)
        End If

        Return New List(Of Single)() From
        {
            rectangle(0), rectangle(1),
            rectangle(0) + rectangle(2), rectangle(1),
            rectangle(0) + rectangle(2), rectangle(1) + rectangle(3),
            rectangle(0), rectangle(1) + rectangle(3)
        }
    End Function

    Function transform(ctm As Matrix, coordinates As IList(Of Single)) As List(Of Single)
        Dim result As List(Of Single) = New List(Of Single)
        If Not coordinates Is Nothing Then
            For i = 0 To coordinates.Count - 1 Step 2
                Dim vector As Vector = New Vector(coordinates(i), coordinates(i + 1), 1)
                vector = vector.Cross(ctm)
                result.Add(vector(Vector.I1))
                result.Add(vector(Vector.I2))
            Next
        End If
        Return result
    End Function

    Public Function ToString(coordinates As IList(Of Single)) As String
        Dim result As StringBuilder = New StringBuilder()
        result.Append("[ ")
        For i = 0 To coordinates.Count - 1
            result.Append(coordinates(i))
            result.Append(" ")
        Next
        result.Append("]")
        Return result.ToString()
    End Function

    Public Function ToString(baseColor As BaseColor) As String
        If (baseColor Is Nothing) Then
            Return "DEFAULT"
        End If
        Return String.Format("{0},{1},{2}", baseColor.R, baseColor.G, baseColor.B)
    End Function

    ' Reads the private "gs" field of PathPaintingRenderInfo via reflection
    ' to obtain the graphics state (CTM, colors) in effect for the path.
    Function getGraphicsState(renderInfo As PathPaintingRenderInfo) As GraphicsState
        Dim gsField As Reflection.FieldInfo = GetType(PathPaintingRenderInfo).GetField("gs", Reflection.BindingFlags.NonPublic Or Reflection.BindingFlags.Instance)
        Return CType(gsField.GetValue(renderInfo), GraphicsState)
    End Function

    Dim pathInfos As List(Of PathConstructionRenderInfo) = New List(Of PathConstructionRenderInfo)
End Class

Usage:

Using pdfReader As New PdfReader("test.pdf")
    Dim extRenderListener As IExtRenderListener = New VectorParser

    For page = 1 To pdfReader.NumberOfPages
        Console.Write(vbCrLf + "Page {0}" + vbCrLf + "====" + vbCrLf, page)
        Dim parser As PdfReaderContentParser = New PdfReaderContentParser(pdfReader)
        parser.ProcessContent(page, extRenderListener)
    Next
End Using
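
Output (the coordinates use decimal commas because numbers are formatted with the current culture of the machine the code ran on):

Page 1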
====
STROKE (0,0,255) the path move to [ 277,359 434,2797 ] line to [ 311,5242 434,2797 ] 
STROKE (0,0,255) the path move to [ 277,3591 434,2797 ] line to [ 315,0443 424,1336 ] 
STROKE (0,0,255) the path move to [ 304,2772 425,376 ] line to [ 304,4842 426,6183 ] 
STROKE (0,0,255) the path move to [ 304,6913 426,2042 ] line to [ 310,075 425,376 ] 
STROKE (0,0,255) the path move to [ 304,6913 426,8254 ] line to [ 307,5902 425,9972 ] 
FILL (0,0,255) the path move to [ 303,656 425,3759 ] line to [ 303,656 425,3759 ] line to [ 306,1407 425,1689 ] line to [ 306,1407 425,1689 ] 
STROKE (0,0,255) the path move to [ 303,656 425,376 ] line to [ 303,656 425,376 ] line to [ 306,1407 425,1689 ] line to [ 306,1407 425,1689 ] close [ ] 
FILL (0,0,255) the path move to [ 306,969 424,9618 ] line to [ 306,969 424,9618 ] line to [ 309,4538 424,7548 ] line to [ 309,4538 424,7548 ] 
STROKE (0,0,255) the path move to [ 306,969 424,9619 ] line to [ 306,969 424,9619 ] line to [ 309,4538 424,7548 ] line to [ 309,4538 424,7548 ] close [ ] 
FILL (0,0,255) the path move to [ 309,8679 424,9618 ] line to [ 309,8679 424,9618 ] line to [ 312,3527 424,5477 ] line to [ 312,3527 424,5477 ] 
STROKE (0,0,255) the path move to [ 309,868 424,9619 ] line to [ 309,868 424,9619 ] line to [ 312,3527 424,5477 ] line to [ 312,3527 424,5477 ] close [ ] 
STROKE (0,0,255) the path move to [ 313,1809 424,3407 ] line to [ 314,8374 424,1336 ] 
STROKE (0,0,255) the path move to [ 304,2772 425,7901 ] line to [ 309,8679 424,9619 ] line to [ 312,9738 424,7548 ] 
STROKE (0,0,255) the path move to [ 304,2772 425,9972 ] line to [ 309,8679 425,1689 ] line to [ 311,5244 424,9619 ] 
STROKE (0,0,255) the path move to [ 304,6914 426,8254 ] line to [ 315,0445 424,1336 ] 
STROKE (0,0,255) the path move to [ 311,7315 435,7292 ] line to [ 311,7315 432,8303 ] 
STROKE (0,0,255) the path move to [ 321,2564 434,2797 ] line to [ 315,4587 434,2797 ] 
STROKE (0,0,255) the path move to [ 315,4586 434,2797 ] line to [ 311,7315 434,2797 ] 
STROKE (0,0,255) the path move to [ 311,7315 434,6938 ] line to [ 317,7363 434,0727 ] line to [ 311,7315 433,6585 ] 
STROKE (0,0,255) the path move to [ 311,7315 434,4868 ] line to [ 314,8374 434,2797 ] line to [ 311,7315 434,2797 ] 
STROKE (0,0,255) the path move to [ 310,6963 436,1433 ] line to [ 317,3222 434,9009 ] line to [ 322,2917 434,2797 ] line to [ 317,3222 433,6585 ] line to [ 310,6963 432,6232 ] 
STROKE (0,0,255) the path move to [ 311,7315 435,5221 ] line to [ 317,3222 434,6938 ] line to [ 321,0493 434,2797 ] line to [ 317,3222 433,8656 ] line to [ 311,7315 433,0374 ] 
STROKE (0,0,255) the path move to [ 311,7315 435,108 ] line to [ 317,3222 434,4868 ] line to [ 319,3928 434,2797 ] line to [ 317,3222 434,2797 ] line to [ 311,7315 433,4515 ]

Doing this with iText7 is just as simple.

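By the way: unfortunately, the instructions that draw the arrow are not specially marked. If there are other vector graphics on the same page, you will have to filter the results returned by the parser by some specific criteria, for example color (in this case pure RGB blue) or approximate position (for example only paths within a given range of x and y coordinates).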
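Such a filter, followed by a rough estimate of the arrow's orientation, might look like the sketch below. It is written in Python and independent of iTextSharp. The record layout, the function names, and the bounding box are assumptions made for illustration; the two blue sample paths are copied from the output above with decimal commas converted to points. Taking the longest segment as the arrow's axis is a heuristic assumed here, not something the parser provides.

```python
import math

# Each record: (paint operation, RGB color, list of (x, y) points).
# The two blue records are copied from the output above; the red one is
# a made-up path that the filter should reject.
paths = [
    ("STROKE", (0, 0, 255), [(277.359, 434.2797), (311.5242, 434.2797)]),
    ("STROKE", (0, 0, 255), [(311.7315, 435.7292), (311.7315, 432.8303)]),
    ("STROKE", (255, 0, 0), [(10.0, 10.0), (500.0, 700.0)]),
]

def select_paths(paths, color, x_range, y_range):
    """Keep paths of the given color whose points all lie inside the box."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    return [
        p for p in paths
        if p[1] == color
        and all(x_min <= x <= x_max and y_min <= y <= y_max for x, y in p[2])
    ]

def longest_segment_angle(paths):
    """Angle in degrees (counter-clockwise from +x) of the longest segment."""
    best_length, best_angle = -1.0, None
    for _, _, points in paths:
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            length = math.hypot(x1 - x0, y1 - y0)
            if length > best_length:
                best_length = length
                best_angle = math.degrees(math.atan2(y1 - y0, x1 - x0))
    return best_angle

# Hypothetical bounding box chosen by eye around the arrow's coordinates
arrow = select_paths(paths, (0, 0, 255), (270, 325), (420, 440))
print(len(arrow), longest_segment_angle(arrow))
```

The angle only gives the axis of the longest line. Which end of that axis is the tip of the arrow still has to be decided separately, for example from where the shorter arrowhead segments cluster.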
