Question:

Plotting a tree with a variable number of child nodes?

宋志学
2023-03-14

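I want to generate a graph that visualizes the structure of an XML file.

I have created a list of nodes to represent the XML file.
Each node holds 3 strings: the XML tag, its attributes, and its content.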
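
For readers who want something concrete to follow, here is a minimal sketch of such a node list. It is not the asker's actual code: the `Node` type, the `build_node_list` helper, the way attributes are flattened into one string, and the short `SAMPLE` document are all assumptions made for illustration, and the standard-library `xml.etree.ElementTree` parser is used only to keep the sketch dependency-free.

```python
import xml.etree.ElementTree as ET
from typing import List, NamedTuple


class Node(NamedTuple):
    """Hypothetical node record: three strings per XML element."""
    tag: str         # element name, e.g. "keyword"
    attributes: str  # attributes flattened to 'key="value"' pairs
    content: str     # stripped text directly inside the element


def build_node_list(xml_text: str) -> List[Node]:
    """Return one Node per element, in document order."""
    root = ET.fromstring(xml_text)
    nodes = []
    for elem in root.iter():  # root.iter() visits the root and all descendants
        attrs = " ".join(f'{k}="{v}"' for k, v in elem.attrib.items())
        nodes.append(Node(elem.tag, attrs, (elem.text or "").strip()))
    return nodes


# A shortened stand-in for the GenBank entry shown in the question.
SAMPLE = """<entry db="genbank">
   <accession>AC116785</accession>
   <keywords>
      <keyword>HTG</keyword>
      <keyword>HTGS_PHASE2</keyword>
   </keywords>
</entry>"""

nodes = build_node_list(SAMPLE)
for node in nodes:
    print(node)
```

A flat list like this records what each element contains but not which element is its parent, which is why a tree plot also needs the parent-child edges.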

The XML file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<entry db="genbank">
   <data id="AC116785" length="132912" molecule="DNA" data_class="linear" division="HTG" date="08-JUL-2002" />
   <definition>
      <description>Mus musculus clone RP24-146B1, WORKING DRAFT SEQUENCE, 10 ordered pieces.</description>
   </definition>
   <accession>AC116785</accession>
   <version>
      <version_number>AC116785.3</version_number>
      <gi>21703640</gi>
   </version>
   <keywords>
      <keyword>HTG</keyword>
      <keyword>HTGS_PHASE2</keyword>
      <keyword>HTGS_DRAFT</keyword>
      <keyword>HTGS_FULLTOP</keyword>
   </keywords>
   <source>
      <abbreviation>house mouse.</abbreviation>
      <organism>
         <name>Mus musculus</name>
         <taxonomy>
            <class>Eukaryota</class>
            <class>Metazoa</class>
            <class>Chordata</class>
            <class>Craniata</class>
            <class>Vertebrata</class>
            <class>Euteleostomi</class>
            <class>Mammalia</class>
            <class>Eutheria</class>
            <class>Rodentia</class>
            <class>Sciurognathi</class>
            <class>Muridae</class>
            <class>Murinae</class>
            <class>Mus</class>
         </taxonomy>
      </organism>
   </source>
   <references>
      <reference number="1" from="1" to="132912">
         <authors>
            <author>Birren,B.</author>
         </authors>
         <title>Mus musculus, clone RP24-146B1</title>
         <journal>
            <location>Unpublished</location>
         </journal>
      </reference>
      <reference number="2" from="1" to="132912">
         <authors>
            <author>Birren,B.</author>
         </authors>
         <title>Direct Submission</title>
         <journal>
            <submission>02-APR-2002</submission>
            <department>Whitehead Institute/MIT Center for Genome Research, 320 Charles Street, Cambridge, MA 02141, USA</department>
         </journal>
      </reference>
      <reference number="3" from="1" to="132912">
         <authors>
            <author>Birren,B.</author>
         </authors>
         <title>Direct Submission</title>
         <journal>
            <submission>08-JUL-2002</submission>
            <department>Whitehead Institute/MIT Center for Genome Research, 320 Charles Street, Cambridge, MA 02141, USA</department>
         </journal>
      </reference>
   </references>
   <comment>
      <replaced>
         <date>Jul 8, 2002</date>
         <gi>21700645</gi>
      </replaced>
      <information title="All repeats were identified using RepeatMasker">Smit, A.F.A. ,  Green, P. (1996-1997)http://ftp.genome.washington.edu/RM/RepeatMasker.html</information>
      <information title="Center">Whitehead Institute/ MIT Center for Genome Research</information>
      <information title="Center code">WIBR</information>
      <information title="Web site">http://www-seq.wi.mit.edu</information>
      <information title="Contact">sequence_submissions@genome.wi.mit.edu</information>
      <information title="Center project name">L25104</information>
      <information title="Center clone name">146_B_1</information>
      <information title="Sequencing vector">Plasmid; n/a; 100% of reads</information>
      <information title="Chemistry">Dye-terminator Big Dye; 100% of reads</information>
      <information title="Assembly program">Phrap; version 0.960731</information>
      <information title="Consensus quality">130058 bases at least Q40</information>
      <information title="Consensus quality">131186 bases at least Q30</information>
      <information title="Consensus quality">131595 bases at least Q20</information>
      <information title="Insert size">142000; agarose-fp</information>
      <information title="Insert size">132012; sum-of-contigs</information>
      <information title="Quality coverage">6.9 in Q20 bases; agarose-fp</information>
      <information title="Quality coverage">7.5 in Q20 bases; sum-of-contigs</information>
      <information title="NOTE">This is a 'working draft' sequence. It currently consists of 10 contigs. Gaps between the contigsare represented as runs of N. The order of the piecesis believed to be correct as given, however the sizesof the gaps between them are based on estimates that haveprovided by the submittor.This sequence will be replacedby the finished sequence as soon as it is available andthe accession number will be preserved.</information>
      <information title="1     1178">contig of 1178 bp in length</information>
      <information title="1179 1278">gap of      100 bp</information>
      <information title="1279     2835">contig of 1557 bp in length</information>
      <information title="2836 2935">gap of      100 bp</information>
      <information title="2936     5385">contig of 2450 bp in length</information>
      <information title="5386 5485">gap of      100 bp</information>
      <information title="5486     8192">contig of 2707 bp in length</information>
      <information title="8193 8292">gap of      100 bp</information>
      <information title="8293    10488">contig of 2196 bp in length</information>
      <information title="10489 10588">gap of      100 bp</information>
      <information title="10589    12801">contig of 2213 bp in length</information>
      <information title="12802 12901">gap of      100 bp</information>
      <information title="12902    18716">contig of 5815 bp in length</information>
      <information title="18717 18816">gap of      100 bp</information>
      <information title="18817    34793">contig of 15977 bp in length</information>
      <information title="34794 34893">gap of      100 bp</information>
      <information title="34894    51004">contig of 16111 bp in length</information>
      <information title="51005 51104">gap of      100 bp</information>
      <information title="51105   132912">contig of 81808 bp in length.</information>
   </comment>
   <features>
      <sequence_feature type="source">
         <location>1..132912</location>
         <qualifer type="db_xref">taxon:10090</qualifer>
         <qualifer type="clone">RP24-146B1</qualifer>
         <qualifer type="clone_lib">RPCI-24 Male Mouse BAC</qualifer>
      </sequence_feature>
      <sequence_feature type="misc_feature">
         <location>1..1178</location>
      </sequence_feature>
      <sequence_feature type="misc_feature">
         <location>1279..2835</location>
      </sequence_feature>
      <sequence_feature type="misc_feature">
         <location>2936..5385</location>
      </sequence_feature>
      <sequence_feature type="misc_feature">
         <location>5486..8192</location>
      </sequence_feature>
      <sequence_feature type="misc_feature">
         <location>8293..10488</location>
      </sequence_feature>
      <sequence_feature type="misc_feature">
         <location>10589..12801</location>
      </sequence_feature>
      <sequence_feature type="misc_feature">
         <location>12902..18716</location>
      </sequence_feature>
      <sequence_feature type="misc_feature">
         <location>18817..34793</location>
      </sequence_feature>
      <sequence_feature type="misc_feature">
         <location>34894..51004</location>
      </sequence_feature>
      <sequence_feature type="misc_feature">
         <location>51105..132912</location>
      </sequence_feature>
   </features>
   <base_count num_a="43599" num_c="24512" num_g="23668" num_t="40195" num_others="938" />
   <sequence>mhkkiciigagaaglvsakhaikqgyqvdifeqtdqvggtwvysektgchsslykvmktn
lpkeamlfqdepfrdelpsfmshehvleylnefskdfpiqfsstvnevkrendlwkvlie
snsetitrfydvvfvcnghffeplnpyqnsyfkgklihshdyrraehytgknvvivgagp
sgiditlqiaqtanhvtliskkatypvlpesvqqmatnvksvdehgvvtdegdhvpadvi
ivctgyvfkfpfldssliqlkyndrmvsplyehlchvdypttlffiglplgtitfplfev
qvkyalsliagkgklpsddveirnfedarlqgllnpasfhviieeqweymkklakmggfe
ewnymetikklygyimterkknvigykmvnfelttdssdfklltirvdfnddvawiirfa
ypi</sequence>
</entry>

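I want to generate a tree plot with the Plotly and igraph libraries by enumerating the node list.

I am using this website as a reference.

My XML file contains elements with a variable number of child elements. However, the example given only shows how to build a tree with a fixed number of child nodes (it uses a fixed 2 children per node).

Looking at the igraph tutorial site, I found a similar example that also uses only 2 children per node.

How should I generate a tree with a variable number of child nodes, like the one in my XML file?

I have been stuck on this problem for a long time, so any help would be greatly appreciated!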

1 answer

章稳
2023-03-14

You can build the graph like this:

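from lxml import etree
from igraph import Graph

# Parse the file and take the root element (<entry> in the question's XML).
root = etree.parse("entry.xml").getroot()

# Give every element a distinct integer ID, in document order.
element_ids = {elem: i for i, elem in enumerate(root.iter())}

# One (parent, child) edge per direct child, however many children there are.
edges = []
for parent, parent_id in element_ids.items():
    for child in parent:  # iterating an element yields its children; getchildren() is deprecated
        edges.append((parent_id, element_ids[child]))

G = Graph(edges)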

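The element_ids dictionary has every element of the XML as a key and a distinct ID as its value, like {tag1: 0, tag2: 1, tag3: 2}. The keys are the element objects themselves rather than tag names, so repeated tags such as <keyword> still get separate IDs, and you can look up the ID of any element later.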
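
To see that this edge list does not assume any fixed number of children, here is a small self-contained sketch. It makes two assumptions for illustration: it uses the standard-library `xml.etree.ElementTree` instead of lxml so it runs without extra packages, and it parses a shortened `SAMPLE` string rather than the full entry.xml. The `child_counts` dictionary is a hypothetical helper added only to display the fan-out.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the question's file: parents with 3, 2 and 3 children.
SAMPLE = """<entry>
   <accession>AC116785</accession>
   <version>
      <version_number>AC116785.3</version_number>
      <gi>21703640</gi>
   </version>
   <keywords>
      <keyword>HTG</keyword>
      <keyword>HTGS_PHASE2</keyword>
      <keyword>HTGS_DRAFT</keyword>
   </keywords>
</entry>"""

root = ET.fromstring(SAMPLE)

# Element objects are the keys, so the three <keyword> elements get three IDs.
element_ids = {elem: i for i, elem in enumerate(root.iter())}

# Iterating over an element yields its direct children, however many there are.
edges = [(element_ids[parent], element_ids[child])
         for parent in element_ids
         for child in parent]

# Count children per parent ID to show that the fan-out varies.
child_counts = {}
for parent_id, _ in edges:
    child_counts[parent_id] = child_counts.get(parent_id, 0) + 1

print(edges)
print(child_counts)
```

Every element except the root appears exactly once as a child, so a document with n elements always yields n - 1 edges regardless of how the children are distributed.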

I don't know how to put labels into Plotly, but for plotting with igraph it is useful to add the tag names as labels:

names = [e.tag for e in element_ids]
G.vs['label'] = names

I have not tried it, but the visualization of the graph should then work the same way as in the article.
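
For the Plotly side, one possible route is sketched below. It is not the method from the article the answer refers to: it swaps igraph's tree layout for a small hand-written one (leaves placed left to right, each parent centred over its children) so that the coordinate part runs without extra packages. The function names `tree_layout`, `plotly_coordinates` and `make_figure`, and the example `edges` and `labels`, are assumptions for illustration; only `make_figure` needs Plotly installed.

```python
def tree_layout(n_nodes, edges, root=0):
    """Assign (x, y) to every node: leaves take consecutive x slots,
    each parent is centred over its children, and y is minus the depth."""
    children = {i: [] for i in range(n_nodes)}
    for parent, child in edges:
        children[parent].append(child)

    pos = {}
    next_leaf_x = 0

    def place(node, depth):
        nonlocal next_leaf_x
        if not children[node]:
            x = float(next_leaf_x)
            next_leaf_x += 1
        else:
            xs = [place(child, depth + 1) for child in children[node]]
            x = (xs[0] + xs[-1]) / 2
        pos[node] = (x, -depth)
        return x

    place(root, 0)
    return pos


def plotly_coordinates(pos, edges):
    """Flatten positions into x/y lists for a line trace and a marker trace.
    None separates consecutive edges so they are drawn as disjoint segments."""
    xe, ye = [], []
    for a, b in edges:
        xe += [pos[a][0], pos[b][0], None]
        ye += [pos[a][1], pos[b][1], None]
    xn = [pos[i][0] for i in sorted(pos)]
    yn = [pos[i][1] for i in sorted(pos)]
    return xn, yn, xe, ye


def make_figure(pos, edges, labels):
    """Build the figure; plotly is imported here so the layout runs without it."""
    import plotly.graph_objects as go
    xn, yn, xe, ye = plotly_coordinates(pos, edges)
    lines = go.Scatter(x=xe, y=ye, mode="lines", hoverinfo="none")
    dots = go.Scatter(x=xn, y=yn, mode="markers+text", text=labels,
                      textposition="top center")
    return go.Figure(data=[lines, dots])


# Example input: an edge list with 3, 2 and 3 children per parent.
edges = [(0, 1), (0, 2), (0, 5), (2, 3), (2, 4), (5, 6), (5, 7), (5, 8)]
labels = ["entry", "accession", "version", "version_number", "gi",
          "keywords", "keyword", "keyword", "keyword"]
pos = tree_layout(len(labels), edges)
print(pos)
```

With Plotly available, `make_figure(pos, edges, labels).show()` would display the tree with the tag names drawn next to the nodes. The layout is deliberately naive and can get crowded on wide trees such as the full GenBank entry.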
