Question:

Parsing dl tags with Jsoup

单淳

2023-03-14

I am trying to parse a dl tag.

The HTML looks like this:

<dl>
<dt>
    <span class="paramLabel">Parameters:</span>
</dt>
<dd>
    <code>y</code> - the ordinate coordinate
</dd>
<dd>
    <code>x</code> - the abscissa coordinate
</dd>
<dt>
    <span class="returnLabel">Returns:</span>
</dt>
<dd>
    the <i>theta</i> component of the point (<i>r</i>,&nbsp;<i>theta</i>) in polar coordinates that corresponds to the point (<i>x</i>,&nbsp;<i>y</i>) in Cartesian coordinates.
</dd>
</dl>

I tried the following:

String title = "";
List<String> descriptions = new ArrayList<>();
for (int i = 0; i < children.size(); i++) {
    Element child = children.get(i);

    if(child.tagName().equalsIgnoreCase("dt")) {
        if(descriptions.size() != 0) {
            block.fields.put(title, descriptions);
            descriptions.clear();

        }

        title = child.text();
    }
    else if(child.tagName().equalsIgnoreCase("dd")) {
        descriptions.add(child.text());

        if(i == children.size() - 1) {
            block.fields.put(title, descriptions);
        }
    }
}

I expected to get this:

 * Parameters -> y - the ordinate coordinate
 *               x - the abscissa coordinate
 * Returns    -> the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates.

But I get this instead:

 * Parameters -> the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates.
 * Returns    -> the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates.

1 answer

蔡鸿骞

2023-03-14

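You need to put a copy of the descriptions list into the map. At the moment you keep working on a single list instance, so every key ends up pointing to the same list, and descriptions.clear() also empties the values that were already stored. So instead of: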

block.fields.put(title, descriptions);

create a new list, for example:

block.fields.put(title, new ArrayList<>(descriptions));
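To make the difference visible, here is a minimal sketch that reproduces the loop from the question without Jsoup. The class name DlGrouping, the methods group and sample, the copy flag, and the use of {tag, text} pairs in place of Jsoup Element objects are all assumptions made for illustration; they are not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DlGrouping {

    // Hypothetical stand-in for the <dl> children: each entry is {tagName, text}.
    static List<String[]> sample() {
        return Arrays.asList(
            new String[] {"dt", "Parameters:"},
            new String[] {"dd", "y - the ordinate coordinate"},
            new String[] {"dd", "x - the abscissa coordinate"},
            new String[] {"dt", "Returns:"},
            new String[] {"dd", "the theta component of the point"});
    }

    // Groups dd texts under the preceding dt text.
    // copy == false stores the shared list (the behaviour in the question),
    // copy == true stores a snapshot (the fix from the answer).
    static Map<String, List<String>> group(List<String[]> children, boolean copy) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        String title = "";
        List<String> descriptions = new ArrayList<>();
        for (String[] child : children) {
            String tag = child[0];
            String text = child[1];
            if (tag.equalsIgnoreCase("dt")) {
                if (!descriptions.isEmpty()) {
                    store(fields, title, descriptions, copy);
                    descriptions.clear(); // also empties a stored value if it was not copied
                }
                title = text;
            } else if (tag.equalsIgnoreCase("dd")) {
                descriptions.add(text);
            }
        }
        // Flush the last group after the loop.
        if (!descriptions.isEmpty()) {
            store(fields, title, descriptions, copy);
        }
        return fields;
    }

    private static void store(Map<String, List<String>> fields, String title,
                              List<String> descriptions, boolean copy) {
        List<String> value = descriptions;
        if (copy) {
            value = new ArrayList<>(descriptions);
        }
        fields.put(title, value);
    }

    public static void main(String[] args) {
        System.out.println("shared list: " + group(sample(), false));
        System.out.println("copied list: " + group(sample(), true));
    }
}
```

With copy set to false, both keys refer to the same list object, so "Parameters:" shows only the "Returns:" text, which matches the output in the question. With copy set to true, "Parameters:" keeps its two entries and "Returns:" keeps one.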
