Question:

Get the text between two HTML tags

公良高刚

2023-03-14

I am trying to get the data between the given HTML (span) tags (in this case, 31).

Here is the original code (from Inspect Element in Chrome):

<span id="point_total" class="tooltip" oldtitle="Note: If the number is black, your points are actually a little bit negative.  Don't worry, this just means you need to start subbing again." aria-describedby="ui-tooltip-0">31</span>

I have a rich text box containing the page source. Here is the same markup as it appears on line 51 of the rich text box:

<DIV id=point_display>You have<BR><SPAN id=point_total class=tooltip jQuery16207621750175125325="23" oldtitle="Note: If the number is black, your points are actually a little bit negative.  Don't worry, this just means you need to start subbing again.">17</SPAN><BR>Points </DIV><IMG style="FLOAT: right" title="Gain subscribers" border=0 alt="When people subscribe to you, you lose a point" src="http://static.subxcess.com/images/page/decoration/remove-1-point.png"> </DIV>

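How would I go about doing this? I have tried several approaches, but none of them seem to work for me.

I am trying to retrieve the point value from this page: http://www.subxcess.com/sub4sub.php. The number changes depending on who subs you.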

3 answers

朱华皓

2023-03-14

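There are several possibilities.

Use a regular expression
Parse the HTML as XML and get the value via XPath
Iterate over all the elements. When you reach a span tag, skip all characters until you find the closing '>' of the tag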
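
The second option in the list above could be sketched roughly as follows. This is only an illustration under assumptions: the class name XPathSketch, the method GetPointTotal, and the simplified sample markup are made up for the example, and the approach only works when the markup is well-formed XML. The version from the rich text box (unquoted attributes, unclosed <BR>) would make XDocument.Parse throw an XmlException, so it would need cleaning up first or a real HTML parser.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

static class XPathSketch
{
    // Hypothetical helper: parses well-formed markup as XML and returns
    // the text of the span with id "point_total", or null if there is none.
    public static string GetPointTotal(string wellFormedMarkup)
    {
        XDocument doc = XDocument.Parse(wellFormedMarkup);
        XElement span = doc.XPathSelectElement("//span[@id='point_total']");
        return span == null ? null : span.Value;
    }

    static void Main()
    {
        // Simplified, well-formed sample (an assumption, not the real page source).
        string markup =
            "<div id=\"point_display\">You have<br/>" +
            "<span id=\"point_total\" class=\"tooltip\">31</span>" +
            "<br/>Points</div>";

        Console.WriteLine(GetPointTotal(markup) ?? "(span not found)");
    }
}
```

Returning null instead of throwing when the span is missing keeps the caller in control if the page layout changes.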

Also take a look at System.Windows.Forms.HtmlDocument.

那博瀚

2023-03-14

You'll want to use HtmlAgilityPack for this. It's quite simple:

using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
doc.Load("filepath"); // path to the saved page source

// "//span" selects the first span in the document. To target a specific span, filter on its
// attributes, e.g. "//span[@id='point_total']" or "//span[@id='point_total' and @class='tooltip']".
HtmlNode node = doc.DocumentNode.SelectSingleNode("//span");

string value = node.InnerText; // the text inside the span, i.e. <span>***value***</span>

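Although regex is a viable option, you generally want to avoid it when parsing HTML, if possible (see here).

As far as sustainability goes, make sure you understand the page source: refresh the page a few times and check after each refresh that your target span is still nested within the same parent elements, confirm that the page keeps the same general format, and so on. Then use the principles above to navigate to the span.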

孔硕

2023-03-14

You can be very specific about it:

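using System.Text.RegularExpressions;

// Matches the exact tag as shown in Chrome's inspector; the single capture group is the span's content.
var regex = new Regex(@"<span id=""point_total"" class=""tooltip"" oldtitle="".*?"" aria-describedby=""ui-tooltip-0"">(.*?)</span>");

var match = regex.Match(@"<span id=""point_total"" class=""tooltip"" oldtitle=""Note: If the number is black, your points are actually a little bit negative.  Don't worry, this just means you need to start subbing again."" aria-describedby=""ui-tooltip-0"">31</span>");

// Groups[1] holds the captured text; check match.Success before relying on it.
var result = match.Groups[1].Value;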
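
Note that a pattern this specific matches the Chrome version of the tag but not the one from the rich text box, which has upper-case tag names, unquoted attributes, and different attributes. A more tolerant variant might look like the sketch below. The names RegexSketch and ExtractPoints are made up for the example, and the pattern assumes that no attribute value inside the span tag contains a '>' character.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexSketch
{
    // Matches <span ... id=point_total ...>TEXT</span>, ignoring case,
    // with or without quotes around the id value.
    static readonly Regex PointTotal = new Regex(
        @"<span\b[^>]*\bid\s*=\s*[""']?point_total\b[""']?[^>]*>(.*?)</span>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical helper: returns the span's text, or null if nothing matches.
    public static string ExtractPoints(string html)
    {
        Match m = PointTotal.Match(html);
        return m.Success ? m.Groups[1].Value.Trim() : null;
    }

    static void Main()
    {
        string fromChrome = "<span id=\"point_total\" class=\"tooltip\" aria-describedby=\"ui-tooltip-0\">31</span>";
        string fromTextBox = "<DIV id=point_display>You have<BR><SPAN id=point_total class=tooltip>17</SPAN><BR>Points </DIV>";

        Console.WriteLine(ExtractPoints(fromChrome));
        Console.WriteLine(ExtractPoints(fromTextBox));
    }
}
```

This is still a regex over HTML, so it shares the fragility mentioned in the other answer. It just depends on fewer details of the tag.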
