Parsing XML with PHP

鲁彬炳
2023-03-14

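Extracting data from XML is a common task, but to work with that data directly you need to understand how PHP parses XML. Parsing XML in PHP involves several functions that work together to pull data out of an XML document. I will go through each of them and tie them together at the end.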

xml_parser_create()

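This function creates the parser object used throughout the rest of the process. The object stores data and configuration options and is passed to every other function involved.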

$xml_parser = xml_parser_create();

xml_set_element_handler()

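Next, we need to set the functions that the parser will call as it works through the document. The xml_set_element_handler() function takes the following parameters:

  • XML parser: the parser created with xml_parser_create().

  • Start element: a callback that is called whenever the parser finds a start element.

  • End element: a callback that is called whenever the parser finds an end element.

The last two parameters must be functions with a specific signature. That means they must accept the right number of parameters, but you can name them whatever you like. Here is an example call to xml_set_element_handler():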

xml_set_element_handler($xml_parser, "startElement", "endElement");

The startElement() and endElement() functions are then called automatically by the parser once parsing is under way.
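
As a minimal sketch of how the two callbacks fire, the following registers closures instead of named functions and parses a short in-memory string. The sample XML and the $events array are illustrative assumptions, not part of the original walkthrough; named functions passed as strings behave the same way.

```php
<?php
// Collect events in an array instead of echoing, so the order is easy to inspect.
$events = [];

$xml_parser = xml_parser_create();

xml_set_element_handler(
    $xml_parser,
    // Start element callback: parser, element name, attribute array.
    function ($parser, $name, $attribs) use (&$events) {
        $events[] = "Start: " . $name;
    },
    // End element callback: parser, element name.
    function ($parser, $name) use (&$events) {
        $events[] = "End: " . $name;
    }
);

// Hypothetical sample document; <item/> is an empty (self-closing) element.
xml_parse($xml_parser, '<list><item/></list>', true);

print_r($events);
```

Because case folding is on by default, the names arrive in upper case (LIST, ITEM), and the empty element triggers both the start and the end callback.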

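startElement() function

Before the parser runs, you need to define a function that reads the start element data. It is conventional to declare it above the call to xml_set_element_handler(). The function must have the following parameters:

  • Parser: the XML parser object created in the call to xml_parser_create().

  • Name: the name of the start element.

  • Attribs: an associative array of the attributes the start element contains.

So your function might look like this: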

function startElement($xmlParser, $name, $attribs) {
    echo "Start: " . $name ."<br />";
}

All this does is print the name of the element, but you can do more with it. For example, if one of your elements is called <title>, you can use an if or switch statement to store the value in a variable for later use, like this:

function startElement($xmlParser, $name, $attribs) {
    global $variable;
    switch ($name) {
        case 'title':
            $variable = $name;
            break;
    }
}

Note that case folding is on by default, so $name arrives as 'TITLE' unless you turn case folding off with xml_parser_set_option(), described further down.

The function has to exist by the time the parser calls it. Declaring it above the call to xml_set_element_handler() keeps the script easy to follow, although PHP makes unconditional top-level function declarations available throughout the file, so the order is a convention rather than a hard requirement.

endElement() function

This function is called when the parser encounters an XML closing tag. In the reverse of the earlier operation, you may need to clear the variable you stored in the start element function. An empty element written as a self-closing tag still triggers both the start and the end handler. The function must have the following parameters:

  • xml_parser: the parser created in the call to xml_parser_create().

  • name: the name of the element.

The following code just prints the name of the end element. You can use this function to overwrite anything that was set in startElement(). For example, if startElement() increments a counter to keep track of how deep the parser is in the XML document, this function can decrement it. That matters when more than one element has the same name but appears in a different context.

function endElement($parser, $name) {
    echo "End: " . $name . "<br />";
}

xml_set_character_data_handler()

The next function to call is xml_set_character_data_handler(). It takes two parameters:

  • xml_parser: the XML parser that was created in the call to xml_parser_create().

  • characterData: a callback that is called when character data is found.

This function works in the same way as xml_set_element_handler(): it simply registers the function that will be called when character data is encountered. It is called like this:

xml_set_character_data_handler($xml_parser, "characterData");

characterData() function

The characterData() function, like the element handlers, is usually declared above the call to xml_set_character_data_handler(). It must have the following parameters:

  • xml_parser: the XML parser created in the call to xml_parser_create().

  • data: the text held within the XML element. If CDATA sections have been used, the parser returns everything between the CDATA markers, so there is no need to strip them out yourself.

Whenever the parser finds character data, this function is called. The following function just prints the data:

function characterData($parser, $data) {
    echo "Data: " . $data . "<br />";
}

One thing you must look out for is that under certain conditions the parser stops, hands over what it has so far, and then calls the function again for the rest. This repeats until all of the data has been passed. I have listed (I think) all of the conditions below:

  • The parser runs into an entity reference, such as &amp; (&) or &#039; (').

  • The parser finishes parsing an entity.

  • The parser runs into the new-line character (\n).

  • The parser runs into a series of tab characters (\t).

  • The content of the $data parameter is more than 1024 bytes.

The best way to explain this is with an example. Let's say that you have the following string as part of the data:

some text&amp; some more text&#039; last bit of text

If you used the previous example function, which just prints the information, the parser would print the following:

Data: some text
Data: &
Data: some more text
Data: '
Data: last bit of text

So make sure your handler copes with being called several times for a single piece of text. One option is to have characterData() append the data to a string. The string is initialised when startElement() is called and printed when endElement() is called.

xml_parser_set_option()

This function is optional and can be used if you want the parser to behave in a certain way. For example, to turn off case folding, use the following code:

xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, false);

Case folding converts element names to their uppercase equivalent. XML is case-sensitive, so <Title> and <title> are different elements, yet for some reason the parser has case folding on by default. Turn it off if you want element names reported exactly as they appear in the document. Here is a list of the available options for this function:

  • XML_OPTION_CASE_FOLDING: (integer) Controls whether case folding is enabled for this XML parser. Enabled by default.

  • XML_OPTION_SKIP_TAGSTART: (integer) Specifies how many characters should be skipped at the beginning of a tag name.

  • XML_OPTION_SKIP_WHITE: (integer) Whether to skip values consisting of whitespace characters.

  • XML_OPTION_TARGET_ENCODING: (string) Sets which target encoding to use in this XML parser. By default, it is set to the same as the source encoding used by xml_parser_create(). Supported target encodings are ISO-8859-1, US-ASCII and UTF-8.

xml_parse()

This function runs the parser over some input. It takes the following parameters:

  • xml_parser: the XML parser object created by xml_parser_create().

  • data: a chunk of data to parse. This can be read from a file or a stream.

  • end: (optional) If this is set to true, this is the last chunk of data from the source and so the last time the function will be run.

As you can see, xml_parse() can be called over and over again until all of the data has been read from the file.

if (!($fp = fopen("an_xml_file.xml", "r"))) {
    die("could not open XML input");
}

while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d", xml_error_string(xml_get_error_code($xml_parser)), xml_get_current_line_number($xml_parser)));
    }
}

xml_parser_free()

As the name suggests, this function is called at the end of the XML parsing run. It frees the memory and discards the XML parser created at the start. On PHP 8 and later the parser is an object, which is also released automatically once nothing refers to it.

Putting them all together

As an example, I have put the code together into something that turns XML into formatted HTML, albeit a little ugly. It is meant as a starting point for your own XML parsing script.

// Start element handler
function startElement($xmlParser, $name, $attribs) {
    echo "Start: " . $name . "<br />";
}

// End element handler
function endElement($parser, $name) {
    echo "End: " . $name . "<br />";
}

// Character data handler
function characterData($parser, $data) {
    echo "Data: " . $data . "<br />";
}

$xml_parser = xml_parser_create();
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");

if (!($fp = fopen("an_xml_file.xml", "r"))) {
    die("could not open XML input");
}

while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d", xml_error_string(xml_get_error_code($xml_parser)), xml_get_current_line_number($xml_parser)));
    }
}

// Clean up once the whole file has been parsed
fclose($fp);
xml_parser_free($xml_parser);
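
The buffering idea mentioned for character data (collect the pieces in a string, start it in the start handler, use it in the end handler) can be sketched roughly as follows. The global variables $buffer, $depth and $found, the sample document, and the "depth:name=text" result format are all assumptions made for illustration; a class with methods would work just as well.

```php
<?php
// Parser state shared between the handlers (illustrative names).
$buffer = '';   // character data collected for the current element
$depth  = 0;    // how deep the parser currently is in the document
$found  = [];   // completed "depth:name=text" entries

function startElement($parser, $name, $attribs) {
    global $buffer, $depth;
    $depth++;
    $buffer = '';        // start a fresh buffer for this element
}

function characterData($parser, $data) {
    global $buffer;
    $buffer .= $data;    // may be called several times for one text node
}

function endElement($parser, $name) {
    global $buffer, $depth, $found;
    if (trim($buffer) !== '') {
        $found[] = $depth . ':' . $name . '=' . trim($buffer);
    }
    $buffer = '';
    $depth--;
}

$xml_parser = xml_parser_create();
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");

// Hypothetical sample input containing an entity and a character reference.
$xml = "<book><title>Tom &amp; Jerry&#039;s day</title></book>";
if (!xml_parse($xml_parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($xml_parser)));
}

print_r($found);
```

However many times characterData() is called for the title text, the end handler sees it as one string, so $found ends up with the single entry "2:title=Tom & Jerry's day". Resetting the buffer on every start tag is a simplification: text that surrounds child elements in mixed content would be lost, so a stack of buffers is a better fit for such documents.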