Quick Start
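
    // Create a DOM object from a URL or file
    $html = file_get_html('http://www.google.cn/');

    // Find all images and print their src attributes
    foreach ($html->find('img') as $element)
        echo $element->src . '<br>';

    // Find all links and print their href attributes
    foreach ($html->find('a') as $element)
        echo $element->href . '<br>';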
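
    // Create a DOM object from a string
    $html = str_get_html('<div id="hello">Hello</div><div id="world">World</div>');

    // Set the class of the second div and the inner HTML of div#hello
    $html->find('div', 1)->class = 'bar';
    $html->find('div[id=hello]', 0)->innertext = 'foo';

    // Output: <div id="hello">foo</div><div id="world" class="bar">World</div>
    echo $html;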

    // Dump the contents of a page without its HTML tags
    echo file_get_html('http://www.google.com/')->plaintext;

    // Create a DOM object from a URL
    $html = file_get_html('http://slashdot.org/');

    // Collect the title, intro, and details of each article block
    $articles = array();
    foreach ($html->find('div.article') as $article) {
        $item['title']   = $article->find('div.title', 0)->plaintext;
        $item['intro']   = $article->find('div.intro', 0)->plaintext;
        $item['details'] = $article->find('div.details', 0)->plaintext;
        $articles[] = $item;
    }

    print_r($articles);

How to create an HTML DOM object?
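
    // Create a DOM object from a string
    $html = str_get_html('<html><body>Hello!</body></html>');

    // Create a DOM object from a URL
    $html = file_get_html('http://www.google.com/');

    // Create a DOM object from an HTML file
    $html = file_get_html('test.htm');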
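
    // Object-oriented style: create an empty DOM object first
    $html = new simple_html_dom();

    // Load HTML from a string
    $html->load('<html><body>Hello!</body></html>');

    // Load HTML from a URL
    $html->load_file('http://www.google.cn/');

    // Load HTML from an HTML file
    $html->load_file('test.htm');
    echo $html;
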
How to find HTML elements?
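
    // Find all anchors; returns an array of element objects
    $ret = $html->find('a');
    // Find the first (index 0) anchor; returns an element object, or null if not found
    $ret = $html->find('a', 0);
    // Find the last anchor; returns an element object, or null if not found
    $ret = $html->find('a', -1);
    // Find all <div> elements that have an id attribute
    $ret = $html->find('div[id]');
    // Find all <div> elements whose id attribute equals foo
    $ret = $html->find('div[id=foo]');
    // Find all elements with id=foo
    $ret = $html->find('#foo');
    // Find all elements with class=foo
    $ret = $html->find('.foo');
    // Find all elements that have an id attribute
    $ret = $html->find('*[id]');
    // Find all anchors and images
    $ret = $html->find('a, img');
    // Find all anchors and images that have a title attribute
    $ret = $html->find('a[title], img[title]');
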
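Attribute filters support the following operators:

Filter | Description |
---|---|
[attribute] | Matches elements that have the specified attribute. |
[!attribute] | Matches elements that do not have the specified attribute. |
[attribute=value] | Matches elements whose specified attribute equals the given value. |
[attribute!=value] | Matches elements whose specified attribute does not equal the given value. |
[attribute^=value] | Matches elements whose specified attribute starts with the given value. |
[attribute$=value] | Matches elements whose specified attribute ends with the given value. |
[attribute*=value] | Matches elements whose specified attribute contains the given value. |
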
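
To make the operators above concrete, here is a minimal sketch that runs several filters against a small hand-written fragment. It assumes the library file is available as simple_html_dom.php next to the script; the sample markup and the variable names are made up for illustration.

```php
<?php
// Assumption: the parser is installed as simple_html_dom.php next to this script.
require_once 'simple_html_dom.php';

// Hypothetical sample markup: two links with an href, one named anchor without.
$html = str_get_html(
    '<a href="https://example.com/a.pdf" title="PDF">A</a>' .
    '<a href="http://example.org/b.html">B</a>' .
    '<a name="top">C</a>'
);

$with_href    = $html->find('a[href]');           // has an href attribute
$without_href = $html->find('a[!href]');          // has no href attribute
$secure       = $html->find('a[href^=https]');    // href starts with "https"
$pdf          = $html->find('a[href$=pdf]');      // href ends with "pdf"
$example      = $html->find('a[href*=example]');  // href contains "example"

// Print the href of every link that ends with "pdf".
foreach ($pdf as $a)
    echo $a->href . "\n";
```

Each call returns an array of element objects, so count() is a quick way to check how many elements a filter matched.
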
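
    // Descendant selectors: find all <li> inside <ul>
    $es = $html->find('ul li');
    // Find <div> elements nested inside two other <div> elements
    $es = $html->find('div div div');
    // Find all <td> inside <table class="hello">
    $es = $html->find('table.hello td');
    // Find all <td> with align=center inside a <table>
    $es = $html->find('table td[align=center]');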
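
    // Nested selectors: find all <li> inside each <ul>
    foreach ($html->find('ul') as $ul)
    {
        foreach ($ul->find('li') as $li)
        {
            // do something with $li...
        }
    }

    // Find the first <li> in the first <ul>
    $e = $html->find('ul', 0)->find('li', 0);
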
How to access HTML element attributes?
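
    // Get an attribute
    $value = $e->href;
    // Set an attribute
    $e->href = 'my link';
    // Remove an attribute by setting it to null
    $e->href = null;
    // Check whether an attribute exists
    if (isset($e->href))
        echo 'href exist!';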
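
    // Example
    $html = str_get_html("<div>foo <b>bar</b></div>");
    $e = $html->find("div", 0);

    echo $e->tag;       // Returns: "div"
    echo $e->outertext; // Returns: "<div>foo <b>bar</b></div>"
    echo $e->innertext; // Returns: "foo <b>bar</b>"
    echo $e->plaintext; // Returns: "foo bar"
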
Attribute name | Usage |
---|---|
$e->tag | Read or write the tag name of element. |
$e->outertext | Read or write the outer HTML text of element. |
$e->innertext | Read or write the inner HTML text of element. |
$e->plaintext | Read or write the plain text of element. |
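
    // Extract the contents of the document without HTML tags
    echo $html->plaintext;

    // Wrap an element
    $e->outertext = '<div class="wrap">' . $e->outertext . '</div>';
    // Remove an element by setting its outertext to an empty string
    $e->outertext = '';
    // Append an element after this one
    $e->outertext = $e->outertext . '<div>foo</div>';
    // Insert an element before this one
    $e->outertext = '<div>foo</div>' . $e->outertext;
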
How to traverse the DOM tree?
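
    // Walk down from #div1 by child index and print the id of the element reached
    echo $html->find("#div1", 0)->children(1)->children(1)->children(2)->id;
    // The same traversal written with the camel-case method names
    echo $html->getElementById("div1")->childNodes(1)->childNodes(1)->childNodes(2)->getAttribute('id');
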
You can also call these methods by their camel-case names, such as getElementById, childNodes, and getAttribute.

Method | Description |
---|---|
mixed $e->children ( [int $index] ) | Returns the Nth child object if index is set, otherwise returns an array of children. |
element $e->parent () | Returns the parent of element. |
element $e->first_child () | Returns the first child of element, or null if not found. |
element $e->last_child () | Returns the last child of element, or null if not found. |
element $e->next_sibling () | Returns the next sibling of element, or null if not found. |
element $e->prev_sibling () | Returns the previous sibling of element, or null if not found. |

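
As a rough illustration of the methods in the table, the sketch below walks a three-item list. It assumes the library file is available as simple_html_dom.php next to the script; the markup, ids, and variable names are hypothetical.

```php
<?php
// Assumption: the parser is installed as simple_html_dom.php next to this script.
require_once 'simple_html_dom.php';

// Hypothetical sample markup: a list with three items.
$html = str_get_html(
    '<ul id="menu"><li id="a">A</li><li id="b">B</li><li id="c">C</li></ul>'
);

$ul     = $html->find('#menu', 0);
$first  = $ul->first_child();        // <li id="a">
$second = $first->next_sibling();    // <li id="b">
$last   = $ul->last_child();         // <li id="c">

echo count($ul->children()) . "\n";       // number of child elements
echo $ul->children(1)->id . "\n";         // same element as $second
echo $second->parent()->id . "\n";        // back up to the <ul>
echo $second->prev_sibling()->id . "\n";  // back to the first <li>

// Past the end of the list there is no sibling, so null comes back.
var_dump($last->next_sibling() === null);
```

Because the sibling and child methods return null when nothing is found, it is worth checking the result before chaining another call onto it.
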
How to dump the contents of a DOM object?
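
    // Dump the internal DOM tree back into a string
    $str = $html->save();
    // Dump the internal DOM tree into a file
    $html->save('result.htm');

    // Converting the object to a string also dumps the DOM tree
    $str = (string) $html;
    // Print it
    echo $html;
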
How to customize the parser's behavior?
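
    // Define a callback that takes one parameter, the element object
    function my_callback($element) {
        // Hide all <b> tags
        if ($element->tag == 'b')
            $element->outertext = '';
    }

    // Register the callback by its function name
    $html->set_callback('my_callback');

    // The callback is invoked while the DOM is dumped
    echo $html;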