// Create an index
// (assumes `use Elasticsearch\ClientBuilder;` at the top of the file)
public function create_index(){
    $params = [
        'index' => 'my_index',
        'body' => [
            'settings' => [
                'number_of_shards' => 2,
                'number_of_replicas' => 0,
            ]
        ]
    ];
    $client = ClientBuilder::create()->build();
    $response = $client->indices()->create($params);
    var_dump($response);
}
// Update index settings
public function put_setting(){
    $params = [
        'index' => 'person',
        'body' => [
            'settings' => [
                'number_of_replicas' => 10,
            ]
        ],
    ];
    $client = ClientBuilder::create()->build();
    var_dump($client->indices()->putSettings($params));
}
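The number of primary shards of an existing index cannot be changed through put_setting. This is a pitfall: you have to plan the index structure and capacity carefully when you create the index, otherwise scaling out later will be painful.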
// Update the mapping
public function put_mapping(){
    $mapping = [
        'properties' => [
            'address' => [
                'type' => 'keyword',
            ],
            'email' => [
                'type' => 'keyword',
            ]
        ]
    ];
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'body' => $mapping,
    ];
    $client = ClientBuilder::create()->build();
    var_dump($client->indices()->putMapping($params));
}
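Modifying the mapping of an existing index differs from setting it at creation time: you must name the mapping type you are modifying (the type parameter). One more thing to note is that a mapping update is additive. New fields are appended to what ES already has, and the fields that existed before do not disappear.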
// Bulk-create documents
public function bulk_create_another(){
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'body' => [],
    ];
    for ($i = 1; $i <= 10; $i++){
        $params['body'][] = [
            'create' => [ // 'index' and 'create' both add a document; 'create' fails if the _id already exists, 'index' overwrites it
                '_id' => $i,
            ]
        ];
        $params['body'][] = [
            'name' => 'PHPerJiang'.$i,
            'age' => $i,
            'sex' => $i%2,
        ];
    }
    $client = ClientBuilder::create()->build();
    var_dump($client->bulk($params));
}
// Bulk update
public function bulk_update_another(){
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'body' => []
    ];
    for ($i = 1; $i <= 10; $i++){
        $params['body'][] = [
            'update' => [
                '_id' => $i
            ]
        ];
        $params['body'][] = [
            'doc' => [
                'name' => 'PHPerJiang'.$i*2,
                'age' => $i*3,
                'sex' => $i%2,
            ]
        ];
    }
    $client = ClientBuilder::create()->build();
    var_dump($client->bulk($params));
}
// Bulk delete
public function bulk_delete_another(){
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'body' => [],
    ];
    for ($i = 1; $i <= 10; $i++){
        $params['body'][] = [
            'delete' => [
                '_id' => $i,
            ]
        ];
    }
    $client = ClientBuilder::create()->build();
    var_dump($client->bulk($params));
}
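For bulk create, update and delete, pay attention to how the body of the bulk parameters is written. The parameters name the index, the type and the body, and each operation in the body has up to two parts: an action entry that identifies the target document, and a data entry. The data entry differs per operation. For an update the fields must be wrapped in a doc key, otherwise ES reports an error; for a create the data entry is the bare document with no wrapper; a delete has only the action entry and no data entry at all.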
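The two-part layout described above can be sketched without touching a cluster. The helper below, build_bulk_body, is hypothetical and not part of elasticsearch-php; it only assembles the array that would be passed as 'body' to $client->bulk(), assuming the index and type are given at the top level of the parameters as in the functions above.

```php
<?php
// Hypothetical helper (not part of elasticsearch-php): builds the 'body'
// array for a bulk request from a list of [action, id, fields] triples.
function build_bulk_body(array $operations): array
{
    $body = [];
    foreach ($operations as [$action, $id, $fields]) {
        // Action entry: which operation, on which document.
        $body[] = [$action => ['_id' => $id]];

        if ($action === 'delete') {
            continue; // a delete has no data entry
        }
        // Data entry: update wraps the fields in 'doc'; index/create send them bare.
        $body[] = ($action === 'update') ? ['doc' => $fields] : $fields;
    }
    return $body;
}

$body = build_bulk_body([
    ['create', 1, ['name' => 'PHPerJiang1', 'age' => 1]],
    ['update', 1, ['age' => 3]],
    ['delete', 1, null],
]);

// Print one JSON object per entry to make the action/data pairing visible.
foreach ($body as $entry) {
    echo json_encode($entry), PHP_EOL;
}
```

The resulting array could then be sent with something like $client->bulk(['index' => 'person', 'type' => 'doc', 'body' => $body]).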
Partially updating a document
// Partial update: when the body contains a doc parameter, the fields inside doc are merged with the existing fields.
public function update_doc(){
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'id' => 2,
        'body' => [
            'doc' => [
                'bbb' => '3'
            ]
        ]
    ];
    $client = ClientBuilder::create()->build();
    var_dump($client->update($params));
}
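If the body contains a doc parameter, ES merges the document's existing fields with the fields in doc, much like PHP's array_merge(): a field that does not yet exist in ES is created, and one that does exist is overwritten.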
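As a purely local illustration of that merge behaviour, the sketch below applies array_merge() to two plain arrays. The $stored array is an assumed example of a document's current _source, not data from a real index, and no Elasticsearch call is made.

```php
<?php
// Local illustration only: no Elasticsearch call is made here.
$stored = ['name' => 'PHPerJiang2', 'age' => 2, 'sex' => 0]; // assumed current _source
$doc    = ['age' => 6, 'bbb' => '3'];                        // the 'doc' sent in the update

$merged = array_merge($stored, $doc);

// 'age' is overwritten, 'bbb' is added, 'name' and 'sex' are left untouched.
var_export($merged);
```

The analogy holds for flat documents like the ones used here. For nested objects ES merges recursively, whereas array_merge() is shallow, so the comparison should not be pushed further than that.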
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'id' => 'my_id',
    'body' => [
        'script' => 'ctx._source.counter += count',
        'params' => [
            'count' => 4
        ]
    ]
];
$response = $client->update($params);
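This is how the PHP-ElasticSearch documentation writes it. In practice I found it to be a pitfall: written this way, the request fails with an error saying the parameter count cannot be found. The correct form is as follows: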
// Update a document with a script
public function update_doc_by_script(){
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'id' => 2,
        'body' => [
            'script' => [
                'lang' => 'painless',
                'source' => 'ctx._source.age += params.count',
                'params' => ['count' => 1],
            ]
        ]
    ];
    $client = ClientBuilder::create()->build();
    var_dump($client->update($params));
}
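It only works when params is placed inside the script parameter and the script reads the value as params.count. At this point I started to have serious doubts about the documentation. Its upsert example reads: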
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'id' => 'my_id',
    'body' => [
        'script' => 'ctx._source.counter += count',
        'params' => [
            'count' => 4
        ],
        'upsert' => [
            'counter' => 1
        ]
    ]
];
$response = $client->update($params);
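First, the way the documentation uses script here is wrong, so let us fix the script part first, as in the code below. Note that the age1 field used in this code does not exist in ES.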
$params = [
    'index' => 'person',
    'type' => 'doc',
    'id' => 8,
    'body' => [
        'script' => [
            'lang' => 'painless',
            'source' => "ctx._source.age1 += params.count",
            'params' => [
                'count' => 5,
            ],
        ],
        'upsert' => [
            'count' => 1
        ]
    ],
];
Running the request above fails with an error because the field cannot be found:
Message: {"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[first-node][127.0.0.1:9300][indices:data/write/update[s]]"}],"type":"illegal_argument_exception","reason":"failed to execute script","caused_by":{"type":"script_exception","reason":"runtime error","script_stack":["ctx._source.age1 += params.count"," ^---- HERE"],"script":"ctx._source.age1 += params.count","lang":"painless","caused_by":{"type":"null_pointer_exception","reason":null}}},"status":400}
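In other words, the upsert parameter did not take effect. ES only uses upsert when the document itself does not exist; document 8 already exists, so the script runs against it and hits a null age1 field. I count this as the second problem with the documentation's example. The correct form is as follows: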
$params = [
    'index' => 'person',
    'type' => 'doc',
    'id' => 8,
    'body' => [
        'script' => [
            'lang' => 'painless',
            'source' => "ctx._source.age1 = (ctx._source.age1 ?: 2) + params.count",
            'params' => [
                'count' => 5,
            ],
        ],
    ],
];
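Inside the script we check whether the age1 field exists. If it does, the addition is applied to its current value; if it does not, it is given a default value of 2 first, and the field is then added to the document in ES. Note that the ?: in the script is Painless-specific syntax (the Elvis operator); for details see https://www.elastic.co/guide/en/elasticsearch/reference/5.4/modules-scripting-painless-syntax.html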
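The logic of that script can be simulated locally in PHP. In the sketch below, apply_increment is a hypothetical function and $source merely stands in for ctx._source; nothing here runs Painless or talks to Elasticsearch. PHP's null-coalescing operator ?? is used as the closest analogue of the Painless ?: operator, since both fall back to the right-hand side when the left-hand side is null.

```php
<?php
// Local simulation of the script's logic; $source stands in for ctx._source.
function apply_increment(array $source, string $field, int $count, int $default = 2): array
{
    // Use the default when the field is missing or null, then add the count.
    $source[$field] = ($source[$field] ?? $default) + $count;
    return $source;
}

$missing = apply_increment(['age' => 8], 'age1', 5);   // age1 absent: 2 + 5
$present = apply_increment(['age1' => 22], 'age1', 5); // age1 present: 22 + 5

var_export([$missing['age1'], $present['age1']]);
```

Note that PHP's own ?: operator is not the right counterpart here: it tests truthiness rather than null, so a stored value of 0 would be replaced by the default.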
Bool queries in search: filter / should / must / must_not
public function search_complex(){
    $params = [
        'index' => 'person',
        'type' => 'doc',
        'body' => [
            'query' => [
                'bool' => [
                    'filter' => [
                        'term' => ['age1' => 22]
                    ],
                    'must' => [
                        ['term' => ['age' => 8]],
                        ['term' => ['sex' => 0]]
                    ],
                ],
            ],
        ],
    ];
    $client = ClientBuilder::create()->build();
    echo json_encode($client->search($params));
}
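Search clauses are split into filtering (filter) and querying (must / must_not / should). Using filter on its own under bool does not compute a relevance score. Using must or should, alone or combined with filter, returns scored results; must_not, like filter, only excludes documents and does not contribute to the score. If you want to query with filter and still get a relevance score, there are two ways to do it: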
// Option 1
$params = [
    'index' => 'person',
    'type' => 'doc',
    'body' => [
        'query' => [
            'bool' => [
                'filter' => [
                    'term' => ['age1' => 22]
                ],
                'must' => [
                    'match_all' => new stdClass()
                ]
            ],
        ],
    ],
];
// Option 2
$params = [
    'index' => 'person',
    'type' => 'doc',
    'body' => [
        'query' => [
            'constant_score' => [
                'boost' => 2,
                'filter' => [
                    'term' => ['sex' => 0]
                ],
            ],
        ],
    ],
];
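Option 1 combines must with filter: match_all inside must matches everything, so the result is exactly the set of documents that pass the filter. Option 2 uses constant_score in place of bool: every document that passes the filter gets the same score, 1 by default, and boost sets that value, so different filter queries can be given different weights and therefore different scores.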
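The two options can be wrapped in small builder functions so the difference in shape is easy to compare. Both function names below are hypothetical and only build the query array; the result would be passed as 'body' to $client->search(), which this sketch does not call.

```php
<?php
// Hypothetical helpers: they only build query arrays, no request is sent.

// Option 1: bool query with a filter plus match_all in must.
function filter_with_match_all(array $filter): array
{
    return ['query' => ['bool' => [
        'filter' => $filter,
        // new stdClass() encodes to {} in JSON; an empty PHP array would encode to [].
        'must'   => ['match_all' => new stdClass()],
    ]]];
}

// Option 2: constant_score query wrapping the same kind of filter.
function filter_with_constant_score(array $filter, float $boost = 1.0): array
{
    return ['query' => ['constant_score' => [
        'filter' => $filter,
        'boost'  => $boost,
    ]]];
}

$optionOne = filter_with_match_all(['term' => ['age1' => 22]]);
$optionTwo = filter_with_constant_score(['term' => ['sex' => 0]], 2.0);

echo json_encode($optionOne), PHP_EOL;
echo json_encode($optionTwo), PHP_EOL;
```

Keeping the filter as a plain array argument means the same term clause can be tried under either option without rewriting the surrounding structure.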