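Installation
- Download: curl -L -O https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.3.4/elasticsearch-2.3.4.tar.gz
- Extract: tar -xvf elasticsearch-2.3.4.tar.gz
- Change directory: cd elasticsearch-2.3.4/bin
- Run: ./elasticsearch
Elasticsearch must be run as a non-root user; otherwise startup fails with an error stating that it cannot be started as root. Add an es user and change the owner and group of the installation directory as follows (run from the directory that contains elasticsearch-2.3.4):
1. adduser es # add the es user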
2. chown -R es elasticsearch-2.3.4
3. chgrp -R es elasticsearch-2.3.4
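The root restriction behind the steps above can be sketched as a small pre-flight check. This is only an illustration, not part of Elasticsearch: the function name can_start_es is hypothetical, and it does nothing more than test a numeric user ID (root is UID 0).

```shell
# Hypothetical helper: decide whether Elasticsearch may be started
# under the given numeric user ID. Root always has UID 0.
can_start_es() {
    if [ "$1" -eq 0 ]; then
        echo "refuse: Elasticsearch must not be started as root"
        return 1
    fi
    echo "ok"
}

# Example usage (not executed here):
#   can_start_es "$(id -u)" && ./elasticsearch
#   su es -c './elasticsearch'   # run from the bin directory as the es user
```

Checking the UID first gives a clearer message than waiting for Elasticsearch itself to abort.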
The cluster and node names can be specified at startup:
./elasticsearch --cluster.name my_cluster_name --node.name my_node_name
The startup log looks like this:
[INFO ][node ] [Bling] version[2.3.4], pid[28352], build[e455fd0/2016-06-30T11:24:31Z]
[node ] [Bling] initializing ...
[INFO ][plugins ] [Bling] modules [lang-groovy, reindex, lang-expression], plugins [], sites []
[INFO ][env ] [Bling] using [1] data paths, mounts [[/ (/dev/vda2)]], net usable_space [14.2gb], net total_space [23.6gb], spins? [possibly], types [ext4]
[INFO ][env ] [Bling] heap size [989.8mb], compressed ordinary object pointers [true]
[node ] [Bling] initialized
[node ] [Bling] starting ...
[transport ] [Bling] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}
[discovery ] [Bling] elasticsearch/2Rrwk_-SSqW0dMIrOYEfIg
[cluster.service ] [Bling] new_master {Bling}{2Rrwk_-SSqW0dMIrOYEfIg}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[http ] [Bling] publish_address {127.0.0.1:9200}, bound_addresses {127.0.0.1:9200}
[node ] [Bling] started
[gateway ] [Bling] recovered [0] indices into cluster_state
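Note: change the network.host: 127.0.0.1 setting in elasticsearch.yml (for example, to the machine's own network IP address); otherwise the node can only be reached through the loopback address.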
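Cluster
Now that the cluster and node have been introduced, the next step is to learn how to interact with them. Elasticsearch provides a powerful, easy-to-understand REST API that offers the following:
1. Check the health, status, and statistics of the cluster, nodes, and indices.
2. Administer the cluster, nodes, and indices.
3. Perform CRUD (create, read, update, delete) and search operations against indices.
4. Run advanced searches with paging, sorting, filtering, scripting, aggregations, and so on.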
To form a cluster across several hosts, list the addresses to probe during discovery in elasticsearch.yml:
discovery.zen.ping.unicast.hosts: ["<ip1 to probe>", "<ip2 to probe>"]
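The discovery setting above is easy to get wrong by hand (quotes and brackets must be balanced), so it can be generated instead. A minimal sketch: the function name unicast_hosts_line is hypothetical and the addresses in the usage comment are placeholders, not real hosts.

```shell
# Hypothetical helper: print the discovery.zen.ping.unicast.hosts line
# for the host addresses passed as arguments.
unicast_hosts_line() {
    list=""
    for ip in "$@"; do
        # Append each address in double quotes, separated by ", ".
        list="${list:+$list, }\"$ip\""
    done
    printf 'discovery.zen.ping.unicast.hosts: [%s]\n' "$list"
}

# Example usage (placeholder addresses):
#   unicast_hosts_line 192.168.1.10 192.168.1.11 >> config/elasticsearch.yml
```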
Check cluster health:
curl 'localhost:9200/_cat/health?v'
Result:
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1469435794 16:36:34  elasticsearch green           1         1      0   0    0    0        0             0                  -                100.0%
curl 'http://localhost:9200/_cluster/health?pretty'
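This returns the same information as JSON.
The health status is green, yellow, or red. Green means all primary and replica shards are allocated. Yellow means all data is available, but some replica shards have not been allocated yet. Red means some data is unavailable; even then the cluster is not shut down entirely and continues to serve requests from the shards that are still available.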
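For scripted monitoring, the status can be pulled out of the _cat/health output. A rough sketch, assuming the tabular format shown by the ?v flag (a header row followed by one data row); the function name health_status is hypothetical.

```shell
# Hypothetical helper: read `_cat/health?v` output on stdin and print the
# value in the "status" column of the first data row.
health_status() {
    awk '
        # Header row: remember which column is named "status".
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") col = i }
        # First data row: print that column, if the header contained it.
        NR == 2 && col { print $col }
    '
}

# Example usage (needs a running node, not executed here):
#   curl -s 'localhost:9200/_cat/health?v' | health_status
```

Looking the column up by name rather than by position keeps the sketch working if the column order differs between versions.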
curl 'localhost:9200/_cat/nodes?v'
host      ip        heap.percent ram.percent load node.role master name
127.0.0.1 127.0.0.1            5          97 0.91 d         *      Doctor Druid
List all indices. The empty result (header only) shows that the cluster does not contain any indices yet:
curl 'localhost:9200/_cat/indices?v'
health status index pri rep docs.count docs.deleted store.size pri.store.size
Create an index
curl -XPUT 'localhost:9200/customer?pretty' # use the PUT verb to create an index named customer
curl 'localhost:9200/_cat/indices?v'
health status index    pri rep docs.count docs.deleted store.size pri.store.size
yellow open   customer   5   1          0            0       650b           650b
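The status is yellow because the replica (rep) has not been allocated: Elasticsearch does not place a replica on the same node as its primary shard, and this cluster has only one node. Once a second node joins the cluster, the replica is allocated and the status turns green.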
Index and query a document
curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '
{
"name": "John Doe"
}'
Response:
{
"_index" : "customer",
"_type" : "external",
"_id" : "1",
"_version" : 1,
"created" : true
}
curl -XGET 'localhost:9200/customer/external/1?pretty'
{
"_index" : "customer",
"_type" : "external",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : { "name": "John Doe" }
}
Delete an index
curl -XDELETE 'localhost:9200/customer?pretty'
{
"acknowledged" : true
}
curl 'localhost:9200/_cat/indices?v'
health status index pri rep docs.count docs.deleted store.size pri.store.size
The API calls above all follow this general pattern:
curl -X<REST Verb> <Node>:<Port>/<Index>/<Type>/<ID>
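The pattern above can be captured in a small helper that assembles the address part of a request. A minimal sketch; es_url is a hypothetical name, and the type and ID are treated as optional because index-level calls such as creating or deleting an index omit them.

```shell
# Hypothetical helper: build <Node>:<Port>/<Index>[/<Type>[/<ID>]].
# $1 = node, $2 = port, $3 = index, $4 = type (optional), $5 = ID (optional)
es_url() {
    url="$1:$2/$3"
    if [ -n "${4:-}" ]; then
        url="$url/$4"
    fi
    if [ -n "${5:-}" ]; then
        url="$url/$5"
    fi
    printf '%s\n' "$url"
}

# Example usage (needs a running node, not executed here):
#   curl -XGET "$(es_url localhost 9200 customer external 1)?pretty"
#   curl -XDELETE "$(es_url localhost 9200 customer)?pretty"
```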
Updating documents
Replace (overwrite) an existing document
curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '
{
"name": "Jane Doe"
}'
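If a new value is given in the ID position, or the ID is omitted, a new document is inserted. When the ID is omitted, Elasticsearch generates a random one; in that case the request must use POST instead of PUT.
Partial update of a document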
curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '
{
"doc": { "name": "Jane Doe", "age": 20 }
}'
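This changes the name field and adds a new age field.
By default, there is a delay of about 1 second (the refresh interval) between an update and the change becoming visible in search results.
Batch processing with _bulk
Index two documents: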
curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'
Update document 1 and delete document 2:
curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'
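If one action in a bulk request fails, the remaining actions are still processed. The response reports a status for each action, which can be used to check whether each one succeeded.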
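A bulk body is newline-delimited: one action line, optionally followed by one source line. The sketch below generates the body for the two documents indexed above; the function names are hypothetical, and the name value is not JSON-escaped, so this only works for simple strings.

```shell
# Hypothetical helper: print one bulk "index" action for document ID $1
# with name $2 (an action line followed by the document source line).
bulk_index_action() {
    printf '{"index":{"_id":"%s"}}\n' "$1"
    printf '{"name": "%s" }\n' "$2"
}

# Build the request body for the two example documents.
bulk_body() {
    bulk_index_action 1 "John Doe"
    bulk_index_action 2 "Jane Doe"
}

# Example usage (needs a running node, not executed here):
#   bulk_body | curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' --data-binary @-
```

The usage line passes the body with --data-binary because curl's -d option strips newlines when it reads data from a file or stdin, which would break the line-oriented bulk format.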
Search
Queries are written in a JSON-style language known as the Query DSL (domain-specific language).
# REST request URI
curl 'localhost:9200/bank/_search?q=*&pretty'
# REST request body
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} }
}'
# Return hits 11-20, sorted by balance in descending order
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
"from": 10,
"size": 10,
"sort": { "balance": { "order": "desc" } }
}'
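The from value in the paging request above is derived from the page number: from = (page - 1) * size. A minimal sketch that computes it and prints the request body; page_body is a hypothetical name and pages are assumed to be numbered from 1.

```shell
# Hypothetical helper: print a match_all search body for a 1-based page
# number ($1) and page size ($2), sorted by balance in descending order.
page_body() {
    # Number of hits to skip before the requested page starts.
    from=$(( ($1 - 1) * $2 ))
    printf '{"query":{"match_all":{}},"from":%d,"size":%d,"sort":{"balance":{"order":"desc"}}}\n' "$from" "$2"
}

# Example usage (needs a running node, not executed here):
#   curl -XPOST 'localhost:9200/bank/_search?pretty' -d "$(page_body 2 10)"
```

Page 2 with a size of 10 yields from = 10, which matches the request for hits 11-20.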
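Some common query scenarios
Match documents whose address contains the term "mill" or "lane":
"query": { "match": { "address": "mill lane" } }
Match documents whose address contains the exact phrase "mill lane":
"query": { "match_phrase": { "address": "mill lane" } }
AND (every clause must match):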
"query": {
"bool": {
"must": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
OR (at least one clause must match):
"query": {
"bool": {
"should": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
NOR (none of the clauses may match):
"query": {
"bool": {
"must_not": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
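Combined logic (must match one clause and must not match another); here, accounts whose age is 40 and whose state is not ID: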
"query": {
"bool": {
"must": [
{ "match": { "age": "40" } }
],
"must_not": [
{ "match": { "state": "ID" } }
]
}
}
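The "query" snippets above are fragments, not complete requests: they still need the enclosing braces of a request body. A minimal sketch of wrapping a fragment so it can be sent with curl; wrap_query is a hypothetical name and the fragment is passed through unchanged, with no validation.

```shell
# Hypothetical helper: wrap a "query": {...} fragment ($1) in the outer
# braces that make it a complete JSON request body.
wrap_query() {
    printf '{ %s }\n' "$1"
}

# Example usage (needs a running node, not executed here):
#   curl -XPOST 'localhost:9200/bank/_search?pretty' \
#        -d "$(wrap_query '"query": { "match_phrase": { "address": "mill lane" } }')"
```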
filter
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": {
"bool": {
"must": { "match_all": {} },
"filter": {
"range": {
"balance": {
"gte": 20000,
"lte": 30000
}
}
}
}
}
}'
The filter is a range on the balance field: greater than or equal to (gte) 20000 and less than or equal to (lte) 30000.
Aggregations
Aggregations are similar to GROUP BY and aggregate functions in SQL.
In Elasticsearch, a single search request can return both the matching hits and the aggregation results.
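As an illustration of the GROUP BY analogy, the sketch below builds a terms aggregation that counts documents per value of a field. The function name group_by_body is hypothetical, and the state field is assumed to exist in the bank index because it appears in the queries earlier in these notes. Setting "size" to 0 suppresses the hits so that only the aggregation result comes back; leave it out to receive hits and aggregation results together.

```shell
# Hypothetical helper: print a search body that groups documents by the
# field named in $1 using a terms aggregation. "size": 0 hides the hits.
group_by_body() {
    printf '{"size":0,"aggs":{"group_by_%s":{"terms":{"field":"%s"}}}}\n' "$1" "$1"
}

# Example usage (needs a running node, not executed here):
#   curl -XPOST 'localhost:9200/bank/_search?pretty' -d "$(group_by_body state)"
# Roughly comparable SQL:
#   SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC
```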