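Contents
- I. A quick look at the IK analyzers (tokenization results)
- 1. standard (splits Chinese into single characters; the ES default analyzer)
- 2. ik_smart (coarse-grained segmentation)
- 3. ik_max_word (finest-grained segmentation)
- II. Specifying the default analyzer
- 1. Setting the default analyzer for an index
- III. Working with data in ES
- 1. Overview
- 2. Creating an index
- 3. Retrieving an index
- 4. Deleting an index
- 5. Adding a document
- 6. Querying the index
- 6.1 Retrieving all documents in the index
- 6.2 Simple equality query
- 6.3 Simple range query
- 6.4 Looking up multiple ids (IN-style query)
- 6.5 Paginated query
- 6.6 Returning only selected fields
- 6.7 Sorted query
- 7. Updating a document
- 8. Deleting a document
- 9. Differences between PUT and POST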
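I. A quick look at the IK analyzers (tokenization results)
The standard analyzer shown first is built into Elasticsearch and is not part of IK; the IK analyzers (ik_smart and ik_max_word) come from a third-party plugin.
1. standard (splits Chinese into single characters; the ES default analyzer)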
POST _analyze
{
  "analyzer": "standard",
  "text": "我爱学搜索引擎"
}
Result (the text 我爱学搜索引擎, "I love learning search engines", is split into individual characters, one token per character)
{
  "tokens" : [
    {"token" : "我", "start_offset" : 0, "end_offset" : 1, "type" : "<IDEOGRAPHIC>", "position" : 0},
    {"token" : "爱", "start_offset" : 1, "end_offset" : 2, "type" : "<IDEOGRAPHIC>", "position" : 1},
    {"token" : "学", "start_offset" : 2, "end_offset" : 3, "type" : "<IDEOGRAPHIC>", "position" : 2},
    {"token" : "搜", "start_offset" : 3, "end_offset" : 4, "type" : "<IDEOGRAPHIC>", "position" : 3},
    {"token" : "索", "start_offset" : 4, "end_offset" : 5, "type" : "<IDEOGRAPHIC>", "position" : 4},
    {"token" : "引", "start_offset" : 5, "end_offset" : 6, "type" : "<IDEOGRAPHIC>", "position" : 5},
    {"token" : "擎", "start_offset" : 6, "end_offset" : 7, "type" : "<IDEOGRAPHIC>", "position" : 6}
  ]
}
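2. ik_smart (coarse-grained segmentation)
Unlike the single-character split of the standard analyzer, ik_smart segments at a coarser granularity and keeps 搜索引擎 ("search engine") as a single token.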
POST _analyze
{
  "analyzer": "ik_smart",
  "text": "我爱学搜索引擎"
}
Result
{
  "tokens" : [
    {"token" : "我", "start_offset" : 0, "end_offset" : 1, "type" : "CN_CHAR", "position" : 0},
    {"token" : "爱", "start_offset" : 1, "end_offset" : 2, "type" : "CN_CHAR", "position" : 1},
    {"token" : "学", "start_offset" : 2, "end_offset" : 3, "type" : "CN_CHAR", "position" : 2},
    {"token" : "搜索引擎", "start_offset" : 3, "end_offset" : 7, "type" : "CN_WORD", "position" : 3}
  ]
}
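3. ik_max_word (finest-grained segmentation)
ik_max_word segments at the finest granularity: every character sequence it recognizes as a word is emitted, so the resulting tokens can overlap.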
POST _analyze
{
  "analyzer": "ik_max_word",
  "text": "我爱学搜索引擎"
}
Result
{
  "tokens" : [
    {"token" : "我", "start_offset" : 0, "end_offset" : 1, "type" : "CN_CHAR", "position" : 0},
    {"token" : "爱", "start_offset" : 1, "end_offset" : 2, "type" : "CN_CHAR", "position" : 1},
    {"token" : "学", "start_offset" : 2, "end_offset" : 3, "type" : "CN_CHAR", "position" : 2},
    {"token" : "搜索引擎", "start_offset" : 3, "end_offset" : 7, "type" : "CN_WORD", "position" : 3},
    {"token" : "搜索", "start_offset" : 3, "end_offset" : 5, "type" : "CN_WORD", "position" : 4},
    {"token" : "索引", "start_offset" : 4, "end_offset" : 6, "type" : "CN_WORD", "position" : 5},
    {"token" : "引擎", "start_offset" : 5, "end_offset" : 7, "type" : "CN_WORD", "position" : 6}
  ]
}
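To make the difference between the two granularities concrete, here is a toy dictionary-based segmenter in Python. It is only an illustration and not IK's actual algorithm: the four-word DICTIONARY and both function names are made up for this example. The coarse variant keeps the longest dictionary word at each position, while the fine variant emits every dictionary word it finds.

```python
# Toy dictionary-based segmentation. This is NOT IK's real algorithm;
# DICTIONARY is a hypothetical four-word lexicon chosen for this example.
DICTIONARY = {"搜索引擎", "搜索", "索引", "引擎"}
MAX_WORD_LEN = max(len(word) for word in DICTIONARY)


def coarse_tokens(text):
    """Keep only the longest dictionary word at each position (ik_smart-like)."""
    tokens, i = [], 0
    while i < len(text):
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            piece = text[i:i + length]
            # Fall back to a single character when no dictionary word matches.
            if length == 1 or piece in DICTIONARY:
                tokens.append(piece)
                i += length
                break
    return tokens


def fine_tokens(text):
    """Emit every dictionary word found at every position (ik_max_word-like)."""
    tokens, covered_until = [], 0
    for i in range(len(text)):
        found = False
        for length in range(min(MAX_WORD_LEN, len(text) - i), 1, -1):
            piece = text[i:i + length]
            if piece in DICTIONARY:
                tokens.append(piece)
                covered_until = max(covered_until, i + length)
                found = True
        # A character becomes its own token only if no word covers it.
        if not found and i >= covered_until:
            tokens.append(text[i])
    return tokens


print(coarse_tokens("我爱学搜索引擎"))
print(fine_tokens("我爱学搜索引擎"))
```

With this small dictionary the two functions reproduce the token lists returned by ik_smart and ik_max_word for the sample sentence; with other text or a real dictionary the plugin's output may differ.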
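II. Specifying the default analyzer
1. Setting the default analyzer for an index
Create an index (roughly comparable to a database in MySQL) named test_index_database,
and set its default analyzer to ik_max_word.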
PUT /test_index_database
{
  "settings": {
    "index": {
      "analysis.analyzer.default.type": "ik_max_word"
    }
  }
}
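The dotted key analysis.analyzer.default.type is shorthand for a nested settings object, and to my understanding Elasticsearch accepts either form. The Python sketch below expands dotted keys into the nested form so the structure is easier to see; expand_dotted is a hypothetical helper written for this example, not part of any Elasticsearch client.

```python
def expand_dotted(settings):
    """Turn keys such as "a.b.c" into nested dicts {"a": {"b": {"c": ...}}}."""
    nested = {}
    for key, value in settings.items():
        if isinstance(value, dict):
            value = expand_dotted(value)
        node = nested
        parts = key.split(".")
        # Walk (or create) the intermediate levels, then set the leaf.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


body = {"settings": {"index": {"analysis.analyzer.default.type": "ik_max_word"}}}
print(expand_dotted(body))
```

The nested result places "type": "ik_max_word" under settings, index, analysis, analyzer, default, which is the layout usually seen when an index defines several analyzers.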
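III. Working with data in ES
From 7.x on, the type defaults to _doc.
1. Overview
ES is document-oriented: it can store whole objects or documents and index, search, sort, and filter them.
Documents are serialized as JSON.
2. Creating an index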
PUT /test_index01
3. Retrieving an index
GET /test_index01
The response is shown below.
number_of_shards is the number of primary shards;
number_of_replicas is the number of replicas.
In ES 7.6.1 both default to 1. The defaults depend on the ES version: before 7.0, for example, the default was 5 primary shards.
{
  "test_index01" : {
    "aliases" : { },
    "mappings" : { },
    "settings" : {
      "index" : {
        "creation_date" : "1678969193239",
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "uuid" : "n6tD0dyxTB2aOQjqyDK0QQ",
        "version" : {"created" : "7060199"},
        "provided_name" : "test_index01"
      }
    }
  }
}
4. Deleting an index
DELETE /test_index01
5. Adding a document
Format: PUT /index_name/type/id
PUT /test_index01/_doc/1
{
  "name": "张三",
  "sex": 1,
  "age": 25,
  "address": "北京",
  "remark": "java"
}
Result
_index: index name
_type: type
_id: document id
_version: version (1 here; it increases each time the document is modified)
result: outcome of the operation (created, updated, and so on)
{
  "_index" : "test_index01",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {"total" : 2, "successful" : 1, "failed" : 0},
  "_seq_no" : 0,
  "_primary_term" : 1
}
6. Querying the index
Format for fetching one document: GET /index_name/type/id
GET /test_index01/_doc/1
Result
{
  "_index" : "test_index01",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {"name" : "张三", "sex" : 1, "age" : 25, "address" : "北京", "remark" : "java"}
}
6.1 Retrieving all documents in the index
Format: GET /index_name/type/_search
GET /test_index01/_doc/_search
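This is the equivalent of select * in MySQL.
Result (there is only one document here; its name and address differ from section 5, apparently because this output was captured after the update in section 7). The Deprecation line appears because the request path names a type; in 7.x, GET /test_index01/_search runs the same search without the warning.
#! Deprecation: [types removal] Specifying types in search requests is deprecated.
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {"total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0},
  "hits" : {
    "total" : {"value" : 1, "relation" : "eq"},
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test_index01",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {"name" : "秀儿", "sex" : 1, "age" : 25, "address" : "上海", "remark" : "java"}
      }
    ]
  }
}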
6.2 Simple equality query
Format: GET /index_name/type/_search?q=field:value
GET /test_index01/_doc/_search?q=age:25
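6.3 Simple range query
Format: GET /index_name/type/_search?q=field:[left TO right]
The colon after the field name is required, and square brackets include both endpoints.
GET /test_index01/_doc/_search?q=age:[25 TO 26]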
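Outside the Kibana console (for example from curl or application code), the colon, brackets, and spaces in a range expression generally need to be percent-encoded. The Python sketch below builds the query string, the encoded path, and what I believe is the equivalent request body using the range query; the three function names are hypothetical helpers, and nothing is sent to a server.

```python
from urllib.parse import quote


def range_query_string(field, low, high):
    """Lucene range syntax; square brackets include both endpoints."""
    return f"{field}:[{low} TO {high}]"


def search_path(index, q):
    """Percent-encode q so the path is safe outside the Kibana console."""
    return f"/{index}/_search?q={quote(q, safe='')}"


def range_body(field, low, high):
    """Request-body counterpart using the range query (gte/lte are inclusive)."""
    return {"query": {"range": {field: {"gte": low, "lte": high}}}}


q = range_query_string("age", 25, 26)
print(q)
print(search_path("test_index01", q))
print(range_body("age", 25, 26))
```

The body form can be sent with GET /test_index01/_search and tends to be easier to extend than the URL form once more conditions are added.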
6.4 Looking up multiple ids (IN-style query)
Format: GET /index_name/type/_mget
GET /test_index01/_doc/_mget
{
  "ids": ["1", "2"]
}
6.5 Paginated query
GET /index_name/type/_search?from=0&size=1
GET /index_name/type/_search?q=condition&from=0&size=1
GET /test_index01/_doc/_search?from=0&size=1
GET /test_index01/_doc/_search?q=age:[25 TO 26]&from=0&size=1
6.6 Returning only selected fields
GET /index_name/type/_search?_source=field,field
GET /test_index01/_doc/_search?_source=name,age
6.7 Sorted query
GET /index_name/type/_search?sort=field:desc
GET /test_index01/_doc/_search?sort=age:desc
GET /test_index01/_doc/_search?sort=age:asc
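The URL parameters used in sections 6.2 to 6.7 (q, from, size, _source, sort) each have a request-body counterpart. The Python sketch below maps one form to the other, assuming that q is interpreted as a query_string query, which is my reading of the Elasticsearch documentation. uri_search_to_body is a hypothetical helper, not an Elasticsearch API, and it only builds a dict.

```python
def uri_search_to_body(q=None, from_=None, size=None, source=None, sort=None):
    """Translate URI-search parameters into an equivalent request body."""
    body = {}
    if q is not None:
        body["query"] = {"query_string": {"query": q}}
    if from_ is not None:
        body["from"] = from_
    if size is not None:
        body["size"] = size
    if source is not None:
        # "name,age" becomes ["name", "age"]
        body["_source"] = source.split(",")
    if sort is not None:
        # "age:desc,name" becomes [{"age": "desc"}, {"name": "asc"}]
        body["sort"] = [
            {field: order or "asc"}
            for field, _, order in (item.partition(":") for item in sort.split(","))
        ]
    return body


print(uri_search_to_body(q="age:[25 TO 26]", from_=0, size=1,
                         source="name,age", sort="age:desc"))
```

The resulting dict can be sent as the JSON body of GET /test_index01/_search, which avoids URL-encoding concerns and keeps paging, field selection, and sorting in one place.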
7. Updating a document
Format: PUT /index_name/type/id
PUT /test_index01/_doc/1
{
  "name": "秀儿",
  "sex": 1,
  "age": 25,
  "address": "上海",
  "remark": "java"
}
Result (_version is now 2 and result is updated)
{
  "_index" : "test_index01",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {"total" : 2, "successful" : 1, "failed" : 0},
  "_seq_no" : 1,
  "_primary_term" : 1
}
8. Deleting a document
Format: DELETE /index_name/type/id
DELETE /test_index01/_doc/1
Result (the delete also counts as a change, so _version rises to 3)
{
  "_index" : "test_index01",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "result" : "deleted",
  "_shards" : {"total" : 2, "successful" : 1, "failed" : 0},
  "_seq_no" : 2,
  "_primary_term" : 1
}
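The _version and result values seen in sections 5, 7, and 8 follow a simple pattern: every write to the same id bumps the version by one. The Python class below is a toy in-memory model of that pattern. ToyIndex is a hypothetical stand-in, not an Elasticsearch client, and it deliberately ignores what happens when an id is indexed again after being deleted.

```python
class ToyIndex:
    """In-memory stand-in for one index; not a real Elasticsearch client."""

    def __init__(self):
        self.docs = {}  # id -> (version, source)

    def index(self, doc_id, source):
        """Model PUT /index/_doc/<id>: create or fully replace the document."""
        old_version = self.docs[doc_id][0] if doc_id in self.docs else 0
        result = "updated" if old_version else "created"
        self.docs[doc_id] = (old_version + 1, dict(source))
        return {"_id": doc_id, "_version": old_version + 1, "result": result}

    def delete(self, doc_id):
        """Model DELETE /index/_doc/<id>: the delete itself bumps the version."""
        if doc_id not in self.docs:
            return {"_id": doc_id, "result": "not_found"}
        old_version, _ = self.docs.pop(doc_id)
        return {"_id": doc_id, "_version": old_version + 1, "result": "deleted"}


idx = ToyIndex()
first = idx.index("1", {"name": "张三", "address": "北京"})
second = idx.index("1", {"name": "秀儿", "address": "上海"})
third = idx.delete("1")
print(first, second, third)
```

Running the three calls in order yields versions 1, 2, and 3 with results created, updated, and deleted, matching the responses shown in this article.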
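9. Differences between PUT and POST
Both POST and PUT can create and update documents.
① PUT:
(1) Operates on one specific resource, so an id is required for both create and update; without an id the request fails.
(2) Replaces the entire stored JSON document with the request body.
(3) Like DELETE, it is idempotent: repeating the same request leaves the document in the same state (only _version increases).
② POST:
(1) Operates on the resource collection: without an id, ES generates a unique id and creates a new document; with an id, it creates or updates the document with that id.
(2) POST /index_name/_doc/id also replaces the whole document, just like PUT. To change only the fields supplied, use the update endpoint, POST /index_name/_update/id, with the changes wrapped in a "doc" object.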
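The contrast between full replacement, auto-generated ids, and partial updates can be sketched with plain Python dicts. The three functions below are hypothetical models of PUT /index/_doc/id, POST /index/_doc, and POST /index/_update/id as they behave in 7.x to my knowledge; they are not client calls, the generated id is only a stand-in for whatever Elasticsearch produces, and the merge here is shallow.

```python
import uuid


def put_doc(store, doc_id, source):
    """PUT /index/_doc/<id>: the stored document is replaced wholesale."""
    store[doc_id] = dict(source)
    return doc_id


def post_doc(store, source, doc_id=None):
    """POST /index/_doc[/<id>]: generate an id when none is given, then replace."""
    if doc_id is None:
        doc_id = uuid.uuid4().hex  # stand-in for the id Elasticsearch generates
    store[doc_id] = dict(source)
    return doc_id


def post_update(store, doc_id, partial):
    """POST /index/_update/<id> with {"doc": partial}: merge into the stored document."""
    store[doc_id] = {**store[doc_id], **partial}
    return doc_id


store = {}
put_doc(store, "1", {"name": "张三", "age": 25})
put_doc(store, "1", {"name": "秀儿"})   # age is gone: full replacement
post_update(store, "1", {"age": 26})    # name is kept: partial update
a = post_doc(store, {"name": "x"})
b = post_doc(store, {"name": "x"})      # same body, second document
print(store["1"], a != b, len(store))
```

Repeating put_doc with the same arguments never changes the stored state, whereas repeating post_doc without an id adds a new document each time, which is why only the former is described as idempotent.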