Elasticsearch Basic Operations from Multiple Angles

news/2024/11/20 23:27:38/

Contents

I. Index operations

II. Mapping operations

III. Document operations


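This post walks through the basic Elasticsearch operations: handling indices, mappings, and documents. It simulates a production environment and shows each operation both as DSL statements and through the Java High Level REST Client.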

I. Index operations

1. Create an index, setting the number of primary shards and replicas

PUT /nandao_scenic
{
  "settings": {"number_of_shards": 5, "number_of_replicas": 2},
  "mappings": {
    "properties": {
      "title": {"type": "text"},
      "city": {"type": "keyword"},
      "price": {"type": "double"}
    }
  }
}

2. Delete an index

DELETE /nandao_scenic

3. Close an index

POST /nandao_scenic/_close

4. Open an index

POST /nandao_scenic/_open

There are many more DSL statements of this kind; refer to the Elasticsearch reference documentation for details.

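5. Index aliases

Suppose there is one index per month and I want to query the data for January and February together. An alias solves this: it is a single name that points to several indices, so one search covers all of them.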

5.1. The January index and its data

PUT /nandao_one_log
{
  "mappings": {
    "properties": {
      "scenic_id": {"type": "keyword"},
      "uid": {"type": "keyword"},
      "check_in_date": {"type": "keyword"}
    }
  }
}

POST /nandao_one_log/_doc/01
{"uid": "01", "scenic_id": "1212", "check_in_date": "2023-01-23"}

5.2. The February index and its data

PUT /nandao_two_log
{
  "mappings": {
    "properties": {
      "scenic_id": {"type": "keyword"},
      "uid": {"type": "keyword"},
      "check_in_date": {"type": "keyword"}
    }
  }
}

POST /nandao_two_log/_doc/01
{"uid": "01", "scenic_id": "1552", "check_in_date": "2023-02-23"}

5.3. Create the alias

POST /_aliases
{
  "actions": [
    {"add": {"index": "nandao_one_log", "alias": "last_two_month"}},
    {"add": {"index": "nandao_two_log", "alias": "last_two_month"}}
  ]
}

5.4. Query the data through the alias

GET /last_two_month/_search
{
  "query": {"term": {"uid": "01"}}
}

# Result: the documents from both January and February
{
  "took": 0,
  "timed_out": false,
  "_shards": {"total": 2, "successful": 2, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 2, "relation": "eq"},
    "max_score": 0.6931471,
    "hits": [
      {"_index": "nandao_one_log", "_type": "_doc", "_id": "01", "_score": 0.6931471,
       "_source": {"uid": "01", "scenic_id": "1212", "check_in_date": "2023-01-23"}},
      {"_index": "nandao_two_log", "_type": "_doc", "_id": "01", "_score": 0.2876821,
       "_source": {"uid": "01", "scenic_id": "1552", "check_in_date": "2023-02-23"}}
    ]
  }
}

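6. Writing data through an alias: a failing example

POST /last_two_month/_doc/02
{"uid": "02", "scenic_id": "12232", "check_in_date": "2023-02-23"}

This request fails: the alias points to two indices, so Elasticsearch cannot tell which one should receive the write. Designate the write index:

POST /_aliases
{
  "actions": [
    {"add": {"index": "nandao_one_log", "alias": "last_two_month", "is_write_index": true}}
  ]
}

Once this is set, writing through the alias succeeds.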
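The rule behind this error can be modelled in a few lines of plain Java. This is only a sketch of the behaviour described above, not Elasticsearch code: the class `AliasWriteTarget` and the method `resolveWriteIndex` are hypothetical names, and a `null` flag stands for "is_write_index not set".

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AliasWriteTarget {
    // members: index name -> is_write_index flag (null means "not set").
    // Returns the index that would receive a write sent to the alias.
    public static String resolveWriteIndex(Map<String, Boolean> members) {
        String target = null;
        for (Map.Entry<String, Boolean> e : members.entrySet()) {
            if (Boolean.TRUE.equals(e.getValue())) {
                target = e.getKey(); // explicitly designated write index
            }
        }
        if (target != null) {
            return target;
        }
        // A single member with no flag set acts as the write index.
        if (members.size() == 1 && members.values().iterator().next() == null) {
            return members.keySet().iterator().next();
        }
        throw new IllegalStateException("no write index is defined for this alias");
    }

    public static void main(String[] args) {
        Map<String, Boolean> alias = new LinkedHashMap<>();
        alias.put("nandao_one_log", null);
        alias.put("nandao_two_log", null);
        try {
            resolveWriteIndex(alias);
        } catch (IllegalStateException e) {
            System.out.println("write rejected: " + e.getMessage());
        }
        alias.put("nandao_one_log", true); // same effect as is_write_index: true
        System.out.println("write goes to: " + resolveWriteIndex(alias));
    }
}
```

Searches are unaffected by the flag: they still fan out to every index behind the alias.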

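7. Adding shards indirectly

The number of primary shards of an index cannot be changed once the index has been created. When business data grows rapidly and more shards are needed, the workaround is indirect: create a new index with more shards and move an alias over to it.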
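Why the shard count is fixed becomes clearer from how documents are routed: in simplified form, Elasticsearch picks the shard from a hash of the routing value (the document `_id` by default) modulo the number of primary shards. The sketch below illustrates that idea only. The `hash` function is a stand-in chosen to keep the example dependency-free (Elasticsearch uses a Murmur3 hash), and `ShardRouting` and `shardFor` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

public class ShardRouting {
    // Stand-in hash for illustration only; not the hash Elasticsearch uses.
    public static int hash(String routing) {
        int h = 0;
        for (byte b : routing.getBytes(StandardCharsets.UTF_8)) {
            h = 31 * h + (b & 0xff);
        }
        return h;
    }

    // Simplified routing rule: shard = hash(routing) mod number_of_primary_shards.
    public static int shardFor(String routing, int primaryShards) {
        return Math.floorMod(hash(routing), primaryShards);
    }

    public static void main(String[] args) {
        int total = 1000;
        int moved = 0;
        for (int i = 0; i < total; i++) {
            String id = "doc-" + i;
            // Compare the shard chosen with 5 primaries against the one chosen with 10.
            if (shardFor(id, 5) != shardFor(id, 10)) {
                moved++;
            }
        }
        System.out.println(moved + " of " + total
                + " ids would map to a different shard after going from 5 to 10 shards");
    }
}
```

Because existing documents would be looked up on the wrong shard after such a change, the data has to be written again into an index created with the new shard count.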

7.1. The original index

PUT /nandao_scenic_1
{
  "settings": {"number_of_shards": 5, "number_of_replicas": 2},
  "mappings": {
    "properties": {
      "title": {"type": "text"},
      "city": {"type": "keyword"},
      "price": {"type": "double"}
    }
  }
}

7.2. The original alias

POST /_aliases
{
  "actions": [
    {"add": {"index": "nandao_scenic_1", "alias": "nandao_scenic_a"}}
  ]
}

7.3. Create a new index with more shards

PUT /nandao_scenic_2
{
  "settings": {"number_of_shards": 10, "number_of_replicas": 2},
  "mappings": {
    "properties": {
      "title": {"type": "text"},
      "city": {"type": "keyword"},
      "price": {"type": "double"}
    }
  }
}

Once the data has been written into nandao_scenic_2, the alias can be switched over.

7.4. The final alias

POST /_aliases
{
  "actions": [
    {"remove": {"index": "nandao_scenic_1", "alias": "nandao_scenic_a"}},
    {"add": {"index": "nandao_scenic_2", "alias": "nandao_scenic_a"}}
  ]
}
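When the switch is driven from application code, the request body can be generated rather than typed by hand. The following is a minimal sketch that only builds the JSON body for POST /_aliases and does not send it. `AliasSwap` and `swapBody` are hypothetical names, and the sketch assumes plain index and alias names that need no JSON escaping.

```java
public class AliasSwap {
    // Builds a POST /_aliases body that moves `alias` from oldIndex to newIndex.
    // Both actions travel in one request, so the alias never points at neither index.
    public static String swapBody(String alias, String oldIndex, String newIndex) {
        return "{\"actions\":["
                + "{\"remove\":{\"index\":\"" + oldIndex + "\",\"alias\":\"" + alias + "\"}},"
                + "{\"add\":{\"index\":\"" + newIndex + "\",\"alias\":\"" + alias + "\"}}"
                + "]}";
    }

    public static void main(String[] args) {
        System.out.println(swapBody("nandao_scenic_a", "nandao_scenic_1", "nandao_scenic_2"));
    }
}
```

Clients that read and write only through nandao_scenic_a need no change after the swap.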

II. Mapping operations

1. View the mapping of an index

GET /nandao_scenic/_mapping

# Result
{
  "nandao_scenic": {
    "mappings": {
      "properties": {
        "city": {"type": "keyword"},
        "price": {"type": "double"},
        "title": {"type": "text"}
      }
    }
  }
}

2. Extend the mapping with a new field

POST /nandao_scenic/_mapping
{
  "properties": {"tag": {"type": "keyword"}}
}

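3. Basic data types

3.1. keyword is a string type that is not split into tokens. It is typically used for names, categories, user ids, URLs, status codes, and similar values.

3.2. text is a string type that is split into tokens. Search it with a match query; a term query usually finds nothing, because it looks up the input verbatim while the index holds the analyzed tokens.

3.3. Numeric types include the commonly used long, integer, short, byte, double, and float. They are usually queried with parameters such as gte and lte, that is, with range queries.

3.4. The boolean type means true or false and needs little explanation.

3.5. The date type (date) supports range queries, with values such as 20221212. Where possible, avoid declaring a custom format on the field; it makes historical data error-prone to process.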
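The difference between keyword and text in 3.1 and 3.2 can be illustrated with a small, dependency-free Java sketch. It is a simplified model rather than Elasticsearch code: `analyze` is a rough stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters; real analysis of CJK text differs), and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class TextVsKeyword {
    // Rough stand-in for an analyzer: lowercase, then split on non-alphanumerics.
    public static List<String> analyze(String value) {
        List<String> tokens = new ArrayList<>();
        for (String t : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // A text field indexes the analyzed tokens.
    public static Set<String> textTerms(String value) {
        return new HashSet<>(analyze(value));
    }

    // A keyword field indexes the whole value as one term.
    public static Set<String> keywordTerms(String value) {
        return Collections.singleton(value);
    }

    // term query: the input is looked up verbatim.
    public static boolean termQuery(Set<String> indexed, String input) {
        return indexed.contains(input);
    }

    // match query: the input is analyzed first; any matching token is a hit.
    public static boolean matchQuery(Set<String> indexed, String input) {
        for (String token : analyze(input)) {
            if (indexed.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String value = "Wang Wu";
        System.out.println(termQuery(textTerms(value), "Wang Wu"));    // false
        System.out.println(matchQuery(textTerms(value), "Wang Wu"));   // true
        System.out.println(termQuery(keywordTerms(value), "Wang Wu")); // true
    }
}
```

The first lookup fails because the text field stored the tokens "wang" and "wu", never the original string.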

Extending the mapping further:

# Add a boolean field and a date field
POST /nandao_scenic/_mapping
{
  "properties": {
    "full": {"type": "boolean"},
    "creata_time": {"type": "date"}
  }
}

Querying the mapping now shows all the common basic data types:

{
  "nandao_scenic": {
    "mappings": {
      "properties": {
        "city": {"type": "keyword"},
        "creata_time": {"type": "date"},
        "full": {"type": "boolean"},
        "price": {"type": "double"},
        "tag": {"type": "keyword"},
        "title": {"type": "text"}
      }
    }
  }
}

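4. Complex data types

4.1. Arrays

The tag field above is declared as keyword, yet it can hold an array of values. Searching still uses a term query with exact matching: a document matches when any element of the array equals the queried value.

POST /nandao_scenic/_doc/002
{"title": "五台山", "city": "山西", "price": "88.8", "tag": ["有车位", "人多"]}

Query:

GET /nandao_scenic/_search
{
  "query": {"term": {"tag": {"value": "人多"}}}
}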

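4.2. Objects

Like arrays, objects need no prior definition. They are converted to the object type on write, and the index mapping is updated accordingly. In the example below, comment is the object; good is the number of positive reviews and bad the number of negative reviews:

POST /nandao_scenic/_doc/001
{
  "title": "五山",
  "city": "江西",
  "price": "88.8",
  "comment": {
    "properties": {"good": 88, "bad": 7}
  }
}

Query example: scenic spots with at least 10 positive reviews

GET /nandao_scenic/_search
{
  "query": {"range": {"comment.properties.good": {"gte": 10}}}
}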

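4.3. Geo types

Location data is increasingly common in Internet applications, and ES supports geo types accordingly. Extend the index mapping:

POST /nandao_scenic/_mapping
{
  "properties": {"location": {"type": "geo_point"}}
}

Add a document that contains geo data:

POST /nandao_scenic/_doc/003
{
  "title": "雍和宫",
  "city": "北京",
  "price": "88.8",
  "location": {"lat": 80.01234, "lon": 118.98756}
}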

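5. Dynamic mapping

5.1. Multi-fields

The same field sometimes needs different data types because different business cases query it differently. A user name, for example, may need both exact matching and partial search such as searching by surname. In that case the mapping indexes the value twice, as text and as keyword, with keyword declared as a sub-field:

PUT /nandao_scenic_order
{
  "mappings": {
    "properties": {
      "order_id": {"type": "keyword"},
      "user_id": {"type": "keyword"},
      "user_name": {
        "type": "text",
        "fields": {"user_name_keyword": {"type": "keyword"}}
      },
      "scenic_id": {"type": "keyword"}
    }
  }
}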

Write documents with the multi-field:

POST /nandao_scenic_order/_doc/001
{"order_id": "009", "user_id": "u001", "user_name": "nandao", "scenic_id": "s001"}

POST /nandao_scenic_order/_doc/002
{"order_id": "008", "user_id": "u002", "user_name": "lisi", "scenic_id": "s002"}

POST /nandao_scenic_order/_doc/003
{"order_id": "0011", "user_id": "u003", "user_name": "wang wu", "scenic_id": "s003"}

POST /nandao_scenic_order/_doc/004
{"order_id": "0010", "user_id": "u004", "user_name": "su wu", "scenic_id": "s004"}

Query and sort: the match runs against the text field, the sort against the keyword sub-field.

GET /nandao_scenic_order/_search
{
  "query": {"match": {"user_name": "wu"}},
  "sort": {"user_name.user_name_keyword": "desc"}
}

Result:

{
  "took": 369,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 2, "relation": "eq"},
    "max_score": null,
    "hits": [
      {"_index": "nandao_scenic_order", "_type": "_doc", "_id": "003", "_score": null,
       "_source": {"order_id": "0011", "user_id": "u003", "user_name": "wang wu", "scenic_id": "s003"},
       "sort": ["wang wu"]},
      {"_index": "nandao_scenic_order", "_type": "_doc", "_id": "004", "_score": null,
       "_source": {"order_id": "0010", "user_id": "u004", "user_name": "su wu", "scenic_id": "s004"},
       "sort": ["su wu"]}
    ]
  }
}

Sorting works in both descending and ascending order.

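III. Document operations

1. Writing a single document. Writes with an explicit id were shown above; this example omits the document id, that is, nothing follows _doc in the path:

POST /nandao_scenic_order/_doc
{"order_id": "0010", "user_id": "u004", "user_name": "su wu", "scenic_id": "s004"}

The same operation with the Java REST client (here with an explicit id):

import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RequestOptions;

// Assumes `client` is an initialized RestHighLevelClient field of the enclosing class.
public void singleIndexDoc(Map<String, Object> dataMap, String indexName, String indexId) {
    // Build the IndexRequest with the target index, the document _id, and the source
    IndexRequest indexRequest = new IndexRequest(indexName).id(indexId).source(dataMap);
    try {
        IndexResponse indexResponse = client.index(indexRequest, RequestOptions.DEFAULT); // execute the write
        String index = indexResponse.getIndex();   // index name from the response
        String id = indexResponse.getId();         // document id from the response
        long version = indexResponse.getVersion(); // document version from the response
        System.out.println("index=" + index + ",id=" + id + ",version=" + version);
    } catch (Exception e) {
        e.printStackTrace();
    }
}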

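2. Bulk writes

DSL:

POST /_bulk
{"index":{"_index":"nandao_scenic","_id":"7"}}
{"title":"少林寺","city":"河南","price":"88.8","tag":["有位","人少"]}
{"index":{"_index":"nandao_scenic","_id":"8"}}
{"title":"动物园","city":"河南","price":"80.8","tag":["有位","人少"]}

Note: the bulk body is newline-delimited JSON. Each action line and each document must sit on its own single line; pretty-printed JSON causes an error.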
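The one-line-per-entry layout can be produced programmatically. The sketch below builds such a body as a string and does not send it anywhere. It assumes flat documents whose values are all strings, and its hand-written `quote` helper escapes only backslashes, double quotes, and newlines; a real application would use a JSON library. `BulkBody` and its methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class BulkBody {
    // Minimal escaping for this sketch only: backslash, double quote, newline.
    public static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n") + "\"";
    }

    // Serializes a flat map of string fields as JSON on a single line.
    public static String toJsonLine(Map<String, String> doc) {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : doc.entrySet()) {
            joiner.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return joiner.toString();
    }

    // Emits one action line and one source line per document, each ending in '\n'.
    public static String bulkIndexBody(String index, Map<String, Map<String, String>> docsById) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Map<String, String>> e : docsById.entrySet()) {
            sb.append("{\"index\":{\"_index\":").append(quote(index))
              .append(",\"_id\":").append(quote(e.getKey())).append("}}\n");
            sb.append(toJsonLine(e.getValue())).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> temple = new LinkedHashMap<>();
        temple.put("title", "少林寺");
        temple.put("city", "河南");
        Map<String, String> zoo = new LinkedHashMap<>();
        zoo.put("title", "动物园");
        zoo.put("city", "河南");

        Map<String, Map<String, String>> docs = new LinkedHashMap<>();
        docs.put("7", temple);
        docs.put("8", zoo);
        System.out.print(bulkIndexBody("nandao_scenic", docs));
    }
}
```

Escaping newlines inside values matters here: a raw line break in a field would split the document across two lines and break the request.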

Java:

import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.common.unit.TimeValue; // package may differ by client version

// Bulk-index documents
public void bulkIndexDoc(String indexName, String docIdKey, List<Map<String, Object>> recordMapList) {
    BulkRequest bulkRequest = new BulkRequest(indexName); // bulk request with a default index
    for (Map<String, Object> dataMap : recordMapList) {
        String docId = dataMap.get(docIdKey).toString(); // use the record's key field as the document _id
        IndexRequest indexRequest = new IndexRequest().id(docId).source(dataMap);
        bulkRequest.add(indexRequest);
    }
    bulkRequest.timeout(TimeValue.timeValueSeconds(5)); // set the timeout
    try {
        BulkResponse bulkResponse = client.bulk(bulkRequest, RequestOptions.DEFAULT); // execute the bulk write
        if (bulkResponse.hasFailures()) { // check whether any item failed
            System.out.println("bulk fail,message:" + bulkResponse.buildFailureMessage());
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

3. Updating a single document

DSL:

# Update a single document
POST /nandao_scenic/_update/001
{
  "doc": {"title": "海洋馆", "city": "河南", "price": "100.8", "tag": ["有车位", "好玩"]}
}

# Update the document if it exists, otherwise insert it
POST /nandao_scenic/_update/001
{
  "doc": {"title": "海洋馆", "city": "河南", "price": "100.8", "tag": ["有车位", "好玩"]},
  "upsert": {"title": "海洋馆", "city": "河南", "price": "100.8", "tag": ["有车位", "好玩"]}
}

Java client:

import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;

// Update a single document; docId is the document _id
public void singleUpdate(String indexName, String docId, Map<String, Object> recordMap) {
    UpdateRequest updateRequest = new UpdateRequest(indexName, docId);
    updateRequest.doc(recordMap);
    try {
        UpdateResponse updateResponse = client.update(updateRequest, RequestOptions.DEFAULT);
        String index = updateResponse.getIndex();   // index name from the UpdateResponse
        String id = updateResponse.getId();         // document id from the UpdateResponse
        long version = updateResponse.getVersion(); // document version from the UpdateResponse
        System.out.println("index=" + index + ",id=" + id + ",version=" + version);
    } catch (IOException e) {
        e.printStackTrace();
    }
}

// Upsert a single document; docId is the document _id
public void singleUpsert(String index, String docId, Map<String, Object> recordMap,
                         Map<String, Object> upRecordMap) {
    UpdateRequest updateRequest = new UpdateRequest(index, docId);
    updateRequest.doc(recordMap);      // fields to apply when the document exists
    updateRequest.upsert(upRecordMap); // document to insert when it does not
    try {
        client.update(updateRequest, RequestOptions.DEFAULT); // execute the upsert
    } catch (IOException e) {
        e.printStackTrace();
    }
}

4. Bulk updates

DSL:

POST /_bulk
{"update":{"_index":"nandao_scenic","_id":"7"}}
{"doc":{"title":"少林寺1","city":"河南","price":"88.8","tag":["有位","人少"]}}
{"update":{"_index":"nandao_scenic","_id":"8"}}
{"doc":{"title":"动物园1","city":"河南","price":"80.8","tag":["有位","人少"]}}

Java:

import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.common.xcontent.XContentType; // package may differ by client version

public void bulkUpsert(String indexName, String docIdKey, List<Map<String, Object>> recordMapList) {
    BulkRequest bulkRequest = new BulkRequest();
    for (Map<String, Object> dataMap : recordMapList) { // iterate over the records
        String docId = dataMap.get(docIdKey).toString(); // use the record's key field as the document _id
        UpdateRequest updateRequest = new UpdateRequest(indexName, docId)
                .doc(dataMap, XContentType.JSON)
                .upsert(dataMap, XContentType.JSON); // update if present, insert otherwise
        bulkRequest.add(updateRequest);
    }
    try {
        BulkResponse bulkResponse = client.bulk(bulkRequest, RequestOptions.DEFAULT); // execute the bulk request
        if (bulkResponse.hasFailures()) { // check whether any item failed
            System.out.println("bulk fail,message:" + bulkResponse.buildFailureMessage());
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

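5. Updating documents by query

DSL:

POST /nandao_scenic/_update_by_query
{
  "query": {"term": {"city": {"value": "北京"}}},
  "script": {"source": "ctx._source['city']='北京市'", "lang": "painless"}
}

Java:

import java.util.Collections;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.index.reindex.UpdateByQueryRequest;
import org.elasticsearch.script.Script;
import org.elasticsearch.script.ScriptType;

public void updateCityByQuery(String index, String oldCity, String newCity) {
    UpdateByQueryRequest updateByQueryRequest = new UpdateByQueryRequest(index);
    updateByQueryRequest.setQuery(new TermQueryBuilder("city", oldCity)); // select documents by city
    // Pass the new value as a script parameter instead of concatenating it into the
    // script source, so a quote character in newCity cannot break the script.
    updateByQueryRequest.setScript(new Script(ScriptType.INLINE, "painless",
            "ctx._source['city'] = params.newCity",
            Collections.singletonMap("newCity", newCity)));
    try {
        client.updateByQuery(updateByQueryRequest, RequestOptions.DEFAULT); // execute the update
    } catch (IOException e) {
        e.printStackTrace();
    }
}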

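6. Deleting a single document

DELETE /nandao_scenic/_doc/001

Java:

import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.client.RequestOptions;

public void singleDelete(String index, String docId) {
    DeleteRequest deleteRequest = new DeleteRequest(index, docId); // build the delete request
    try {
        client.delete(deleteRequest, RequestOptions.DEFAULT); // execute the delete
    } catch (IOException e) {
        e.printStackTrace();
    }
}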

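7. Bulk deletes

POST /_bulk
{"delete":{"_index":"nandao_scenic","_id":"7"}}
{"delete":{"_index":"nandao_scenic","_id":"8"}}

Java client:

import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.client.RequestOptions;

// docIdKey is unused here; it is kept so the signature matches the original.
public void bulkDelete(String index, String docIdKey, List<String> docIdList) {
    BulkRequest bulkRequest = new BulkRequest();
    for (String docId : docIdList) { // iterate over the document _id list
        DeleteRequest deleteRequest = new DeleteRequest(index, docId); // build the delete request
        bulkRequest.add(deleteRequest); // add it to the bulk request
    }
    try {
        BulkResponse bulkResponse = client.bulk(bulkRequest, RequestOptions.DEFAULT); // execute the bulk delete
        if (bulkResponse.hasFailures()) { // check whether any item failed
            System.out.println("bulk fail,message:" + bulkResponse.buildFailureMessage());
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}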

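8. Deleting by query

POST /nandao_scenic/_delete_by_query
{
  "query": {"term": {"city": {"value": "北京"}}}
}

Java client:

import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.index.reindex.DeleteByQueryRequest;

public void deleteByQuery(String index, String city) {
    DeleteByQueryRequest deleteByQueryRequest = new DeleteByQueryRequest(index);
    deleteByQueryRequest.setQuery(new TermQueryBuilder("city", city)); // select documents by city
    try {
        client.deleteByQuery(deleteByQueryRequest, RequestOptions.DEFAULT); // execute the delete
    } catch (IOException e) {
        e.printStackTrace();
    }
}

That concludes the basic ES operations. The next post covers ES's rich search features; stay tuned!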


http://www.ppmy.cn/news/8637.html
