ElasticSearch学习之文档API相关操作

 更新时间:2023年01月31日 14:30:39   作者:程序员皮卡秋  
这篇文章主要为大家介绍了ElasticSearch学习之文档API相关操作,有需要的朋友可以借鉴参考下,希望能够有所帮助,祝大家多多进步,早日升职加薪

前言

本节主要给大家讲一下文档API相关操作。在学习之前,建议大家先回顾前几节内容,让自己有一个整体的认知,不要把概念混淆了。我们前几节都在讲索引,它是和文档挂钩的,文档我们可以理解为数据,数据有增删改查,本节就主要跟大家讲一下文档的增删改查操作。

本文偏实战一些,好了, 废话不多说直接开整吧~

创建文档

这里沿用之前的例子,使用class_1的索引,我们先看下它的索引结构:

GET /class_1

返回:

{
  "class_1" : {
    "aliases" : {
      "class" : { }
    },
    "mappings" : {
      "properties" : {
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "num" : {
          "type" : "long"
        }
      }
    },
    "settings" : {
      "index" : {
        "refresh_interval" : "3s",
        "number_of_shards" : "3",
        "provided_name" : "class_1",
        "creation_date" : "1670812583980",
        "number_of_replicas" : "1",
        "uuid" : "CTD3dM-fQm-KFEVl4nAgRQ",
        "version" : {
          "created" : "7060299"
        }
      }
    }
  }
}

通过结构可以看到,它主要有两个字段namenum,那么我们怎么往里边添加数据呢?

创建文档分为以下几种情况:

  • 创建单个数据指定ID:使用_doc路由+PUT请求+id参数
  • 创建单个数据不指定ID:使用_doc路由+POST请求
  • 创建单个数据指定ID并进行ID唯一性控制:使用_doc路由+PUT请求+id参数+op_type=create参数
  • 创建批量数据指定ID:使用_bulk路由+PUT请求/POST请求+create关键字+_id属性
  • 创建批量数据不指定ID:使用_bulk路由+PUT请求/POST请求+create关键字

下面,带大家一个一个看

单个数据

指定ID

PUT /class_1/_doc/1
{
  "name":"a",
  "num": 5
}

创建成功返回:

{
  "_index" : "class_1",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 3
}

不指定ID

PUT /class_1/_doc/
{
  "name":"b",
  "num": 6
}
{
  "error" : "Incorrect HTTP method for uri [/class_1/_doc/?pretty=true] and method [PUT], allowed: [POST]",
  "status" : 405
}

创建失败了,告诉我们这里要使用POST

POST /class_1/_doc/
{
  "name":"b",
  "num": 6
}
{
  "_index" : "class_1",
  "_type" : "_doc",
  "_id" : "h2Fg-4UBECmbBdQA6VLg",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 3
}

可以看到不指定id情况下,创建的文档_id随机生成了

ID唯一性控制

PUT /class_1/_doc/1?op_type=create
{
  "name":"c",
  "num": 7
}
{
  "error" : {
    "root_cause" : [
      {
        "type" : "version_conflict_engine_exception",
        "reason" : "[1]: version conflict, document already exists (current version [1])",
        "index_uuid" : "CTD3dM-fQm-KFEVl4nAgRQ",
        "shard" : "2",
        "index" : "class_1"
      }
    ],
    "type" : "version_conflict_engine_exception",
    "reason" : "[1]: version conflict, document already exists (current version [1])",
    "index_uuid" : "CTD3dM-fQm-KFEVl4nAgRQ",
    "shard" : "2",
    "index" : "class_1"
  },
  "status" : 409
}

可以看到,创建失败了,返回document already exists

批量数据

指定ID

PUT class_1/_bulk
{ "create":{ "_id": 2 } }
{"name":"d","num": 8}
{ "create":{ "_id": 3 } }
{ "name":"e","num": 9}
{ "create":{ "_id": 4 } }
{"name":"f","num": 10}

tip: 这里要注意,不能有空行,json对象{}需要在同一行

{
  "took" : 25,
  "errors" : false,
  "items" : [
    {
      "create" : {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "2",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 2,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 4,
        "status" : 201
      }
    },
    {
      "create" : {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "3",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 2,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 4,
        "status" : 201
      }
    },
    {
      "create" : {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "4",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 2,
          "failed" : 0
        },
        "_seq_no" : 2,
        "_primary_term" : 4,
        "status" : 201
      }
    }
  ]
}

不指定ID

很简单,去掉id属性就好了~

PUT class_1/_bulk
{ "create":{  } }
{"name":"g","num": 8}
{ "create":{  } }
{ "name":"h","num": 9}
{ "create":{  } }
{"name":"i","num": 10}
{
  "took" : 30,
  "errors" : false,
  "items" : [
    {
      "create" : {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "iGFt-4UBECmbBdQAnVJe",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 2,
          "failed" : 0
        },
        "_seq_no" : 3,
        "_primary_term" : 4,
        "status" : 201
      }
    },
    {
      "create" : {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "iWFt-4UBECmbBdQAnVJg",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 2,
          "failed" : 0
        },
        "_seq_no" : 4,
        "_primary_term" : 4,
        "status" : 201
      }
    },
    {
      "create" : {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "imFt-4UBECmbBdQAnVJg",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 2,
          "failed" : 0
        },
        "_seq_no" : 5,
        "_primary_term" : 4,
        "status" : 201
      }
    }
  ]
}

修改文档

文档修改分以下几种情况:

  • 按照ID全量更新单个数据:使用_doc路由+PUT请求+id参数
  • 按照ID全量更新单个数据并进行乐观锁控制:使用_doc路由+PUT请求+if_seq_no&if_primary_term参数+id参数
  • 按照ID部分更新单个数据(包含属性添加):使用_update路由+POST请求+id参数
  • 按照ID全量更新批量数据:使用_bulk路由+PUT请求/POST请求+index关键字+_id属性
  • 按照ID部分更新批量数据(包含属性添加):使用_bulk路由+PUT请求/POST请求+update关键字+_id属性
  • 按照条件修改数据:使用_update_by_query路由+POST请求+ctx._source[字段名称]=字段值
  • 按照条件给数据新增属性:使用_update_by_query路由+POST请求+ctx._source[字段名称]=字段值
  • 按照条件给数据移除属性:使用_update_by_query路由+POST请求+ctx._source.remove(字段名称)

同样的,带大家一个个来看~

按照ID单个

全量更新

PUT /class_1/_doc/1
{
  "name":"k",
  "num": 5
}
{
  "_index" : "class_1",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 3
}

再次修改:

PUT /class_1/_doc/1
{
  "name":"k",
  "num": 6
}
{
  "_index" : "class_1",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 3,
  "_primary_term" : 3
}

大家观察一下这个_version字段,发现它的版本号是递增的,也就是说会随着我们的修改而变化

基于乐观锁全量更新

跟上条件if_seq_no,if_primary_term

PUT /class_1/_doc/1?if_seq_no=3&if_primary_term=3
{
  "name":"l",
  "num": 6
}
{
  "_index" : "class_1",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 4,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 4,
  "_primary_term" : 3
}

再操作一下:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "version_conflict_engine_exception",
        "reason" : "[1]: version conflict, required seqNo [3], primary term [3]. current document has seqNo [4] and primary term [3]",
        "index_uuid" : "CTD3dM-fQm-KFEVl4nAgRQ",
        "shard" : "2",
        "index" : "class_1"
      }
    ],
    "type" : "version_conflict_engine_exception",
    "reason" : "[1]: version conflict, required seqNo [3], primary term [3]. current document has seqNo [4] and primary term [3]",
    "index_uuid" : "CTD3dM-fQm-KFEVl4nAgRQ",
    "shard" : "2",
    "index" : "class_1"
  },
  "status" : 409
}

发现操作失败了,因为条件不符合 required seqNo [3], primary term [3],上一步操作完之后seqNo和primary term [4]

部分更新

PUT /class_1/_update/1
{
  "doc":{
      "name":"m",
      "num": 1
  }
}

按照ID批量

全量更新

PUT class_1/_bulk
{ "create":{ "_id": 2 } }
{"name":"d","num": 8}
{ "create":{ "_id": 3 } }
{ "name":"e","num": 9}
{ "create":{ "_id": 4 } }
{"name":"f","num": 10}

这个应该好理解

部分更新

需要修改为update并添加属性

PUT class_1/_bulk
{ "update":{ "_id": 2 } }
{ "doc":{"name":"d","num": 8}}
{ "update":{ "_id": 3 } }
{ "doc":{ "name":"e","num": 9}}
{ "update":{ "_id": 4 } }
{ "doc":{"name":"f","num": 10}}

返回:

{
  "took" : 32,
  "errors" : false,
  "items" : [
    {
      "update" : {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "2",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 2,
          "failed" : 0
        },
        "_seq_no" : 6,
        "_primary_term" : 4,
        "status" : 200
      }
    },
    {
      "update" : {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "3",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 2,
          "failed" : 0
        },
        "_seq_no" : 7,
        "_primary_term" : 4,
        "status" : 200
      }
    },
    {
      "update" : {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "4",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 2,
          "failed" : 0
        },
        "_seq_no" : 8,
        "_primary_term" : 4,
        "status" : 200
      }
    }
  ]
}

按照条件修改

修改字段

查找name=d的数据修改为num=10,name=e

POST class_1/_update_by_query
{
"query": {
  "match": {
    "name": "d"
  }
},
"script": {
  "source": "ctx._source['num']='10';ctx._source['name']='e'",
  "lang": "painless"
}
}

返回:

{
  "took" : 108,
  "timed_out" : false,
  "total" : 1,
  "updated" : 1,
  "deleted" : 0,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

增加字段

POST class_1/_update_by_query
{
"query": {
  "match": {
    "name": "e"
  }
},
"script": {
  "source": "ctx._source['desc']=['hhhh']",
  "lang": "painless"
}
}
{
  "took" : 344,
  "timed_out" : false,
  "total" : 2,
  "updated" : 2,
  "deleted" : 0,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

接着我们查下class_1的索引结构:

{
  "class_1" : {
    "aliases" : {
      "class" : { }
    },
    "mappings" : {
      "properties" : {
        "age" : {
          "type" : "long"
        },
        "desc" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "num" : {
          "type" : "long"
        }
      }
    },
    "settings" : {
      "index" : {
        "refresh_interval" : "3s",
        "number_of_shards" : "3",
        "provided_name" : "class_1",
        "creation_date" : "1670812583980",
        "number_of_replicas" : "1",
        "uuid" : "CTD3dM-fQm-KFEVl4nAgRQ",
        "version" : {
          "created" : "7060299"
        }
      }
    }
  }
}

可以看到多了一个字段:

{
"desc" : {
    "type" : "text",
    "fields" : {
    "keyword" : {
        "type" : "keyword",
        "ignore_above" : 256
    }
    }
}
}

移除字段

POST class_1/_update_by_query
{
"query": {
  "match": {
    "name": "e"
  }
},
"script": {
  "source": "ctx._source.remove('desc')",
  "lang": "painless"
}
}

大家可以试着运行一下,然后再查下索引

删除文档

按照ID & 单个删除

DELETE /class_1/_doc/2
{
  "_index" : "class_1",
  "_type" : "_doc",
  "_id" : "2",
  "_version" : 5,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 12,
  "_primary_term" : 4
}

按照ID & 批量删除

PUT class_1/_bulk
{ "delete":{"_id":"2" } }
{ "delete":{"_id":"3" } }

按照条件删除

POST class_1/_delete_by_query
{
"query":{
  "match_all":{
    "name": "e"
  }
}
}

结束语

本节主要讲了ES中的文档API操作,还遗留一个查询操作, 该部分内容较多,放到后边给大家讲,更多关于ElasticSearch文档API操作的资料请关注脚本之家其它相关文章!

相关文章

  • Java行为型模式中命令模式分析

    Java行为型模式中命令模式分析

    在软件设计中,我们经常需要向某些对象发送请求,但是并不知道请求的接收者是谁,也不知道被请求的操作是哪个,我们只需在程序运行时指定具体的请求接收者即可,此时可以使用命令模式来进行设计
    2023-02-02
  • Spring配置使用之Bean生命周期详解

    Spring配置使用之Bean生命周期详解

    这篇文章主要介绍了Spring配置使用之Bean生命周期详解,具有一定参考价值,需要的朋友可以了解下。
    2017-10-10
  • 利用Java读取二进制文件实例详解

    利用Java读取二进制文件实例详解

    这篇文章主要给大家介绍了利用Java读取二进制文件的相关资料,文中通过示例代码介绍的非常详细,对大家的学习或者使用java具有一定的参考学习价值,需要的朋友们下面跟着小编来一起学习学习吧。
    2017-08-08
  • 完美解决gson将Integer默认转换成Double的问题

    完美解决gson将Integer默认转换成Double的问题

    下面小编就为大家带来一篇完美解决gson将Integer默认转换成Double的问题。小编觉得挺不错的,现在就分享给大家,也给大家做个参考。一起跟随小编过来看看吧
    2017-03-03
  • 详解lombok @Getter @Setter 使用注意事项

    详解lombok @Getter @Setter 使用注意事项

    这篇文章主要介绍了详解lombok @Getter @Setter 使用注意事项,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习价值,需要的朋友们下面随着小编来一起学习学习吧
    2020-11-11
  • Java基础之java泛型通配符详解

    Java基础之java泛型通配符详解

    Java 泛型(generics)是 JDK 5 中引入的一个新特性, 泛型提供了编译时类型安全检测机制,该机制允许开发者在编译时检测到非法的类型,今天通过本文给大家介绍java泛型通配符的相关知识,感兴趣的朋友一起看看吧
    2021-07-07
  • @NotEmpty、@NotBlank、@NotNull的区别

    @NotEmpty、@NotBlank、@NotNull的区别

    这篇文章主要介绍了@NotEmpty、@NotBlank、@NotNull的区别,需要的朋友可以参考下
    2016-09-09
  • Java中Session的详解

    Java中Session的详解

    这篇文章主要介绍了了解java中的session的相关问题,什么是session,session怎么用等,具有一定参考价值,需要的朋友可以了解下。
    2021-10-10
  • 详解spring cloud hystrix请求缓存(request cache)

    详解spring cloud hystrix请求缓存(request cache)

    这篇文章主要介绍了详解spring cloud hystrix请求缓存(request cache),小编觉得挺不错的,现在分享给大家,也给大家做个参考。一起跟随小编过来看看吧
    2018-05-05
  • 聊聊Java并发中的Synchronized

    聊聊Java并发中的Synchronized

    这篇文章主要介绍了聊聊Java并发中的Synchronized,介绍了同步的基础,原理,锁的相关内容,具有一定参考价值,需要的朋友可以了解下。
    2017-11-11

最新评论