2020-05-18

Elasticsearch - async search

Asynchronous search

Asynchronous search makes long-running queries feasible and reliable. It lets users run long-running queries in the background, track their progress, and retrieve partial results as they become available. This makes it easier to search vast amounts of data without running into pesky timeouts.

Submit async search API

Executes a search request asynchronously. It accepts the same parameters and request body as the search API.

POST /sales*/_async_search?size=0
{
    "sort" : [
      { "date" : {"order" : "asc"} }
    ],
    "aggs" : {
        "sale_date" : {
             "date_histogram" : {
                 "field" : "date",
                 "calendar_interval": "1d"
             }
         }
    }
}
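The same request can also be issued from a script. The following is a minimal Python sketch using only the standard library; the helper name build_submit_request and the http://localhost:9200 address are assumptions made for illustration, not part of the Elasticsearch API. It only builds the request, so it runs without a cluster.

```python
import json
import urllib.parse
import urllib.request


def build_submit_request(base_url, index_pattern, body, **params):
    """Build (but do not send) a POST request for the submit async search API.

    Hypothetical helper: base_url and the function name are illustrative.
    """
    # Keep "*" and "," readable so index patterns such as "sales*" stay intact.
    path = "/" + urllib.parse.quote(index_pattern, safe="*,") + "/_async_search"
    query = urllib.parse.urlencode(params)
    url = base_url.rstrip("/") + path + ("?" + query if query else "")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same body as the console request above.
body = {
    "sort": [{"date": {"order": "asc"}}],
    "aggs": {
        "sale_date": {
            "date_histogram": {"field": "date", "calendar_interval": "1d"}
        }
    },
}
request = build_submit_request("http://localhost:9200", "sales*", body, size=0)
# To actually send it against a running cluster:
#   with urllib.request.urlopen(request) as reply:
#       result = json.load(reply)
```

Building the request separately from sending it keeps the URL and body easy to inspect before anything reaches the cluster.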

The response contains an identifier of the search being executed. You can use this ID to later retrieve the search’s final results. The currently available search results are returned as part of the response object.

{
  "id" : "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
  "is_partial" : true,
  "is_running" : true,
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986,
  "response" : {
    "took" : 1122,
    "timed_out" : false,
    "num_reduce_phases" : 0,
    "_shards" : {
      "total" : 562,
      "successful" : 3,
      "skipped" : 0,
      "failed" : 0
    },
    "hits" : {
      "total" : {
        "value" : 157483,
        "relation" : "gte"
      },
      "max_score" : null,
      "hits" : [ ]
    }
  }
}

id: Identifier of the async search, which can be used to monitor its progress, retrieve its results, and/or delete it.

is_partial: When the query is no longer running, indicates whether the search failed or completed successfully on all shards. While the query is being executed, is_partial is always set to true.

is_running: Whether the search is still being executed or has completed.

_shards.total: How many shards the search will be executed on, overall.

_shards.successful: How many shards have successfully completed the search.

hits.total.value: How many documents currently match the query, counting only the shards that have already completed the search.
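To make these fields concrete, here is a small Python sketch that condenses a submit response into a progress summary. The function name summarize_progress and the shape of the summary are assumptions for illustration; the input is the sample response shown above, trimmed to the fields that are read.

```python
def summarize_progress(async_response):
    """Condense an async search response into a few progress figures.

    Hypothetical helper; it only reads fields shown in the sample response.
    """
    shards = async_response["response"]["_shards"]
    hits_total = async_response["response"]["hits"]["total"]
    total = shards["total"]
    percent = round(100.0 * shards["successful"] / total, 1) if total else 0.0
    return {
        "id": async_response.get("id"),
        "running": async_response["is_running"],
        "partial": async_response["is_partial"],
        "shards_done": shards["successful"],
        "shards_total": total,
        "percent_done": percent,
        "hits_so_far": hits_total["value"],
    }


# Sample response from above, reduced to the fields used here.
sample = {
    "id": "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
    "is_partial": True,
    "is_running": True,
    "response": {
        "_shards": {"total": 562, "successful": 3, "skipped": 0, "failed": 0},
        "hits": {"total": {"value": 157483, "relation": "gte"}},
    },
}
summary = summarize_progress(sample)
print(summary)
```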

Get async search API

The get async search API retrieves the results of a previously submitted async search request, given its ID. If the Elasticsearch security features are enabled, access to the results of a specific async search is restricted to the user that submitted it in the first place.

GET /_async_search/FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=


{
  "id" : "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
  "is_partial" : true,
  "is_running" : true,
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986,
  "response" : {
    "took" : 12144,
    "timed_out" : false,
    "num_reduce_phases" : 46,
    "_shards" : {
      "total" : 562,
      "successful" : 188,
      "skipped" : 0,
      "failed" : 0
    },
    "hits" : {
      "total" : {
        "value" : 456433,
        "relation" : "eq"
      },
      "max_score" : null,
      "hits" : [ ]
    },
    "aggregations" : {
      "sale_date" :  {
        "buckets" : []
      }
    }
  }
}

is_partial: When the query is no longer running, indicates whether the search failed or completed successfully on all shards. While the query is being executed, is_partial is always set to true.

is_running: Whether the search is still being executed or has completed.

expiration_time_in_millis: When the async search will expire.

num_reduce_phases: Indicates how many reductions of the results have been performed. If this number has increased compared to the last retrieved results, you can expect additional results to be included in the search response.

_shards.successful: Indicates how many shards have executed the query. Note that shard results must be reduced before they are included in the search response.

aggregations: Partial aggregation results, coming from the shards that have already completed the execution of the query.
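The interplay of is_running, is_partial, and num_reduce_phases can be sketched in Python as follows. The helper names search_state and has_new_results, and the state labels, are assumptions for illustration; the two inputs are the sample responses shown above, reduced to the fields that matter here.

```python
def search_state(async_response):
    """Classify a response from is_running and is_partial (labels are illustrative)."""
    if async_response["is_running"]:
        return "running"
    # Once the search has stopped, is_partial tells failure from full success.
    return "failed_or_incomplete" if async_response["is_partial"] else "complete"


def has_new_results(previous, current):
    """True when more reductions have been performed since the previous response."""
    before = previous["response"]["num_reduce_phases"]
    after = current["response"]["num_reduce_phases"]
    return after > before


# The submit response and the later get response, trimmed to the relevant fields.
first = {"is_running": True, "is_partial": True,
         "response": {"num_reduce_phases": 0}}
second = {"is_running": True, "is_partial": True,
          "response": {"num_reduce_phases": 46}}

print(search_state(second), has_new_results(first, second))
```

A client that redraws partial results only when has_new_results is true avoids repainting identical data between polls.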

The wait_for_completion_timeout parameter can also be provided when calling the get async search API, in order to wait for the search to complete, up to the provided timeout. If the final results become available before the timeout expires, they are returned; otherwise the currently available results are returned once the timeout expires. By default no timeout is set, meaning that the currently available results are returned without any additional wait.
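A client would typically call the get async search API repeatedly with a short wait_for_completion_timeout until is_running turns false. The sketch below shows one way to structure such a loop in Python. Everything in it is illustrative: poll_until_done is a hypothetical helper, "2s" is an example value, and fake_fetch stands in for the real HTTP call so the example runs without a cluster.

```python
import time


def poll_until_done(fetch, search_id, wait="2s", max_polls=10, pause_seconds=0.0):
    """Call fetch(search_id, wait) until is_running is false or max_polls is reached.

    fetch is expected to perform the GET with wait_for_completion_timeout=wait
    and return the decoded JSON response.
    """
    response = None
    for _ in range(max_polls):
        response = fetch(search_id, wait)
        if not response["is_running"]:
            break
        time.sleep(pause_seconds)
    return response


# Stand-in for a real HTTP call: two partial responses, then the final one.
_canned = iter([
    {"is_running": True, "is_partial": True},
    {"is_running": True, "is_partial": True},
    {"is_running": False, "is_partial": False},
])
calls = []


def fake_fetch(search_id, wait):
    calls.append((search_id, wait))
    return next(_canned)


final = poll_until_done(fake_fetch, "example-id", wait="2s")
print(final, len(calls))
```

Passing the fetch function in as an argument keeps the loop independent of any particular HTTP client.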

The keep_alive parameter specifies how long the async search should be available in the cluster. When it is not specified, the keep_alive set with the corresponding submit async search request is used. When it is specified, it overrides that value and extends the validity of the request. When this period expires, the search, if still running, is cancelled. If the search has completed, its saved results are deleted.
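As an illustration, the Python sketch below builds a get URL that carries keep_alive and derives the retention window from the two timestamps in the sample response. The helper names, the http://localhost:9200 address, and the "5d" value are assumptions chosen for the example.

```python
import urllib.parse

SEARCH_ID = "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc="


def build_get_url(base_url, search_id, keep_alive=None):
    """Build the get async search URL, optionally overriding keep_alive."""
    url = (base_url.rstrip("/") + "/_async_search/"
           + urllib.parse.quote(search_id, safe="="))
    if keep_alive is not None:
        url += "?" + urllib.parse.urlencode({"keep_alive": keep_alive})
    return url


def lifetime_millis(async_response):
    """Milliseconds between the start of the search and its expiration."""
    return (async_response["expiration_time_in_millis"]
            - async_response["start_time_in_millis"])


url = build_get_url("http://localhost:9200", SEARCH_ID, keep_alive="5d")
window = lifetime_millis({
    "start_time_in_millis": 1583945890986,
    "expiration_time_in_millis": 1584377890986,
})
print(url)
print(window / (24 * 60 * 60 * 1000), "days")
```

For the sample response the two timestamps are 432000000 milliseconds apart, which is five days.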

Delete async search
You can use the delete async search API to manually delete an async search by ID. If the search is still running, the search request will be cancelled. Otherwise, the saved search results are deleted.

DELETE /_async_search/FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=
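From a script, the delete call differs from the get call only in its HTTP method. A minimal Python sketch follows, again assuming a hypothetical helper name and a local cluster address; it builds the request without sending it.

```python
import urllib.request

SEARCH_ID = "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc="


def build_delete_request(base_url, search_id):
    """Build (but do not send) a DELETE request for an async search ID."""
    url = base_url.rstrip("/") + "/_async_search/" + search_id
    return urllib.request.Request(url, method="DELETE")


delete_request = build_delete_request("http://localhost:9200", SEARCH_ID)
# To actually send it: urllib.request.urlopen(delete_request)
print(delete_request.get_method(), delete_request.full_url)
```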
