2019-11-26

Elasticsearch - Documents

Elasticsearch is document-oriented, meaning the smallest unit
of data you index or search for is a document. A document has a few important
properties in Elasticsearch:
■ It’s self-contained. A document contains both the fields (name) and their values
(Elasticsearch Denver).
■ It can be hierarchical. Think of this as documents within documents. The value of a
field can be simple: the value of the location field, for example, can be a string. It can
also contain other fields and values. For example, the location field might contain
both a city and a street address within it.
■ It has a flexible structure. Your documents don’t depend on a predefined schema.
For example, not all events need description values, so that field can be omitted
altogether. Other events might require new fields, such as the latitude and longitude
of the location.
A document is normally a JSON representation of your data. As we discussed in
chapter 1, JSON over HTTP is the most widely used way to communicate with Elasticsearch,
and it’s the method we use throughout the book. For example, an event in your
get-together site can be represented in the following document:
{
 "name": "Elasticsearch Denver",
 "organizer": "Lee",
 "location": "Denver, Colorado, USA"
}
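As a rough sketch of what "JSON over HTTP" can look like from a client, the Python snippet below serializes the event and prepares (but does not send) an indexing request. The host localhost:9200, the index name get-together, and the document ID 1 are assumptions made for illustration. The `_doc` path segment applies to recent Elasticsearch versions; older releases put a type name in that position.

```python
import json
import urllib.request

# The event document from the text, as a plain Python dict.
event = {
    "name": "Elasticsearch Denver",
    "organizer": "Lee",
    "location": "Denver, Colorado, USA",
}

# Serialize to JSON, the format sent in the request body.
body = json.dumps(event).encode("utf-8")

# Assumed values: a local node, a hypothetical "get-together" index, ID 1.
url = "http://localhost:9200/get-together/_doc/1"

request = urllib.request.Request(
    url,
    data=body,
    method="PUT",
    headers={"Content-Type": "application/json"},
)

# Sending is left commented out so the sketch runs without a cluster:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the exact URL and body before anything reaches the cluster.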
You can also imagine a table with three columns: name, organizer, and location. The
document would be a row containing the values. But there are some differences that
make this comparison inexact. One difference is that, unlike rows, documents can be
hierarchical. For example, the location can contain a name and a geolocation:
{
 "name": "Elasticsearch Denver",
 "organizer": "Lee",
 "location": {
  "name": "Denver, Colorado, USA",
  "geolocation": "39.7392, -104.9847"
 }
}
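To make the hierarchy concrete, here is a minimal Python sketch that parses the nested document and reads the inner fields. The helper `get_field` is hypothetical, written for this example only; it follows a dotted path such as location.name, which is the notation commonly used to refer to inner fields.

```python
import json

raw = """
{
  "name": "Elasticsearch Denver",
  "organizer": "Lee",
  "location": {
    "name": "Denver, Colorado, USA",
    "geolocation": "39.7392, -104.9847"
  }
}
"""

event = json.loads(raw)


def get_field(document, path):
    """Follow a dotted path such as "location.name" through nested dicts."""
    value = document
    for key in path.split("."):
        value = value[key]
    return value


# The inner object is reached through its parent field.
city = get_field(event, "location.name")

# Split the "lat, lon" string into two floats.
lat, lon = (float(part) for part in get_field(event, "location.geolocation").split(","))
```

A table row has no natural place for `location.name`; a relational design would need extra columns or a second table for it.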


A single document can also contain arrays of values; for example:
{
 "name": "Elasticsearch Denver",
 "organizer": "Lee",
 "members": ["Lee", "Mike"]
}
Documents in Elasticsearch are said to be schema-free, in the sense that not all your
documents need to have the same fields, so they’re not bound to the same schema. For
example, you could omit the location altogether if the organizer needs to be
called before every gathering:
{
 "name": "Elasticsearch Denver",
 "organizer": "Lee",
 "members": ["Lee", "Mike"]
}
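The practical consequence of schema-free documents is that code reading them cannot assume every field is present. The following Python sketch, using only the sample events from this section, collects the union of field names and handles a missing location without failing.

```python
# Two events from the text: one has a location, the other has members instead.
events = [
    {
        "name": "Elasticsearch Denver",
        "organizer": "Lee",
        "location": "Denver, Colorado, USA",
    },
    {
        "name": "Elasticsearch Denver",
        "organizer": "Lee",
        "members": ["Lee", "Mike"],
    },
]

# Union of every field name that appears in at least one document.
all_fields = sorted({field for doc in events for field in doc})

# Documents that omit the location field altogether.
without_location = [doc for doc in events if "location" not in doc]

# dict.get returns None for a missing field instead of raising KeyError.
locations = [doc.get("location") for doc in events]
```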
Although you can add or omit fields at will, the type of each field matters: some are
strings, some are integers, and so on. Because of that, Elasticsearch keeps a mapping
of all your fields and their types and other settings. This mapping is specific to every
type of every index. That’s why types are sometimes called mapping types in
Elasticsearch terminology.
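The idea of a mapping can be illustrated with a toy Python sketch that records one type per field and rejects a document whose value disagrees with it. This is not how Elasticsearch implements mappings; the functions `type_name` and `build_mapping`, the type labels, and the `attendees` field are all hypothetical and exist only for this example. An array is recorded under the type of its elements, since an array field has no separate type of its own.

```python
def type_name(value):
    """Return an illustrative type label for a single JSON value."""
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value: {value!r}")


def build_mapping(documents):
    """Record the first type seen for each field; fail on a later conflict."""
    mapping = {}
    for doc in documents:
        for field, value in doc.items():
            # Treat a single value like a one-element array.
            items = value if isinstance(value, list) else [value]
            for item in items:
                found = type_name(item)
                expected = mapping.setdefault(field, found)
                if expected != found:
                    raise ValueError(
                        f"field {field!r} is {expected}, got {found}"
                    )
    return mapping


docs = [
    {"name": "Elasticsearch Denver", "organizer": "Lee"},
    # "attendees" is a made-up integer field for illustration.
    {"name": "Elasticsearch Denver", "members": ["Lee", "Mike"], "attendees": 25},
]
mapping = build_mapping(docs)
```

Fields can come and go between documents, but once `attendees` is recorded as an integer, a later document with `"attendees": "many"` raises an error in this sketch.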
