2020-04-25

ELK tutorial

What is ELK?
ELK is a combination of 3 open source products:


  • Elasticsearch
  • Logstash
  • Kibana

All three are developed and maintained by Elastic.

Elasticsearch is a NoSQL database based on the Lucene search engine. Logstash is a log pipeline tool that accepts data input, performs data conversion, and then outputs data. Kibana is an interface layer that works on top of Elasticsearch. In addition, the ELK stack also contains a series of log collector tools called Beats.

The most common use of ELK is as a logging system for Internet products. The ELK stack can also be used for other purposes, such as business intelligence and big data analysis.

Why use ELK?
The ELK stack is very popular, because it is powerful, open source and free. For smaller companies such as SaaS companies and startups, using ELK to build a log system is very cost-effective.

Netflix, Facebook, Microsoft, LinkedIn and Cisco also use ELK to monitor logs.

Why use a logging system?
The log system records all aspects of system operation and business processing, and plays an increasingly important role in troubleshooting, business analysis, data mining, and big data analysis.

ELK architecture
The various components in the ELK stack cooperate with each other and do not require much additional configuration. Of course, the architecture will be different for different use scenarios.

For small development environments, the architecture is usually as follows:


For a production environment with a large amount of data, other components may be added to the log architecture, for example a buffering layer (Kafka, RabbitMQ, or Redis) to improve resilience, and nginx to improve security:



ELK: Install Elasticsearch

To set up the ELK stack, install the following open source components:


  • Elasticsearch
  • Kibana
  • Beats
  • Logstash (optional)


Install Elasticsearch

Elasticsearch is a near real-time full-text search engine, which has multiple uses, such as a log system.

To download and install Elasticsearch, open a command line terminal and execute the commands for your platform (deb for Debian / Ubuntu, rpm for Red Hat / CentOS / Fedora, mac for OS X, linux for other Linux distributions, win for Windows):

deb:

curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.1.0-amd64.deb
sudo dpkg -i elasticsearch-7.1.0-amd64.deb
sudo /etc/init.d/elasticsearch start

rpm:

curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.1.0-x86_64.rpm
sudo rpm -i elasticsearch-7.1.0-x86_64.rpm
sudo service elasticsearch start

mac:

curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.1.0-darwin-x86_64.tar.gz
tar -xzvf elasticsearch-7.1.0-darwin-x86_64.tar.gz
cd elasticsearch-7.1.0
./bin/elasticsearch

linux:

curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.1.0-linux-x86_64.tar.gz
tar -xzvf elasticsearch-7.1.0-linux-x86_64.tar.gz
cd elasticsearch-7.1.0
./bin/elasticsearch

win:

Download the Elasticsearch 7.1.0 Windows zip file from the Elasticsearch download page.
Extract the contents of the zip file to a directory, for example: C:\Program Files.
Open a command line window as an administrator and switch to the unzipped directory, for example:
cd C:\Program Files\elasticsearch-7.1.0

Start Elasticsearch:
bin\elasticsearch.bat

Confirm that Elasticsearch has started
To confirm whether the Elasticsearch service is started, you can access port 9200.

curl http://127.0.0.1:9200

On Windows, if cURL is not installed, you can open the above URL with a browser.

If everything is normal, you can see the following response:

{
  "name" : "qikegu",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "qZk2EjpQRDiYYyhccomWyw",
  "version" : {
    "number" : "7.1.0",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "606a173",
    "build_date" : "2019-05-16T00:43:15.323135Z",
    "build_snapshot" : false,
    "lucene_version" : "8.0.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
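
The same check can also be scripted. The following is a minimal Python sketch, assuming Elasticsearch listens on the default address used above; the function names describe_node and fetch_root are hypothetical helpers written for this illustration, not part of any Elastic client library.

```python
import json
import urllib.request


def describe_node(body: str) -> str:
    """Summarize the JSON document returned by the Elasticsearch root endpoint."""
    info = json.loads(body)
    version = info["version"]
    return (
        f'{info["name"]} (cluster {info["cluster_name"]}) runs '
        f'Elasticsearch {version["number"]} on Lucene {version["lucene_version"]}'
    )


def fetch_root(url: str = "http://127.0.0.1:9200", timeout: float = 5.0) -> str:
    """Fetch the root endpoint (assumed default URL); raises URLError if the node is unreachable."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With a node running, print(describe_node(fetch_root())) prints a one-line summary; if the node is not reachable, fetch_root raises an error instead.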


ELK: Install Kibana


Kibana is an interface component used with Elasticsearch. You can use Kibana to search and view the data in Elasticsearch. You can use Kibana to easily perform various complex data analysis and display the data in various charts and tables.

It is recommended to install Kibana and Elasticsearch on the same server, but this is not required. If Kibana is installed on a different server, you need to modify the Elasticsearch server URL (IP:PORT) in the Kibana configuration file kibana.yml.

To download and install Kibana, open a command line window and execute the following command:

deb, rpm, or linux:

curl -L -O https://artifacts.elastic.co/downloads/kibana/kibana-7.1.0-linux-x86_64.tar.gz
tar xzvf kibana-7.1.0-linux-x86_64.tar.gz
cd kibana-7.1.0-linux-x86_64/
./bin/kibana

mac:

curl -L -O https://artifacts.elastic.co/downloads/kibana/kibana-7.1.0-darwin-x86_64.tar.gz
tar xzvf kibana-7.1.0-darwin-x86_64.tar.gz
cd kibana-7.1.0-darwin-x86_64/
./bin/kibana

win:

Download the Kibana 7.1.0 Windows zip file from the Kibana download page.
Extract the contents of the zip file to a directory, for example C:\Program Files.
Open a command line window as an administrator and switch to the unzipped directory, for example:
cd C:\Program Files\kibana-7.1.0-windows

Start Kibana:
bin\kibana.bat

To learn more about installing, configuring, and running Kibana, please refer to the official website.

Open the Kibana interface

Kibana listens on port 5601. To open the Kibana web interface, point a browser at the Kibana address, for example: http://127.0.0.1:5601.

If you cannot connect from another machine, the following usually helps:

  • In the configuration file kibana.yml, change the server.host setting to server.host: "0.0.0.0", which allows remote connections. The default binding, localhost, allows only local connections.
  • Open port 5601 in the firewall. The following example opens port 5601 on CentOS:
    [root@qikegu ~]# firewall-cmd --permanent --add-port=5601/tcp
    success
    [root@qikegu ~]# firewall-cmd --reload
    success
    [root@qikegu ~]# firewall-cmd --list-ports
    5601/tcp
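
To verify from another machine that the port is actually reachable after these changes, a small TCP probe is enough. This is a minimal Python sketch using only the standard library; the function name port_is_open is hypothetical, and the host you pass in is whatever address your Kibana server has.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, port_is_open("<server address>", 5601) returning False after the steps above suggests that either the firewall rule or the server.host binding is still blocking remote access. A True result only shows that something accepts connections on the port, not that Kibana itself is healthy.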


ELK: Install Beats


Beats are data collection tools that are installed on a server and send the collected data to Elasticsearch. A Beat can send data directly to Elasticsearch, or send it to Logstash first, where it is processed before being forwarded to Elasticsearch.

Each Beat is an independently installable product. This tutorial shows how to install and run Metricbeat, and how to enable the Metricbeat system module to collect system metrics.

The following Beats are available. To learn more about a Beat, please refer to its documentation:

Beat          Data collected
Auditbeat     Audit data
Filebeat      Log files
Functionbeat  Cloud data
Heartbeat     Availability monitoring
Journalbeat   Systemd journals
Metricbeat    Metrics, such as system metrics
Packetbeat    Network traffic
Winlogbeat    Windows event logs

Install Metricbeat
To download and install Metricbeat, open a command line window and execute the following command:

deb:

curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.1.0-amd64.deb
sudo dpkg -i metricbeat-7.1.0-amd64.deb

rpm:

curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.1.0-x86_64.rpm
sudo rpm -vi metricbeat-7.1.0-x86_64.rpm

mac:

curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.1.0-darwin-x86_64.tar.gz
tar xzvf metricbeat-7.1.0-darwin-x86_64.tar.gz

linux:

curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.1.0-linux-x86_64.tar.gz
tar xzvf metricbeat-7.1.0-linux-x86_64.tar.gz

win:

Download the Metricbeat Windows zip file from the Metricbeat download page.
Extract the contents of the zip file to C:\Program Files
Rename the Metricbeat-7.1.0-windows directory to Metricbeat.
Open the PowerShell command line as an administrator (right-click the PowerShell icon and select Run as Administrator).
On the PowerShell command line, run the following command to install Metricbeat as a Windows service:
    PS > cd 'C:\Program Files\Metricbeat'
    PS C:\Program Files\Metricbeat> .\install-service-metricbeat.ps1

Collect system metrics and send them to Elasticsearch
Metricbeat provides preset monitoring modules that can be deployed simply by enabling them.

This section uses the preset system module, which collects system metrics such as CPU usage, memory, file system, disk I/O and network I/O statistics, and process statistics.

Before you start: make sure that Elasticsearch and Kibana are running and that Elasticsearch is ready to receive data from Metricbeat.

To enable the system module and start collecting system metrics:

From the Metricbeat installation directory, enable the system module:

deb and rpm:

sudo metricbeat modules enable system

mac and linux:

./metricbeat modules enable system

win:

PS C:\Program Files\Metricbeat> .\metricbeat.exe modules enable system

Set up the initial environment:

deb and rpm:

sudo metricbeat setup -e

mac and linux:

./metricbeat setup -e

win:

PS C:\Program Files\Metricbeat> .\metricbeat.exe setup -e

Start Metricbeat:

deb and rpm:

sudo service metricbeat start

mac and linux:

./metricbeat -e

win:

PS C:\Program Files\Metricbeat> Start-Service metricbeat

Metricbeat starts and begins sending system metrics to Elasticsearch.
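
One way to confirm that data is arriving is to ask Elasticsearch how many documents the Metricbeat indices contain, using its _count endpoint. The following Python sketch assumes Elasticsearch runs at the default local address and that Metricbeat writes to indices whose names start with metricbeat- (the default naming); the helper names are hypothetical.

```python
import json
import urllib.request


def count_url(base_url: str, index_pattern: str = "metricbeat-*") -> str:
    """Build the URL of the _count endpoint for the given index pattern."""
    return f"{base_url.rstrip('/')}/{index_pattern}/_count"


def parse_count(body: str) -> int:
    """Extract the document count from a _count response body."""
    return int(json.loads(body)["count"])


def count_documents(base_url: str = "http://127.0.0.1:9200",
                    index_pattern: str = "metricbeat-*",
                    timeout: float = 5.0) -> int:
    """Query Elasticsearch (assumed reachable at base_url) for the number of matching documents."""
    with urllib.request.urlopen(count_url(base_url, index_pattern), timeout=timeout) as response:
        return parse_count(response.read().decode("utf-8"))
```

Calling count_documents() a few times, some seconds apart, should return a growing number while Metricbeat is running.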

View system metrics in Kibana
Open the following URL in a browser: http://<your URL>:5601/app/kibana#/dashboard/Metricbeat-system-overview-ecs

If you do not see any data in Kibana, try widening the time range. By default, Kibana displays the last 15 minutes. If you see an error, make sure Metricbeat is running, then refresh the page.


Click Host Overview to view detailed metrics for the selected host.



So far, we have built a basic ELK architecture and successfully collected system information.


ELK: Install Logstash


Logstash is an optional component of the ELK stack.

Logstash is a powerful tool that provides a large number of plug-ins for parsing and processing data from various sources. If the data collected by Beats needs to be processed before it can be used, you need to add Logstash to the pipeline.

To download and install Logstash, open a command line window and execute the commands below for your platform.

Logstash requires Java 8 or Java 11, so first make sure that Java is installed:

[root@qikegu ~]# java --version
openjdk 11.0.3 2019-04-16 LTS
OpenJDK Runtime Environment 18.9 (build 11.0.3+7-LTS)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.3+7-LTS, mixed mode, sharing)

deb:

curl -L -O https://artifacts.elastic.co/downloads/logstash/logstash-7.1.0.deb
sudo dpkg -i logstash-7.1.0.deb

rpm:

curl -L -O https://artifacts.elastic.co/downloads/logstash/logstash-7.1.0.rpm
sudo rpm -i logstash-7.1.0.rpm

mac and linux:

curl -L -O https://artifacts.elastic.co/downloads/logstash/logstash-7.1.0.tar.gz
tar -xzvf logstash-7.1.0.tar.gz

win:

Download the Logstash 7.1.0 Windows zip file from the Logstash download page.
Extract the contents of the zip file to a directory, for example C:\Program Files.

To learn more about installing, configuring, and running Logstash, please read the official documentation.

Configure Logstash to listen for Beats input
Logstash provides input plugins for accepting data, including a beats plugin that accepts events from Beats. In this tutorial, you will create a Logstash pipeline configuration that listens for Beats input and sends the received data to Elasticsearch.

Configure Logstash
Create a new Logstash pipeline configuration file and name it demo-metrics-pipeline.conf. If you installed Logstash from a deb or rpm package, create the file in the Logstash configuration directory (for example, /etc/logstash/conf.d/).

The file must contain:

  • An input configuration that sets the Beats port to 5044
  • An output configuration with the Elasticsearch connection details

Example:

input {
  beats {
    port => 5044
  }
}

# The filter part of this file is commented out to indicate that it
# is optional.
# filter {
#
# }

output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}

When Logstash is started with this pipeline configuration, Beats events are sent to Elasticsearch through Logstash, where you can use Logstash's filter plugins to parse and process the data.
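
As a rough illustration of how the index setting in the output section expands for each event, the following Python sketch mimics the naming scheme. It is not Logstash code: the function name index_name is hypothetical, and the sketch assumes the Beat name and version come from the event's @metadata fields and the date from the event's timestamp in UTC.

```python
from datetime import datetime, timedelta, timezone


def index_name(beat: str, version: str, timestamp: datetime) -> str:
    """Mimic "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}".

    The timestamp must be timezone-aware; the date part is taken in UTC.
    """
    day = timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")
    return f"{beat}-{version}-{day}"


# An event recorded at 01:00 on 26 April in UTC+8 is still 25 April in UTC.
event_time = datetime(2020, 4, 26, 1, 0, tzinfo=timezone(timedelta(hours=8)))
print(index_name("metricbeat", "7.1.0", event_time))  # metricbeat-7.1.0-2020.04.25
```

Because the date is part of the name, events end up in one index per day, which makes it easy to delete old data one day at a time.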

Start Logstash
Start Logstash. If you install Logstash as a deb or rpm package, make sure that the configuration file is in the configuration directory.

deb:

sudo /etc/init.d/logstash start

rpm:

sudo service logstash start

mac and linux:

./bin/logstash -f demo-metrics-pipeline.conf

win:

bin\logstash.bat -f demo-metrics-pipeline.conf

Logstash now listens for events sent by Beats. Next, you need to configure Metricbeat to send events to Logstash.

Configure Metricbeat to send events to Logstash
By default, Metricbeat sends events to Elasticsearch.

To send events to Logstash, you need to modify the configuration file metricbeat.yml. This file can be found in the Metricbeat installation directory, or in /etc/metricbeat (rpm and deb).

Comment out the output.elasticsearch part and enable the output.logstash part:

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["localhost:9200"]

...

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]

Save the file and restart Metricbeat for the configuration changes to take effect.
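
Only one output should be enabled at a time, so it can be worth checking the file before restarting. The following Python sketch is a simple text scan, not a YAML parser: it assumes the outputs are written in the dotted form shown above (output.elasticsearch:, output.logstash:) at the start of a line, and the function name active_outputs is hypothetical.

```python
import re
from typing import List

# Matches uncommented top-level keys such as "output.logstash:".
OUTPUT_KEY = re.compile(r"^output\.(\w+):", re.MULTILINE)


def active_outputs(config_text: str) -> List[str]:
    """Return the names of the outputs that are not commented out."""
    return OUTPUT_KEY.findall(config_text)


sample = """\
#output.elasticsearch:
  #hosts: ["localhost:9200"]
output.logstash:
  hosts: ["localhost:5044"]
"""
print(active_outputs(sample))  # ['logstash']
```

To check a real file, read it with open(path).read() and pass the text to active_outputs; a result with more than one entry means an output still needs to be commented out.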

Define filters to extract data from fields
Currently, Logstash just forwards the event to Elasticsearch, without processing it. Next, you will learn to use filters.

The system data collected by Metricbeat includes a field named cmdline, which contains the complete command line that started the process. For example:

"cmdline": "/Applications/Firefox.app/Contents/MacOS/plugin-container.app/Contents/MacOS/plugin-container -childID 3
-isForBrowser -boolPrefs 36:1|299:0| -stringPrefs 285:38;{b77ae304-9f53-a248-8bd4-a243dbf2cab1}| -schedulerPrefs
0001,2 -greomni /Applications/Firefox.app/Contents/Resources/omni.ja -appomni
/Applications/Firefox.app/Contents/Resources/browser/omni.ja -appdir
/Applications/Firefox.app/Contents/Resources/browser -profile
/Users/dedemorton/Library/Application Support/Firefox/Profiles/mftvzeod.default-1468353066634
99468 gecko-crash-server-pipe.99468 org.mozilla.machname.1911848630 tab"

You may only need the path of the command rather than the entire command line. One way to extract it is with a Grok filter. Learning Grok is beyond the scope of this tutorial, but if you want to learn more, please refer to the Grok filter plugin documentation.

To extract the path, in the Logstash configuration file created earlier, between the input and output sections, add the following Grok filter:

filter {
  if [system][process] {
    if [system][process][cmdline] {
      grok {
        match => {
          "[system][process][cmdline]" => "^%{PATH:[system][process][cmdline_path]}"
        }
        remove_field => "[system][process][cmdline]"
      }
    }
  }
}

The filter does two things:

  • It uses a pattern to match the path and stores the path in a field named cmdline_path.
  • It removes the original cmdline field, so that it is not indexed in Elasticsearch.

When finished, the complete configuration file should look like this:

input {
  beats {
    port => 5044
  }
}

filter {
  if [system][process] {
    if [system][process][cmdline] {
      grok {
        match => {
          "[system][process][cmdline]" => "^%{PATH:[system][process][cmdline_path]}"
        }
        remove_field => "[system][process][cmdline]"
      }
    }
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}
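
To make the effect of the Grok filter in this configuration concrete, the following Python sketch approximates it on a plain dictionary. It is only an approximation under stated assumptions: the regular expression is a simplified stand-in for Grok's PATH pattern that handles Unix-style paths only, and the function name extract_cmdline_path is hypothetical.

```python
import copy
import re

# Simplified stand-in for the Unix half of Grok's PATH pattern (assumption).
UNIX_PATH = re.compile(r"(?:/[\w%!$@:.,+~-]*)+")


def extract_cmdline_path(event: dict) -> dict:
    """Return a copy of the event with cmdline replaced by cmdline_path.

    As with the filter, the original field is removed only when the
    pattern matches at the start of the command line.
    """
    result = copy.deepcopy(event)
    process = result.get("system", {}).get("process")
    if not isinstance(process, dict) or "cmdline" not in process:
        return result
    match = UNIX_PATH.match(process["cmdline"])
    if match:
        process["cmdline_path"] = match.group(0)
        del process["cmdline"]
    return result


event = {"system": {"process": {"cmdline": "/usr/bin/python3 -m http.server 8000"}}}
print(extract_cmdline_path(event))
# {'system': {'process': {'cmdline_path': '/usr/bin/python3'}}}
```

Events whose command line does not start with a path are returned unchanged in this sketch, which mirrors the fact that the filter removes cmdline only after a successful match.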

Restart Logstash for the configuration to take effect. The event now contains a field named cmdline_path, which holds the path of the command:

"cmdline_path": "/Applications/Firefox.app/Contents/MacOS/plugin-container.app/Contents/MacOS/plugin-container"
