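1. Introduction to ELK
What is ELK?
ELK is an acronym for Elasticsearch, Logstash, and Kibana. It used to be called the ELK Stack and is now called the Elastic Stack, with Beats added to offload data collection from Logstash.
What is ELK mainly used for?
Centralized log analysis for large distributed systems.
Why centralize log analysis?
When something goes wrong in production, we locate the problem by reading logs. In a large distributed system, where the logs are spread across many machines, how would you go through them?
A complete centralized logging system needs the following main capabilities: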
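- Collection: gather log data from many kinds of sources
- Transport and aggregation: route and merge log streams, and deliver them to central storage
- Transformation: convert and process the collected log data
- Storage: persist the log data
- Analysis: support analysis through a UI
- Alerting: provide error reporting and monitoring
ELK provides a complete solution for all of this. Its components are all open source and designed to work together, which makes it a good fit for many scenarios and one of today's mainstream logging systems.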
- ELK architecture (1): the older architecture
- ELK architecture (2): the architecture that uses Beats for collection
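2. Filebeat
- What is Beats?
A family of lightweight data shippers that collect data from the source machines.
Official introduction: https://www.elastic.co/cn/products/beats
- How Filebeat, the log file shipper, works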
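- Installing Filebeat, using 7.17.7 as an example
Go to https://www.elastic.co/cn/downloads/past-releases/filebeat-7-17-7
and download the build for your platform. I used the 7.17.7 linux-x86_64 build:
https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.17.7-linux-x86_64.tar.gz
Upload it to the Linux machine and extract it:
tar -zxvf filebeat-7.17.7-linux-x86_64.tar.gz
Finally, rename the directory. I did this only because the name felt too long; skip it if you don't mind.
mv filebeat-7.17.7-linux-x86_64 filebeat-7.17.7
- Test Filebeat using console output
- Edit filebeat.yml. Remember to back it up first. Backing up is a good habit; I forgot to mention it in several earlier steps, but it should be second nature, because restoring after something breaks is painful without a backup.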
# ============================== Filebeat inputs ===============================
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
# filestream is an input for collecting log messages from files.
- type: filestream
  # Unique ID among all inputs, an ID is required.
  # This is the default value. Change it if you like; I left it as-is out of laziness.
  id: my-filestream-id
  # Change to true to enable this input configuration.
  # The default is false. Be sure to set it to true, or this input has no effect.
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  # This setting matters: it tells Filebeat where the log files live. * works as a wildcard.
  paths:
    - /var/log/filebeat/*.log
    #- c:\programdata\elasticsearch\logs\*
  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']
  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']
  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #prospector.scanner.exclude_files: ['.gz$']
  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1
#
####################################################
# Everything in between can keep its default values
####################################################
#
# ================================== Outputs ===================================
# Configure what output to use when sending the data collected by the beat.
# ---------------------------- Elasticsearch Output ----------------------------
# This output is enabled by default; comment it out first
#output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["localhost:9200"]
  # Protocol - either `http` (default) or `https`.
  #protocol: "https"
  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"
# ------------------------------ Logstash Output -------------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]
  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"
  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"
# ------------------------------ Console Output -------------------------------
# This is the part we add
output.console:
  pretty: true
Then save and exit.
- Try starting Filebeat
./filebeat -e
- Append some content to a file under the configured log path, /var/log/filebeat/*.log
cd /var/log/
mkdir filebeat
cd filebeat
echo "this is a test log+++++++++++++++" >> 01.log
- Result
[elastic@lazyfennec filebeat-7.17.7]$ ./filebeat -e
2022-11-16T22:51:39.959+0800 INFO instance/beat.go:697 Home path: [/home/elastic/es7/filebeat-7.17.7] Config path: [/home/elastic/es7/filebeat-7.17.7] Data path: [/home/elastic/es7/filebeat-7.17.7/data] Logs path: [/home/elastic/es7/filebeat-7.17.7/logs] Hostfs Path: [/]
2022-11-16T22:51:39.960+0800 INFO instance/beat.go:705 Beat ID: 7be4fff8-9b20-48a8-9793-7aacbb903281
2022-11-16T22:51:42.976+0800 WARN [add_cloud_metadata] add_cloud_metadata/provider_aws_ec2.go:79 read token request for getting IMDSv2 token returns empty: Put "http://169.254.169.254/latest/api/token": context deadline exceeded (Client.Timeout exceeded while awaiting headers). No token in the metadata request will be used.
2022-11-16T22:51:42.978+0800 INFO [seccomp] seccomp/seccomp.go:124 Syscall filter successfully installed
2022-11-16T22:51:42.982+0800 INFO [beat] instance/beat.go:1051 Beat info {"system_info": {"beat": {"path": {"config": "/home/elastic/es7/filebeat-7.17.7", "data": "/home/elastic/es7/filebeat-7.17.7/data", "home": "/home/elastic/es7/filebeat-7.17.7", "logs": "/home/elastic/es7/filebeat-7.17.7/logs"}, "type": "filebeat", "uuid": "7be4fff8-9b20-48a8-9793-7aacbb903281"}}}
2022-11-16T22:51:42.984+0800 INFO [beat] instance/beat.go:1060 Build info {"system_info": {"build": {"commit": "2b200bdbf5d85553b8f02c8709142b01dfd1082d", "libbeat": "7.17.7", "time": "2022-10-17T16:55:51.000Z", "version": "7.17.7"}}}
2022-11-16T22:51:42.985+0800 INFO [beat] instance/beat.go:1063 Go runtime info {"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":1,"version":"go1.18.5"}}}
2022-11-16T22:51:42.987+0800 INFO [beat] instance/beat.go:1067 Host info {"system_info": {"host": {"architecture":"x86_64","boot_time":"2022-11-16T13:07:01+08:00","containerized":false,"name":"lazyfennec","ip":["127.0.0.1/8","::1/128","192.168.1.9/24","2408:825c:6e2:b8b9:97ac:336e:7bb7:af46/64","fe80::3932:ba1d:da41:e31d/64"],"kernel_version":"5.14.10-300.fc35.x86_64","mac":["08:00:27:73:59:06"],"os":{"type":"linux","family":"redhat","platform":"fedora","name":"Fedora Linux","version":"35 (Workstation Edition)","major":35,"minor":0,"patch":0},"timezone":"CST","timezone_offset_sec":28800,"id":"2dd7538b08ab45fa97d32ced12db623b"}}}
2022-11-16T22:51:42.989+0800 INFO [beat] instance/beat.go:1096 Process info {"system_info": {"process": {"capabilities": {"inheritable":null,"permitted":null,"effective":null,"bounding":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read","38","39","40"],"ambient":null}, "cwd": "/home/elastic/es7/filebeat-7.17.7", "exe": "/home/elastic/es7/filebeat-7.17.7/filebeat", "name": "filebeat", "pid": 9332, "ppid": 7624, "seccomp": {"mode":"filter","no_new_privs":true}, "start_time": "2022-11-16T22:51:38.150+0800"}}}
2022-11-16T22:51:42.991+0800 INFO instance/beat.go:291 Setup Beat: filebeat; Version: 7.17.7
2022-11-16T22:51:42.992+0800 INFO [publisher] pipeline/module.go:113 Beat name: lazyfennec
2022-11-16T22:51:43.111+0800 WARN beater/filebeat.go:202 Filebeat is unable to load the ingest pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the ingest pipelines or are using Logstash pipelines, you can ignore this warning.
2022-11-16T22:51:43.132+0800 INFO [monitoring] log/log.go:142 Starting metrics logging every 30s
2022-11-16T22:51:43.133+0800 INFO instance/beat.go:456 filebeat start running.
2022-11-16T22:51:43.134+0800 INFO memlog/store.go:119 Loading data file of '/home/elastic/es7/filebeat-7.17.7/data/registry/filebeat' succeeded. Active transaction id=0
2022-11-16T22:51:43.135+0800 INFO memlog/store.go:124 Finished loading transaction log file for '/home/elastic/es7/filebeat-7.17.7/data/registry/filebeat'. Active transaction id=11
2022-11-16T22:51:43.137+0800 WARN beater/filebeat.go:411 Filebeat is unable to load the ingest pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the ingest pipelines or are using Logstash pipelines, you can ignore this warning.
2022-11-16T22:51:43.145+0800 INFO [registrar] registrar/registrar.go:109 States Loaded from registrar: 1
2022-11-16T22:51:43.150+0800 INFO [crawler] beater/crawler.go:71 Loading Inputs: 1
2022-11-16T22:51:43.151+0800 INFO [crawler] beater/crawler.go:117 starting input, keys present on the config: [filebeat.inputs.0.enabled filebeat.inputs.0.id filebeat.inputs.0.paths.0 filebeat.inputs.0.type]
2022-11-16T22:51:43.182+0800 INFO [crawler] beater/crawler.go:148 Starting input (ID: 10875199066285727974)
2022-11-16T22:51:43.182+0800 INFO [crawler] beater/crawler.go:106 Loading and starting Inputs completed. Enabled inputs: 1
2022-11-16T22:51:43.182+0800 INFO [input.filestream] compat/compat.go:113 Input 'filestream' starting {"id": "my-filestream-id"}
2022-11-16T22:51:43.182+0800 INFO cfgfile/reload.go:164 Config reloader started
2022-11-16T22:51:45.978+0800 INFO [add_cloud_metadata] add_cloud_metadata/add_cloud_metadata.go:101 add_cloud_metadata: hosting provider type not detected.
2022-11-16T22:52:13.139+0800 INFO [monitoring] log/log.go:184 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"id":"session-3.scope"},"memory":{"id":"session-3.scope","mem":{"usage":{"bytes":991010816}}}},"cpu":{"system":{"ticks":80,"time":{"ms":83}},"total":{"ticks":230,"time":{"ms":241},"value":230},"user":{"ticks":150,"time":{"ms":158}}},"handles":{"limit":{"hard":65536,"soft":65536},"open":12},"info":{"ephemeral_id":"b424e54f-7dc5-4082-8732-ebf650bc514b","uptime":{"ms":33645},"version":"7.17.7"},"memstats":{"gc_next":20313288,"memory_alloc":11016552,"memory_sys":32850952,"memory_total":55476336,"rss":100814848},"runtime":{"goroutines":33}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0},"reloads":1,"scans":2},"output":{"events":{"active":0},"type":"console"},"pipeline":{"clients":2,"events":{"active":0},"queue":{"max_events":4096}}},"registrar":{"states":{"current":0}},"system":{"cpu":{"cores":1},"load":{"1":0.07,"15":0.05,"5":0.04,"norm":{"1":0.07,"15":0.05,"5":0.04}}}}}}
{
"@timestamp": "2022-11-16T14:52:17.191Z",
"@metadata": {
"beat": "filebeat",
"type": "_doc",
"version": "7.17.7"
},
"host": {
"os": {
"name": "Fedora Linux",
"kernel": "5.14.10-300.fc35.x86_64",
"type": "linux",
"platform": "fedora",
"version": "35 (Workstation Edition)",
"family": "redhat"
},
"id": "2dd7538b08ab45fa97d32ced12db623b",
"containerized": false,
"ip": [
"192.168.1.9",
"2408:825c:6e2:b8b9:97ac:336e:7bb7:af46",
"fe80::3932:ba1d:da41:e31d"
],
"name": "lazyfennec",
"mac": [
"08:00:27:73:59:06"
],
"hostname": "lazyfennec",
"architecture": "x86_64"
},
"agent": {
"type": "filebeat",
"version": "7.17.7",
"hostname": "lazyfennec",
"ephemeral_id": "b424e54f-7dc5-4082-8732-ebf650bc514b",
"id": "7be4fff8-9b20-48a8-9793-7aacbb903281",
"name": "lazyfennec"
},
"log": {
"offset": 72,
"file": {
"path": "/var/log/filebeat/01.log"
}
},
"message": "this is a test log ++++++++++++++++",
"input": {
"type": "filestream"
},
"ecs": {
"version": "1.12.0"
}
}
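The log.offset field in the event above, together with the registry lines in the startup log, suggests that Filebeat remembers a read position for each file so that it only ships newly appended lines. The following is a minimal Python sketch of that idea, not Filebeat's actual implementation; the function name read_new_lines and the sample file contents are assumptions made for illustration.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return the complete lines appended after `offset`, plus the new offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Consume only up to the last newline, so a half-written line is left for the next call.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8").splitlines()
    return lines, offset + end


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "01.log")  # hypothetical stand-in for a watched log file
        with open(log_path, "ab") as f:
            f.write(b"first line\n")
        lines, offset = read_new_lines(log_path, 0)
        print(lines, offset)
        with open(log_path, "ab") as f:
            f.write(b"this is a test log\n")
        # Only the newly appended line is returned on the second call.
        lines, offset = read_new_lines(log_path, offset)
        print(lines, offset)
```

Persisting the returned offset between runs is what would let such a reader resume after a restart without shipping old lines again.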
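3. Logstash
Should a log entry be stored in ES as a single text field, or should it be parsed into several meaningful fields so that it is easier to aggregate and analyze?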
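To make the difference concrete, here is a small Python sketch that turns one raw line into named fields, which is the kind of work a Logstash filter would do. The log layout, the field names, and the _parse_failure tag are all assumptions chosen for illustration; they are not taken from any Logstash plugin.

```python
import re

# Hypothetical log layout: "<date> <time> <LEVEL> <message>".
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Split a raw log line into fields; keep the raw text and tag it if it does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return {"message": line, "tags": ["_parse_failure"]}
    return match.groupdict()


if __name__ == "__main__":
    print(parse_line("2022-11-16 22:51:39 ERROR db connection timed out"))
    print(parse_line("not a structured line"))
```

With separate timestamp and level fields, questions such as "how many ERROR entries per hour" become simple aggregations instead of full-text searches.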
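- The role of Logstash
https://www.elastic.co/cn/products/logstash
Logstash is an open-source, server-side data processing pipeline that ingests data from many sources at once, transforms it, and sends it to your storage of choice (in ELK, that means Elasticsearch).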
- How the Logstash pipeline works
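A Logstash pipeline is configured as input, filter, and output sections, as the configuration files later in this post show. The Python sketch below imitates that shape with generators, purely as a mental model; the stage names (line_input, drop_blank, add_length) are made up for this example and do not correspond to real Logstash plugins.

```python
def line_input(lines):
    """Input stage: wrap each raw line in an event dict."""
    for line in lines:
        yield {"message": line}


def drop_blank(events):
    """Filter stage: discard events whose message is empty or whitespace."""
    for event in events:
        if event["message"].strip():
            yield event


def add_length(events):
    """Filter stage: enrich each event with a derived field."""
    for event in events:
        event["length"] = len(event["message"])
        yield event


def run_pipeline(lines, filters):
    """Chain input -> filters -> output; the output stage here just collects events."""
    events = line_input(lines)
    for stage in filters:
        events = stage(events)
    return list(events)


if __name__ == "__main__":
    for event in run_pipeline(["Hello Logstash", "   ", "x"], [drop_blank, add_length]):
        print(event)
```

Each stage only sees the events handed on by the previous one, so filters can be added, removed, or reordered without touching the input or the output.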
Installing Logstash: just extract and run
Go to https://www.elastic.co/cn/downloads/past-releases/logstash-7-17-7
and download the build for your platform. I used the 7.17.7 linux-x86_64 build:
https://artifacts.elastic.co/downloads/logstash/logstash-7.17.7-linux-x86_64.tar.gz
Upload it to the Linux machine and extract it:
tar -zxvf logstash-7.17.7-linux-x86_64.tar.gz
Then run a quick test:
cd logstash-7.17.7/bin/
./logstash -e 'input { stdin { } } output { stdout {} }'
Output:
[elastic@lazyfennec bin]$ ./logstash -e 'input { stdin { } } output { stdout {} }'
Using JAVA_HOME defined java: /etc/softwares/jdk1.8
WARNING: Using JAVA_HOME while Logstash distribution comes with a bundled JDK.
DEPRECATION: The use of JAVA_HOME is now deprecated and will be removed starting from 8.0. Please configure LS_JAVA_HOME instead.
Sending Logstash logs to /home/elastic/es7/logstash-7.17.7/logs which is now configured via log4j2.properties
[2022-11-16T23:16:53,131][INFO ][logstash.runner ] Log4j configuration path used is: /home/elastic/es7/logstash-7.17.7/config/log4j2.properties
[2022-11-16T23:16:53,212][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"7.17.7", "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 Java HotSpot(TM) 64-Bit Server VM 25.333-b02 on 1.8.0_333-b02 +indy +jit [linux-x86_64]"}
[2022-11-16T23:16:53,221][INFO ][logstash.runner ] JVM bootstrap flags: [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djdk.io.File.enableADS=true, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true]
[2022-11-16T23:16:55,008][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2022-11-16T23:16:55,150][INFO ][logstash.agent ] No persistent UUID file found. Generating new UUID {:uuid=>"92ab1157-0581-4496-b0d0-8b03d5ca1e45", :path=>"/home/elastic/es7/logstash-7.17.7/data/uuid"}
[2022-11-16T23:17:02,061][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[2022-11-16T23:17:05,985][INFO ][org.reflections.Reflections] Reflections took 339 ms to scan 1 urls, producing 119 keys and 419 values
[2022-11-16T23:17:14,441][INFO ][logstash.javapipeline ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>125, "pipeline.sources"=>["config string"], :thread=>"#<Thread:0x619814c run>"}
[2022-11-16T23:17:17,382][INFO ][logstash.javapipeline ][main] Pipeline Java execution initialization time {"seconds"=>2.91}
[2022-11-16T23:17:17,689][INFO ][logstash.javapipeline ][main] Pipeline started {"pipeline.id"=>"main"}
The stdin plugin is now waiting for input:
[2022-11-16T23:17:18,111][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
Hello Logstash # typed by hand; pressing Enter produces the output below
{
"@version" => "1",
"message" => "Hello Logstash",
"@timestamp" => 2022-11-16T15:18:00.640Z,
"host" => "lazyfennec"
}
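4. Connecting Filebeat, Logstash, and ES (X-Pack is assumed to be disabled here; enabling it looks like a bit of a hassle, and although I haven't verified that, let's go with this for now)
- Configure the Logstash input to receive events from Filebeat
Reference: https://www.elastic.co/guide/en/logstash/7.17/advanced-pipeline.html
That page contains a sample configuration. Copy it into a new file, logstash-7.17.7/config/beats2es.conf.
Note that for now it only prints events to the console; later we will send them to ES.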
input {
  beats {
    port => "5044"
  }
}
# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }
output {
  stdout { codec => rubydebug }
}
- Start Logstash
The --config.reload.automatic flag tells Logstash to reload the configuration automatically whenever the file changes.
bin/logstash -f config/beats2es.conf --config.reload.automatic
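- Reconfigure Filebeat to send its output to Logstash
Edit filebeat.yml: comment out the console output, uncomment the Logstash output, and set hosts to the IP and port of the target Logstash server. See the page below for details; I find its explanations quite thorough. Since the setting is hosts (plural), you can also list several hosts.
Filebeat input configuration reference: https://www.elastic.co/guide/en/beats/filebeat/7.17/filebeat-input-log.html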
# ------------------------------ Logstash Output -------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["192.168.1.9:5044"]
# ------------------------------ Console Output -------------------------------
#output.console:
#  pretty: true
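- Start Filebeat
Make sure to start Filebeat only after Logstash is up. Filebeat sends what it collects to the Logstash port, and if nothing is listening on that port yet, the connection attempts fail and you may run into problems.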
./filebeat -e
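One way to confirm that Logstash is already listening before launching Filebeat is to probe the port. The sketch below is a generic TCP check in Python, not part of either tool; the function name port_is_open is an assumption, and the host and port in the usage example are simply the values used in this walkthrough.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # Address and port taken from the filebeat.yml in this post; adjust to your setup.
    if port_is_open("192.168.1.9", 5044):
        print("Logstash port is reachable; Filebeat can be started.")
    else:
        print("Nothing is listening on 5044 yet; start Logstash first.")
```

A successful connection only shows that something accepts TCP connections on that port; it does not prove that the listener is the Logstash beats input.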
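- Test it
In the directory configured in filebeat.yml, run the commands below, which simply append one line to a file.
# Go to the log directory
cd /var/log/filebeat/
# Append the following line to 01.log
echo "This is a log test++++++++++++++++++++++" >> 01.log
Then switch to the Logstash terminal; the event should appear in its output.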
- Connect Logstash to Elasticsearch
Edit logstash-7.17.7/config/beats2es.conf: comment out stdout { codec => rubydebug } and add an elasticsearch output. For details, see https://www.elastic.co/guide/en/logstash/7.17/advanced-pipeline.html
input {
  beats {
    port => "5044"
  }
}
# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }
output {
  # stdout { codec => rubydebug }
  elasticsearch {
    hosts => [ "192.168.1.9:9200" ]
  }
}
Then wait for Logstash to reload the configuration and start Kibana. (Kibana hasn't been covered yet? No problem: download it, extract it, and adjust a few settings.) Here is the kibana.yml:
# Replace the IP addresses here with those of your own servers
server.host: "192.168.1.9"
# The URLs of the Elasticsearch instances to use for all your queries.
elasticsearch.hosts: ["http://192.168.1.9:9200"]
# Kibana uses an index in Elasticsearch to store saved searches, visualizations and
# dashboards. Kibana creates a new index if the index doesn't already exist.
kibana.index: ".kibana"
- Test it
cd /var/log/filebeat/
echo "this is a log for elk test++++++++++++++++" >> 01.log
Open Kibana and check the indices.
Then open Dev Tools and run:
GET /logstash-2022.11.16-000001/_search
The search returns results.
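The same search can also be issued outside Kibana over the Elasticsearch HTTP API. The Python sketch below only builds the request with the standard library; the helper name build_search_request and the match query on the message field are assumptions for illustration, and the URL and index are the ones used in this post.

```python
import json
from urllib.request import Request


def build_search_request(es_url, index, text):
    """Build a POST request for <es_url>/<index>/_search with a match query on `message`."""
    body = {"query": {"match": {"message": text}}}
    return Request(
        url=f"{es_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_search_request(
        "http://192.168.1.9:9200", "logstash-2022.11.16-000001", "elk test"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it, assuming the Elasticsearch node from this walkthrough is reachable and security is disabled as described earlier.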
5. Kibana
Kibana user guide: https://www.elastic.co/guide/cn/kibana/current/index.html
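6. Architecture notes
- The architecture we built above:
Running Logstash on many machines consumes a lot of resources, which is not ideal.
- Another, better approach is to send the logs to a Kafka cluster and configure Logstash to read from it.
Kafka is not covered here; I will write about it later when I have time.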
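To illustrate why a buffer between the shippers and Logstash helps, here is a toy Python sketch in which an in-process queue stands in for the Kafka cluster. It is not a Kafka client and says nothing about Kafka's API; the function names and the upper-casing "processing" step are placeholders invented for this example.

```python
import queue
import threading


def producer(buffer, lines):
    """Shipper side: push lines into the buffer, then a sentinel to signal the end."""
    for line in lines:
        buffer.put(line)  # blocks when the buffer is full, giving natural backpressure
    buffer.put(None)


def consumer(buffer, sink):
    """Processing side: drain the buffer at its own pace until the sentinel arrives."""
    while True:
        line = buffer.get()
        if line is None:
            break
        sink.append(line.upper())  # placeholder for real parsing and indexing


def run(lines):
    buffer = queue.Queue(maxsize=2)  # small on purpose, to make the decoupling visible
    sink = []
    t_prod = threading.Thread(target=producer, args=(buffer, lines))
    t_cons = threading.Thread(target=consumer, args=(buffer, sink))
    t_prod.start()
    t_cons.start()
    t_prod.join()
    t_cons.join()
    return sink


if __name__ == "__main__":
    print(run(["log a", "log b", "log c"]))
```

The producer never calls the consumer directly, so a slow or restarted processing stage does not require every shipper to be reconfigured; that decoupling is the role the buffer plays in this architecture.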