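Good Programmer Big Data Learning Path: Logstash vs. Flume
Neither tool has the concept of a cluster; a set of Logstash or Flume instances is simply called a group.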
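Logstash is written in JRuby.
Component comparison:
Logstash: input  filter  output
Flume:    source  channel  sink
Strengths of each:
Logstash:
  Simple to install, with a small installation footprint
  Has a filter component, so the tool can filter and split data
  Integrates seamlessly with ES
  Is fault tolerant: if the process crashes or the connection drops during collection, it resumes from where it stopped, because it records the read offset
In short, the tool is mainly used for collecting log data.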
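The resume-from-offset behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Logstash's implementation: the function name `read_new_lines` and the offset file are assumptions made for the example.

```python
import os
import tempfile


def read_new_lines(log_path, offset_path):
    """Return complete lines appended since the saved offset, then save the new offset."""
    offset = 0
    if os.path.exists(offset_path):
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)

    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()

    # Only consume up to the last newline so a half-written line is read later.
    last_newline = data.rfind(b"\n")
    if last_newline == -1:
        return []
    consumed = data[: last_newline + 1]

    with open(offset_path, "w") as f:
        f.write(str(offset + len(consumed)))

    return consumed.decode("utf-8").splitlines()


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    log_path = os.path.join(workdir, "test.log")      # hypothetical log file
    offset_path = os.path.join(workdir, "test.offset")  # hypothetical offset file

    with open(log_path, "wb") as f:
        f.write(b"hello xixi\n")
    print(read_new_lines(log_path, offset_path))

    # Simulate a restart: a new call picks up only what was appended afterwards.
    with open(log_path, "ab") as f:
        f.write(b"hello haha\n")
    print(read_new_lines(log_path, offset_path))
```

Because the offset lives in a file rather than in memory, a collector that crashes and restarts continues after the last line it fully read instead of starting over.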
Flume:
  Stronger than Logstash in high availability
  Flume consistently emphasizes data safety: data transfer in Flume is controlled by transactions
  Flume can be used to transfer many types of data
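To make the transaction point concrete, here is a minimal Python sketch of a channel whose events are only dropped once delivery succeeds. It is a simplified model and not Flume's API: `Channel`, `take_batch`, `rollback`, and `drain` are hypothetical names chosen for the example.

```python
from collections import deque


class Channel:
    """In-memory event queue with a rollback-capable take (hypothetical model)."""

    def __init__(self):
        self._events = deque()

    def put(self, event):
        self._events.append(event)

    def __len__(self):
        return len(self._events)

    def take_batch(self, size):
        """Remove and return up to `size` events from the front of the queue."""
        batch = []
        while self._events and len(batch) < size:
            batch.append(self._events.popleft())
        return batch

    def rollback(self, batch):
        """Put a taken batch back at the front, preserving its original order."""
        self._events.extendleft(reversed(batch))


def drain(channel, deliver, batch_size=2):
    """Take one batch and deliver it; roll back if delivery raises."""
    batch = channel.take_batch(batch_size)
    try:
        deliver(batch)
    except Exception:
        channel.rollback(batch)
        return False
    return True  # "commit": the batch stays removed from the channel


if __name__ == "__main__":
    channel = Channel()
    for event in ["e1", "e2", "e3"]:
        channel.put(event)

    def broken_sink(batch):
        raise IOError("sink unavailable")

    print(drain(channel, broken_sink), len(channel))

    delivered = []
    print(drain(channel, delivered.extend), delivered, len(channel))
```

The point of the design is that a failed delivery leaves the channel exactly as it was, so no event is lost between the source and the sink.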
Data integration
Upload the Logstash .gz archive to the server and extract it; no further installation is needed.
You can create a conf directory under the Logstash directory to hold configuration files.
I. Starting from the command line
1. bin/logstash -e 'input { stdin {} } output { stdout{} }'
stdin/stdout (the standard input and output streams)
hello xixi
2018-09-12T21:58:58.649Z hadoop01 hello xixi
hello haha
2018-09-12T21:59:19.487Z hadoop01 hello haha
2. bin/logstash -e 'input { stdin {} } output { stdout{codec => rubydebug} }'
hello xixi
{
       "message" => "hello xixi",
      "@version" => "1",
    "@timestamp" => "2018-09-12T22:00:49.612Z",
          "host" => "hadoop01"
}
3. With an ES cluster (the ES cluster must be running)
bin/logstash -e 'input { stdin {} } output { elasticsearch {hosts => ["192.168.88.81:9200"]} stdout{} }'
After the command is entered, ES creates the index and its mapping automatically.
hello haha
2018-09-12T22:13:05.361Z hadoop01 hehello haha
To write to more than one ES node, list every host:
bin/logstash -e 'input { stdin {} } output { elasticsearch {hosts => ["192.168.88.81:9200", "192.168.88.82:9200"]} stdout{} }'
4. With a Kafka cluster (the Kafka cluster must be running)
bin/logstash -e 'input { stdin {} } output { kafka { topic_id => "test1" bootstrap_servers => "node01:9092,node02:9092,node03:9092" } stdout{} }'
II. Starting with a configuration file
The ZooKeeper, Kafka, and ES clusters must be running.
1. Integrating with Kafka
vi logstash-kafka.conf
input {
  file {
    path => "/root/data/test.log"
    discover_interval => 5
    start_position => "beginning"
  }
}

output {
  kafka {
    topic_id => "test1"
    codec => plain {
      format => "%{message}"
      charset => "UTF-8"
    }
    bootstrap_servers => "node01:9092,node02:9092,node03:9092"
  }
}
Start Logstash:
bin/logstash -f logstash-kafka.conf  (-f: specifies the configuration file)
Then start a Kafka consumer on another node.
2. Integrating with Kafka and ES
vi logstash-es.conf
input {
  file {
    type => "gamelog"
    path => "/log/*/*.log"
    discover_interval => 10
    start_position => "beginning"
  }
}

output {
  elasticsearch {
    index => "gamelog-%{+YYYY.MM.dd}"
    hosts => ["node01:9200", "node02:9200", "node03:9200"]
  }
}
Start Logstash:
bin/logstash -f logstash-es.conf
Then start a Kafka consumer on another node.
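The index setting "gamelog-%{+YYYY.MM.dd}" produces one index per day, named after the date in each event's @timestamp (which Logstash keeps in UTC). The Python sketch below only mimics that naming for illustration; `index_name` is a hypothetical helper, and the timestamp is the one from the rubydebug output shown earlier.

```python
from datetime import datetime, timezone


def index_name(prefix, timestamp):
    """Build a daily index name from an ISO-8601 UTC timestamp such as @timestamp."""
    ts = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    ts = ts.replace(tzinfo=timezone.utc)
    return f"{prefix}-{ts:%Y.%m.%d}"


if __name__ == "__main__":
    print(index_name("gamelog", "2018-09-12T22:00:49.612Z"))
```

Since the date comes from UTC, events logged late in the evening in a timezone ahead of UTC can still land in the previous day's index.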
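Data integration process
Where to run Logstash: put it on whichever node has the most idle resources (placement is flexible).
1. Start one Logstash instance that monitors the log server directory and collects the data into Kafka.
2. Start a second Logstash instance that monitors a Kafka topic and collects its data into Elasticsearch.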
Data integration example
Two Logstash instances are needed, each started with its own configuration file.
1. Collect data into Kafka
cd conf
Create the configuration file: vi gs-kafka.conf
input {
  file {
    codec => plain {
      charset => "GB2312"
    }
    path => "/root/basedir/*/*.txt"
    discover_interval => 5
    start_position => "beginning"
  }
}

output {
  kafka {
    topic_id => "gamelogs"
    codec => plain {
      format => "%{message}"
      charset => "GB2312"
    }
    bootstrap_servers => "node01:9092,node02:9092,node03:9092"
  }
}
Create the corresponding Kafka topic:
bin/kafka-topics.sh --create --zookeeper hadoop01:2181 --replication-factor 1 --partitions 1 --topic gamelogs
2. Start Logstash on hadoop01
bin/logstash -f conf/gs-kafka.conf
3. Start a second Logstash instance on hadoop02
cd logstash/conf
vi kafka-es.conf
input {
  kafka {
    type => "accesslogs"
    codec => "plain"
    auto_offset_reset => "smallest"
    group_id => "elas1"
    topic_id => "accesslogs"
    zk_connect => "node01:2181,node02:2181,node03:2181"
  }

  kafka {
    type => "gamelogs"
    auto_offset_reset => "smallest"
    codec => "plain"
    group_id => "elas2"
    topic_id => "gamelogs"
    zk_connect => "node01:2181,node02:2181,node03:2181"
  }
}

filter {
  if [type] == "accesslogs" {
    json {
      source => "message"
      remove_field => [ "message" ]
      target => "access"
    }
  }

  if [type] == "gamelogs" {
    mutate {
      split => { "message" => " " }
      add_field => {
        "event_type" => "%{message[3]}"
        "current_map" => "%{message[4]}"
        "current_X" => "%{message[5]}"
        "current_y" => "%{message[6]}"
        "user" => "%{message[7]}"
        "item" => "%{message[8]}"
        "item_id" => "%{message[9]}"
        "current_time" => "%{message[12]}"
      }
      remove_field => [ "message" ]
    }
  }
}

output {
  if [type] == "accesslogs" {
    elasticsearch {
      index => "accesslogs"
      codec => "json"
      hosts => ["node01:9200", "node02:9200", "node03:9200"]
    }
  }

  if [type] == "gamelogs" {
    elasticsearch {
      index => "gamelogs1"
      codec => plain {
        charset => "UTF-16BE"
      }
      hosts => ["node01:9200", "node02:9200", "node03:9200"]
    }
  }
}
bin/logstash -f conf/kafka-es.conf
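What the gamelogs filter does to each event can be sketched in Python: split the message on spaces, copy selected positions into named fields, and drop the original message. The field names and positions come from the configuration above; the sample log line and the helper `parse_gamelog` are made up for the illustration, and this sketch simply leaves a field out when the line is too short.

```python
# Field name -> position in the space-separated message, as in kafka-es.conf.
FIELD_INDEXES = {
    "event_type": 3,
    "current_map": 4,
    "current_X": 5,
    "current_y": 6,
    "user": 7,
    "item": 8,
    "item_id": 9,
    "current_time": 12,
}


def parse_gamelog(event):
    """Return a copy of the event with named fields in place of the raw message."""
    parts = event["message"].split(" ")
    parsed = {key: value for key, value in event.items() if key != "message"}
    for name, position in FIELD_INDEXES.items():
        if position < len(parts):
            parsed[name] = parts[position]
    return parsed


if __name__ == "__main__":
    # Hypothetical log line; real game logs will have their own layout.
    sample = {
        "type": "gamelogs",
        "message": "f0 f1 f2 1 map01 10 20 alice sword 1001 f10 f11 1536789649",
    }
    for key, value in parse_gamelog(sample).items():
        print(key, "=", value)
```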
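4. Modify any data in the files under basedir, and the ES index is created.
5. The data shown in the web page is stored under the configured /data/esdata directory.
6. Search for a given field in the web page
With the default analyzer, Chinese text is indexed one character at a time, so a term query can match only a single Chinese character; a query_string query analyzes the search text first and can therefore match a whole Chinese phrase.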
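The difference between the two query types can be modelled in Python with a toy inverted index. This is a deliberately simplified stand-in and not Elasticsearch code: `analyze` assumes one token per character (roughly what the default analyzer does with Chinese text), `analyzed_query` matches a document if any token matches, and the two documents are invented for the example.

```python
from collections import defaultdict


def analyze(text):
    """Simplified analyzer: one token per non-whitespace character."""
    return [ch for ch in text if not ch.isspace()]


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def term_query(index, term):
    """Look the term up verbatim, without analyzing it."""
    return set(index.get(term, set()))


def analyzed_query(index, text):
    """Analyze the search text, then match documents containing any token."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits


if __name__ == "__main__":
    docs = {1: "游戏日志", 2: "访问日志"}  # hypothetical documents
    index = build_index(docs)
    print(term_query(index, "游"))        # single character: stored as a token
    print(term_query(index, "日志"))      # two characters: no such token
    print(analyzed_query(index, "日志"))  # split into tokens before lookup
```

The two-character term finds nothing because no token of that length was ever stored, while the analyzed query is broken into the same single-character tokens that the index holds.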