Merging multi-line data:
- # with an input plugin:
- # you can also use this codec with an output.
- input {
- file {
- codec => multiline {
- charset => ... # string, one of ["ASCII-8BIT", "Big5", "Big5-HKSCS", "Big5-UAO", "CP949", "Emacs-Mule", "EUC-JP", "EUC-KR", "EUC-TW", "GB18030", "GBK", "ISO-8859-1", "ISO-8859-2", "ISO-8859-3", "ISO-8859-4", "ISO-8859-5", "ISO-8859-6", "ISO-8859-7", "ISO-8859-8", "ISO-8859-9", "ISO-8859-10", "ISO-8859-11", "ISO-8859-13", "ISO-8859-14", "ISO-8859-15", "ISO-8859-16", "KOI8-R", "KOI8-U", "Shift_JIS", "US-ASCII", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE", "Windows-1251", "GB2312", "IBM437", "IBM737", "IBM775", "CP850", "IBM852", "CP852", "IBM855", "CP855", "IBM857", "IBM860", "IBM861", "IBM862", "IBM863", "IBM864", "IBM865", "IBM866", "IBM869", "Windows-1258", "GB1988", "macCentEuro", "macCroatian", "macCyrillic", "macGreek", "macIceland", "macRoman", "macRomania", "macThai", "macTurkish", "macUkraine", "CP950", "CP951", "stateless-ISO-2022-JP", "eucJP-ms", "CP51932", "GB12345", "ISO-2022-JP", "ISO-2022-JP-2", "CP50220", "CP50221", "Windows-1252", "Windows-1250", "Windows-1256", "Windows-1253", "Windows-1255", "Windows-1254", "TIS-620", "Windows-874", "Windows-1257", "Windows-31J", "MacJapanese", "UTF-7", "UTF8-MAC", "UTF-16", "UTF-32", "UTF8-DoCoMo", "SJIS-DoCoMo", "UTF8-KDDI", "SJIS-KDDI", "ISO-2022-JP-KDDI", "stateless-ISO-2022-JP-KDDI", "UTF8-SoftBank", "SJIS-SoftBank", "BINARY", "CP437", "CP737", "CP775", "IBM850", "CP857", "CP860", "CP861", "CP862", "CP863", "CP864", "CP865", "CP866", "CP869", "CP1258", "Big5-HKSCS:2008", "eucJP", "euc-jp-ms", "eucKR", "eucTW", "EUC-CN", "eucCN", "CP936", "ISO2022-JP", "ISO2022-JP2", "ISO8859-1", "CP1252", "ISO8859-2", "CP1250", "ISO8859-3", "ISO8859-4", "ISO8859-5", "ISO8859-6", "CP1256", "ISO8859-7", "CP1253", "ISO8859-8", "CP1255", "ISO8859-9", "CP1254", "ISO8859-10", "ISO8859-11", "CP874", "ISO8859-13", "CP1257", "ISO8859-14", "ISO8859-15", "ISO8859-16", "CP878", "CP932", "csWindows31J", "SJIS", "PCK", "MacJapan", "ASCII", "ANSI_X3.4-1968", "646", "CP65000", "CP65001", "UTF-8-MAC", "UTF-8-HFS", "UCS-2BE", 
"UCS-4BE", "UCS-4LE", "CP1251", "external", "locale"] (optional), default: "UTF-8"
- multiline_tag => ... # string (optional), default: "multiline"
- negate => ... # boolean (optional), default: false
- pattern => ... # string (required)
- patterns_dir => ... # array (optional), default: []
- what => ... # string, one of ["previous", "next"] (required)
- }
- }
- }
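The negate field is a toggle that selects between positive matching (lines that match the pattern are merged) and negated matching (lines that do not match the pattern are merged).
Reference: https://github.com/chenryn/logstash-best-practice-cn/blob/master/codec/multiline.md
Reference: http://www.logstash.net/docs/1.4.2/codecs/multiline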
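The effect of negate together with what => "previous" can be sketched in plain Ruby. This is only a simplified stand-in, not the codec's actual implementation: the helper name merge_multiline and the sample log lines are made up for illustration, and only the "previous" mode is modelled.

```ruby
# Simplified stand-in for the multiline codec's merge logic (hypothetical helper,
# not Logstash code). Only what => "previous" is modelled.
#   pattern: Regexp tested against each incoming line
#   negate:  when true, lines that do NOT match the pattern are continuation lines
def merge_multiline(lines, pattern:, negate: false)
  events = []
  lines.each do |line|
    matched = !!(line =~ pattern)
    continuation = negate ? !matched : matched
    if continuation && !events.empty?
      # Continuation line: append it to the previous event.
      events[-1] = events[-1] + "\n" + line
    else
      # Otherwise the line starts a new event.
      events << line
    end
  end
  events
end

# Made-up sample input: a stack trace between two timestamped lines.
lines = [
  "2015-03-01 10:00:00 ERROR something failed",
  "java.lang.RuntimeException: boom",
  "    at Foo.bar(Foo.java:42)",
  "2015-03-01 10:00:01 INFO recovered",
]

# negate => true: every line that does not start with "2015" joins the previous event.
events = merge_multiline(lines, pattern: /^2015/, negate: true)
puts events.length
```

With negate left at false, the same helper would instead merge the lines that do match the pattern, for example indented lines matched by /^\s/.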
Copying the @timestamp field:
- filter {
- ruby {
- code => "event['read_time'] = event['@timestamp']"
- }
- mutate
- {
- add_field => ["read_time_string", "%{@timestamp}"]
- }
- }
Reference: http://stackoverflow.com/questions/25189872/logstash-how-to-make-a-copy-of-the-timestamp-field-while-maintaining-the-same
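The difference between the two copies can be illustrated outside Logstash. In this sketch a plain Ruby Hash stands in for the event (it is not the Logstash event API), and the ISO 8601 string with milliseconds is an assumed rendering of "%{@timestamp}", chosen only for illustration.

```ruby
require "time"

# A plain Hash stands in for the Logstash event (assumption for illustration).
event = { "@timestamp" => Time.utc(2015, 3, 1, 10, 0, 0) }

# Like the ruby filter: the value itself is copied, so it stays a time object.
event["read_time"] = event["@timestamp"]

# Like mutate/add_field with "%{@timestamp}": the value is rendered into a string.
# The exact format used here is an assumption.
event["read_time_string"] = event["@timestamp"].iso8601(3)

# Overwriting @timestamp later (as a date filter would) leaves both copies untouched.
event["@timestamp"] = Time.utc(2015, 1, 1)

puts event["read_time"].class
puts event["read_time_string"]
```

The first copy keeps its type and can still be compared or formatted as a time; the second is only text.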
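Multi-line matching:
When grok is used together with codec/multiline, there is one thing to watch for: grok regular expressions, like ordinary regular expressions, do not match across line breaks by default (the . metacharacter stops at a newline). Just as you would write =~ //m in Ruby, this has to be enabled explicitly, by adding the (?m) flag at the start of the expression, as shown here:
match => { "message" => "(?m)\s+(?<request_time>\d+(?:\.\d+)?)\s+" }
This passage is quoted from: https://github.com/chenryn/logstash-best-practice-cn/blob/master/filter/grok.md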
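The effect of the (?m) flag can be checked directly in Ruby, whose regular expression syntax the =~ //m comparison refers to. The sample message and the capture names logtime and body are invented for this sketch; it uses plain Ruby regexes rather than grok patterns.

```ruby
# A merged multi-line event, such as the multiline codec might produce (made-up sample).
message = "2015-03-01 10:00:00 ERROR boom\n    at Foo.bar(Foo.java:42)"

# Same expression twice; only the leading (?m) differs.
without_flag = /^(?<logtime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?<body>.*)/
with_flag    = /(?m)^(?<logtime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?<body>.*)/

# Without (?m), ".*" stops at the first newline.
short_body = without_flag.match(message)[:body]

# With (?m), "." also matches newlines, so the stack trace line is captured too.
full_body = with_flag.match(message)[:body]

puts short_body
puts full_body
```

Without the flag, everything after the first line break is silently dropped from the capture, which is easy to miss when testing only with single-line input.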
The final configuration file:
- input {
- file {
- type => "type"
- path => ["info.log"]
- exclude => ["*.gz", "access.log"]
- codec => multiline {
- pattern => "^2015"
- negate => true
- what => "previous"
- }
- }
- }
-
- filter {
- grok {
- match => {
- "message" => "(?m)%{TIMESTAMP_ISO8601:logtime}"
- }
- }
- ruby {
- code => "event['readtime'] = event['@timestamp']"
- }
- date {
- #locale => "en"
- match => ["logtime", "YYYY-MM-dd HH:mm:ss"]
- #timezone => "UTC"
- #target => "logtimestamp"
- remove_field => [ "logtime"]
- }
- }
-
- output {
- stdout {}
- redis {
- host => "127.0.0.1"
- port => 6379
- data_type => "list"
- key => "key_count"
- }
- }
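The filter section of this configuration can be traced step by step with a small Ruby sketch. It is only an approximation under stated assumptions: a Hash stands in for the event, the helper apply_filters is hypothetical, a hand-written regex replaces the TIMESTAMP_ISO8601 pattern, and Time.strptime (which parses in the local time zone) replaces the date filter.

```ruby
require "time"

# Hypothetical helper that mimics the grok -> ruby -> date chain on a Hash
# standing in for the Logstash event.
def apply_filters(event)
  # grok step: capture the leading timestamp into "logtime"
  # (simplified regex, not the real TIMESTAMP_ISO8601 pattern).
  m = /(?m)(?<logtime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})/.match(event["message"])
  event["logtime"] = m[:logtime] if m

  # ruby step: keep the time at which the line was read.
  event["readtime"] = event["@timestamp"]

  # date step: when parsing succeeds, overwrite @timestamp and drop "logtime".
  if event["logtime"]
    event["@timestamp"] = Time.strptime(event["logtime"], "%Y-%m-%d %H:%M:%S")
    event.delete("logtime")
  end
  event
end

read_at = Time.now
event = apply_filters(
  "message" => "2015-03-01 10:00:00 ERROR boom\n    at Foo.bar(Foo.java:42)",
  "@timestamp" => read_at
)
puts event["@timestamp"]
puts event["readtime"]
```

After the chain runs, @timestamp carries the time written in the log line, while readtime still holds the time the line was read.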
Built-in grok patterns: https://github.com/elasticsearch/logstash/blob/v1.4.2/patterns/grok-patterns