I am trying to find a way in the Fluent Bit config to make Elasticsearch store plain JSON-formatted logs (the `log` field below, which comes from Docker stdout/stderr) in a structured way; please see the image at the bottom for a better explanation. For example, apart from (or along with) storing the log as a plain JSON string under the `log` field, I would like to store each property individually, as shown in red.

The documentation for Filters and Parsers is really poor and unclear. On top of that, the forward input doesn't have a "parser" option. I tried the json/docker/regex parsers but had no luck. My regex is here if I have to use regex. I am currently using ES (7.1), Fluent-bit (1.1.3) and Kibana (7.1), not Kubernetes.

If anyone can direct me to an example or give one, I would much appreciate it.

Thanks

{
  "_index": "hello",
  "_type": "logs",
  "_id": "T631e2sBChSKEuJw-HO4",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2019-06-21T21:34:02.000Z",
    "tag": "php",
    "container_id": "53154cf4d4e8d7ecf31bdb6bc4a25fdf2f37156edc6b859ba0ddfa9c0ab1715b",
    "container_name": "/hello_php_1",
    "source": "stderr",
    "log": "{\"time_local\":\"2019-06-21T21:34:02+0000\",\"client_ip\":\"-\",\"remote_addr\":\"192.168.192.3\",\"remote_user\":\"\",\"request\":\"GET / HTTP/1.1\",\"status\":\"200\",\"body_bytes_sent\":\"0\",\"request_time\":\"0.001\",\"http_referrer\":\"-\",\"http_user_agent\":\"curl/7.38.0\",\"request_id\":\"91835d61520d289952b7e9b8f658e64f\"}"
  },
  "fields": {
    "@timestamp": [
      "2019-06-21T21:34:02.000Z"
    ]
  },
  "sort": [
    1561152842000
  ]
}

conf

[SERVICE]
    Flush        5
    Daemon       Off
    Log_Level    debug
    Parsers_File parsers.conf

[INPUT]
    Name   forward
    Listen 0.0.0.0
    Port   24224

[OUTPUT]
    Name  es
    Match hello_*
    Host  elasticsearch
    Port  9200
    Index hello
    Type  logs
    Include_Tag_Key On
    Tag_Key tag

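[Image: the desired document, with each property of `log` stored as an individual field, highlighted in red]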

BentCoder

2 Answers

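The solution is as follows. A `parser` filter applies the `docker` JSON parser to the `log` key; `Reserve_Data On` keeps the other fields of the record, and `Preserve_Key On` keeps the original `log` string alongside the parsed fields.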

[SERVICE]
    Flush        5
    Daemon       Off
    Log_Level    debug
    Parsers_File parsers.conf

[INPUT]
    Name         forward
    storage.type filesystem
    Listen       my_fluent_bit_service
    Port         24224

[FILTER]
    Name         parser
    Parser       docker
    Match        hello_*
    Key_Name     log
    Reserve_Data On
    Preserve_Key On

[OUTPUT]
    Name            es
    Host            my_elasticsearch_service
    Port            9200
    Match           hello_*
    Index           hello
    Type            logs
    Include_Tag_Key On
    Tag_Key         tag

parsers.conf

[PARSER]
    Name         docker
    Format       json
    Time_Key     time
    Time_Format  %Y-%m-%dT%H:%M:%S.%L
    Time_Keep    On
    # Command      |  Decoder | Field | Optional Action
    # =============|==================|=================
    Decode_Field_As   escaped_utf8    log    do_next
    Decode_Field_As   json       log
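
To check what the parser filter actually produces before it reaches Elasticsearch, a second output can print the records to Fluent Bit's own stdout. This is a minimal sketch that is not part of the original answer; it assumes the same `hello_*` tags as above and is meant for debugging only.

```ini
# fluent-bit.conf (debugging aid, assumed addition)
# Prints every record matching hello_* to Fluent Bit's stdout,
# in addition to sending it to the es output.
[OUTPUT]
    Name  stdout
    Match hello_*
```

If the filter works as intended, each printed record should show the individual keys (such as `status` or `request`) next to the original `log` string.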

BentCoder

  • Thank you very much for this answer. The documentation is simply horrendous. One question: how is your `log` entry ultimately decoded? I get a line of key=value (such as `name=john age=27 city=paris`) and not a decoded structure (it is not a JSON string anymore, but not a structure visible by Kibana either) – WoJ May 30 '20 at 18:53
  • Not sure if I understand exactly what you mean, but my application logs are in JSON format by default. So your example would be `{"name":"john","age":"27","city":"paris"}` if it was my application. Afterwards this whole string would also look the same in Kibana under the `log` key, as shown above in the image. I hope it helps. Also have a look at [this](http://www.inanzzz.com/index.php/post/rel5/using-fluent-bit-to-forward-docker-php-fpm-and-nginx-logs-to-elasticsearch) for a much more detailed example. – BentCoder May 30 '20 at 19:31
  • Sorry for not having been clear. I used to have `{"name":"john","age":"27","city":"paris"}` as the `message` entry in my log, displayed as such by Kibana. I was hoping that this entry could be decoded by Fluent Bit so that it goes to Elasticsearch as a true JSON entry, and so that I have the keys `name`, `age` and `city` as fields (at the same level as your entries `tag` or `source`). – WoJ May 30 '20 at 19:56
  • (cont'd) What I have is still a `message` entry which is now `name=john age=27 city=paris` (instead of the JSON string representation before). I was wondering if this is the expected behaviour (which makes the decoder useless because I cannot search on key `city` for instance) – WoJ May 30 '20 at 19:56
  • In other words, the entry under `message` has been rewritten from the string `{"name":"john","age":"27","city":"paris"}` into the string `name=john age=27 city=paris`, which is not the parsing I expected (→ to "explode" the JSON string into actual fields for Kibana) – WoJ May 30 '20 at 20:04
  • Also thanks for your link - I see that what the author got at the very end is exactly what I am looking for - so this must be something on my side. I get a JSON → key/value pairs translation instead of the expected parsing. – WoJ May 30 '20 at 20:07
  • If your app log is a JSON-formatted string, you should have a `log` field in Kibana that contains the original JSON **as is**. On top of that, you should also have `name`, `age` and `city` as individual fields. All of this depends on the parser, so if you used the very last parser in that blog (_same as the one I have above_), it should work. Pay attention to the filter bit in the `fluent-bit.conf` file as well. – BentCoder May 30 '20 at 20:15
  • Hey @BentCoder, what if I have to parse a `log` field that is not JSON but a one-line string such as "2020-07-11 10:55:38,022 - INFO kv : 1" if they are strictly aligned, 0 if not? What changes do we need to make on the filter side? – Aman Kumar Soni Jul 12 '20 at 10:50
  • @AmanKumarSoni, you need to use `format regex` (with the named capture feature) for that: https://docs.fluentbit.io/manual/pipeline/parsers/regular-expression – Mark Rajcok Feb 05 '22 at 19:35
  • How do I name `Key_Data` if the key is nested? In my case it's `log_processed['message']`. I tried `log_processed['message']`, `log_processed.message` and `log_processed_message`; none of these work. – Reda Drissi Mar 10 '22 at 15:35
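
For the single-line example raised in the comments above ("2020-07-11 10:55:38,022 - INFO kv : 1"), a regex parser with named captures could look roughly like the sketch below. The parser name `app_line` and the capture names `time`, `level` and `message` are hypothetical, and the pattern only assumes the layout visible in that one sample line; timestamp handling (`Time_Key`, `Time_Format`) is left out on purpose.

```ini
# parsers.conf (assumed addition)
# Named captures become keys in the parsed record.
[PARSER]
    Name   app_line
    Format regex
    Regex  ^(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (?<level>[A-Z]+) (?<message>.*)$

# fluent-bit.conf
# Same filter as in the answer, pointing at the regex parser instead.
[FILTER]
    Name         parser
    Match        hello_*
    Key_Name     log
    Parser       app_line
    Reserve_Data On
```

With this in place, the sample line would be expected to yield `level` = `INFO` and `message` = `kv : 1`; lines that do not match the pattern are left unparsed.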

You can use the Fluent Bit Nest filter for that purpose. Please refer to the following documentation:

https://docs.fluentbit.io/manual/filter/nest
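
Note that the Nest filter works on keys whose value is already a map, rather than on a JSON string like the `log` field in the question, so on its own it would not split that string into fields. As a rough sketch of what it does cover, the block below lifts the children of a nested map to the top level. The key `log_processed` is borrowed from a comment on the other answer and is an assumption here, as is the `log_processed_` prefix.

```ini
# fluent-bit.conf (assumed addition)
# Moves the keys found under the map "log_processed" up one level,
# prefixing each one, e.g. message -> log_processed_message.
[FILTER]
    Name         nest
    Match        hello_*
    Operation    lift
    Nested_under log_processed
    Add_prefix   log_processed_
```

A later `parser` filter could then refer to the flat key (for example `Key_Name log_processed_message`) if that value still needs parsing.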

edsiper

  • OP: "The documentation for Filters and Parsers are really poor and not clear." I've spent a good amount of time with the docs, which is the reason I ended up with this question. – BentCoder Jul 29 '19 at 18:39
  • The documentation is EXTREMELY lacking – Shōgun8 Feb 15 '20 at 05:43