I am using Logstash to run an update by query against existing Elasticsearch documents, adding a field that contains aggregated values extracted from a PostgreSQL table. I use the elasticsearch output to load one index using document_id, and the http output to update another index that has a different document_id, but I am receiving errors:

[2023-02-08T17:58:12,086][ERROR][logstash.outputs.http ][main][b64f19821b11ee0df1bd165920785876cd6c5fab079e27d39bb7ee19a3d642a4] [HTTP Output Failure] Encountered non-2xx HTTP code 400 {:response_code=>400, :url=>"http://localhost:9200/medico/_update_by_query", :event=>#LogStash::Event:0x19a14c08}

This is my pipeline configuration:

input {
    jdbc {
        # Postgres jdbc connection string to our database, mydb
        jdbc_connection_string => "jdbc:postgresql://handel:5432/mydb"
        statement_filepath => "D:\ProgrammiUnsupported\logstash-7.15.2\config\nota_sede.sql"
    }
}

filter {
    aggregate {
        task_id => "%{idCso}"
        code => "
            map['idCso'] = event.get('idCso')
            map['noteSede'] ||= []
            map['noteSede'] << {
                'id' => event.get('idNota'),
                'tipo' => event.get('tipoNota'),
                'descrizione' => event.get('descrizione'),
                'data' => event.get('data'),
                'dataInizio' => event.get('dataInizio'),
                'dataFine' => event.get('dataFine')
            }
            event.cancel()"
        push_previous_map_as_event => true
        timeout => 60
        timeout_tags => ['_aggregatetimeout']       
    }
}

output {

    stdout { codec => rubydebug { metadata => true } }

    # this works
    elasticsearch {
        hosts => "https://localhost:9200"
        document_id => "STRUTTURA_%{idCso}" 
        index => "struttura"
        action => "update"
        user => "user"
        password => "password"
        ssl => true
        cacert => "/usr/share/logstash/config/ca.crt"   
    }
    
    http {
        url => "http://localhost:9200/medico/_update_by_query"
        user => "elastic"
        password => "changeme"
        http_method => "post"
        format => "message"
        content_type => "application/json"
        message => '{
                        "query":{
                            "term":{
                                "idCso":"%{idCso}"
                            }
                        },
                        "script":{
                            "source":"ctx._source.noteSede=params.noteSede",
                            "lang":"painless",
                            "params":{
                                "noteSede":"%{noteSede}"
                            }
                        }
                    }'
    }
}

The stdout output shows the documents sent to the output, like this:

{
     "query" => {
        "term" => {
            "idCso" => "859119"
        }
    },
    "script" => {
        "source" => "ctx._source.noteSede=params.noteSede",
        "lang" => "painless",
        "params" => {
            "noteSede" => "{dataFine=null, dataInizio=2020-02-13, descrizione=?, tipo=DB, id=6390644, data=2020-02-13 12:26:58.409},{dataFine=null, dataInizio=2020-02-13, descrizione=?, tipo=DE, id=6390645, data=2020-02-13 12:26:58.41}"
        }
        }
    }
}

How can I pass the noteSede array field into the message sent to _update_by_query?

Carlitoz
  • What happens if you try to run the same update by query from Kibana Dev Tools? What response do you get? Can you find any error logs in the ES server logs? – Val Feb 09 '23 at 08:40
  • Hi @Val, from Postman the following command works. { "query": { "term": { "idCso":22868 } }, "script" : { "source" : "ctx._source.noteSede=params.noteSede", "lang" : "painless", "params": { "noteSede" : [ {"dataInizio":"2020-02-13", "descrizione":"?", "tipo":"DB", "id":6390644, "data":"2020-02-13 12:26:58.409"}, {"dataInizio":"2020-02-13", "descrizione":"?", "tipo":"DE", "id":6390645, "data":"2020-02-13 12:26:58.41"} ] } } } – Carlitoz Feb 09 '23 at 12:22
  • I think the problem stems from the fact that you're stringifying `noteSede` (i.e. `"noteSede": "%{noteSede}"`), whereas when you execute via Postman, you're not (i.e. `"noteSede" : [ {"d`) – Val Feb 09 '23 at 12:54
  • @Val I added timeout_code => "event.set('noteSede', event.get('noteSede').to_json)" but I am still getting this error from Logstash: [2023-02-09T13:47:02,441][ERROR][logstash.outputs.http ][main][432ed526006441ae331331c675bb9d8c124a6fd12ca1f2b274888cebc7ef5233] [HTTP Output Failure] Encountered non-2xx HTTP code 406 {:response_code=>406, :url=>"http://localhost:9200/struttura/_update_by_query", :event=>#} No errors were found in the Elasticsearch server logs. – Carlitoz Feb 09 '23 at 14:24
  • Have you tried removing the double quotes ? `"noteSede":"%{noteSede}"` => `"noteSede":%{noteSede}` – Val Feb 09 '23 at 14:34
  • Yes, I get the same error. I also tried with a static value, "noteSede":"xxx", and it doesn't work either. – Carlitoz Feb 09 '23 at 16:27
  • Can you run this so we get some [debug logging](https://github.com/logstash-plugins/logstash-output-http/blob/main/lib/logstash/outputs/http.rb#L278-L284) in Logstash: `curl -XPUT 'localhost:9600/_node/logging?pretty' -H 'Content-Type: application/json' -d'{ "logger.logstash.outputs.http" : "DEBUG"}'` – Val Feb 09 '23 at 16:59
  • The debug log adds this additional info at the beginning: [2023-02-10T09:33:37,545][DEBUG][logstash.outputs.http ] config LogStash::Outputs::Http/@message = "{\n\t\t\t\t\t\t\"query\":{\n\t\t\t\t\t\t\t\"term\":{\n\t\t\t\t\t\t\t\t\"idCso\":\"%{idCso}\"\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t},\n\t\t\t\t\t\t\"script\":{\n\t\t\t\t\t\t\t\"source\":\"ctx._source.noteSede = params.noteSede\",\n\t\t\t\t\t\t\t\"lang\":\"painless\",\n\t\t\t\t\t\t\t\"params\":{\n\t\t\t\t\t\t\t\t\"noteSede\":\"xxx\"\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t}\n\t\t\t\t\t}" – Carlitoz Feb 10 '23 at 09:33
  • It would be nice if you could share the full log (e.g. via [gist](https://gist.github.com/)) – Val Feb 10 '23 at 09:50
  • Hi, I have shared the configuration and log at this [link](https://gist.github.com/Carlitoz72/3323cbfd3a666d45f20112798f841f5a) – Carlitoz Feb 16 '23 at 10:08
  • Thanks, but can you add the DEBUG logs to that gist? – Val Feb 16 '23 at 10:22
  • Sorry, I have updated the log configuration with debug enabled: [link](https://gist.github.com/Carlitoz72/3323cbfd3a666d45f20112798f841f5a) – Carlitoz Feb 16 '23 at 16:12
  • It doesn't contain the DEBUG messages from `logstash.outputs.http` like you had in your comment above. You need to re-run the curl above every time you restart Logstash – Val Feb 16 '23 at 16:14
  • @Val I cannot apply debug logging with curl because Logstash crashes as soon as it's started. – Carlitoz Feb 20 '23 at 13:12

1 Answer

I found a workaround: a ruby filter builds the script params so that noteSede stays an array, and the format of the http output is set to json. The code could probably be optimized, but it works!

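ruby {
    code => '
        # Wrap the aggregated noteSede array in the params hash used by the script
        temp = event.get("noteSede")
        note = {"noteSede" => temp}
        # Creates [script][params][noteSede] as a real array, not a string
        event.set("script", "params" => note)
    '
}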
mutate {
    add_field => {
        "[script][lang]" => "painless"
        "[query][term][idSede]" => '%{idSede}'
        "[script][source]" => "ctx._source.noteSede = params.noteSede"
    }
    remove_field => ["tags", "idSede", "noteSede", "@version", "@timestamp"]            
}   
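
For completeness, the matching http output could look roughly like the sketch below. It is not copied from my real pipeline: the URL and credentials are the placeholders from the question, so adjust them to your index. It relies on the filter above leaving only the `query` and `script` fields on the event, because with `format => "json"` the plugin serializes the whole event as the request body, and any leftover field would likely be rejected by `_update_by_query`.

```
output {
    http {
        # URL and credentials are placeholders taken from the question
        url => "http://localhost:9200/medico/_update_by_query"
        user => "elastic"
        password => "changeme"
        http_method => "post"
        # "json" sends the event itself as the JSON body, so noteSede
        # stays a real JSON array instead of an interpolated string
        format => "json"
        content_type => "application/json"
    }
}
```

Compared with `format => "message"`, no `%{noteSede}` interpolation takes place, which is what turned the array into a single string in the original attempt.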

Bye

Carlitoz