
I have documents that look similar to the one below:

{
    dateTime: /* My time field */,
    message: {
        users: ['1', '2']
    },
    messageType: 'test'
}

I'd like to construct a Timelion series chart that shows a cumulative sum of the number of elements in the array message.users. My first thought was to create a script:

if(doc.containsKey('message.users')) {
    return doc['message.users'].length;
} else {
    return 0;
}

From what I could tell, doc.containsKey('message.users') was always false, which suggests that the field may not have been indexed correctly. I've tried numerous Timelion expressions, all to no avail:

.es(index=logstash-*,timefield='dateTime',q='messageType:UserList').label('Users Online')

I index my documents through the C# NEST API like so:

elasticClient.Index(
    new
    {
        DateTime = DateTime.Now,
        Message = evt.EventArgs.Message,
    },
    idx => idx.Index($"logstash-{evt.MessageCode}"));
Blue
  • Can you share the mapping of your `message.users` field? – Val Jul 27 '17 at 03:54
  • Another good practice is to simply create another field at indexing time called `userCount` which contains the number of users in your `message.users` array. – Val Jul 27 '17 at 04:02
  • @Val, sorry new to elasticsearch. I just used the C# Nest API, and I've updated my answer with how I'm indexing the document. – Blue Jul 27 '17 at 16:30

1 Answer


I suggest adding another field called userCount to your documents so you don't need to mess with scripting (it will also perform better).

So your documents should look like this:

{
    dateTime: /* My time field */,
    message: {
        users: ['1', '2']
    },
    userCount: 2,                  <--- add this field
    messageType: 'test'
}

Solution 1:

You'd need to change your code a tiny bit to this:

elasticClient.Index(
    new
    {
        DateTime = DateTime.Now,
        Message = evt.EventArgs.Message,
        UserCount = evt.EventArgs.Message.Users.Length
    },
    idx => idx.Index($"logstash-{evt.MessageCode}"));

Solution 2:

If you're using ES 5, you can leverage the Ingest API to create a pipeline that automatically adds the userCount field for you. The documents you send stay the same; you only need to reference the pipeline when indexing (for example with the pipeline request parameter).

PUT _ingest/pipeline/user-count-pipeline
{
  "description" : "Creates a new userCount field",
  "processors" : [
    {
      "script": {
        "lang": "painless",
        "inline": "ctx.userCount = ctx.message?.users?.size() ?: 0"
      }
    }
  ]
}
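To make the intent of the pipeline script easier to follow, here is a minimal Python sketch of the same logic applied to plain dictionaries. It is only an illustration of what the Painless one-liner is meant to compute, not how Elasticsearch runs it; the function name add_user_count and the sample documents are made up for this example.

```python
def add_user_count(doc):
    """Mimic the pipeline script: userCount = size of message.users, else 0.

    Hypothetical helper for illustration; Elasticsearch runs the Painless
    script inside the ingest pipeline instead.
    """
    message = doc.get("message") or {}   # like ctx.message?.
    users = message.get("users") or []   # like .users? with the ?: 0 fallback
    enriched = dict(doc)                 # leave the input untouched
    enriched["userCount"] = len(users)
    return enriched


# Made-up sample documents shaped like the one in the question.
with_users = {"message": {"users": ["1", "2"]}, "messageType": "test"}
without_users = {"messageType": "test"}

print(add_user_count(with_users)["userCount"])
print(add_user_count(without_users)["userCount"])
```

A document with no message or no users array ends up with userCount set to 0, which is what the null-safe operators and the elvis fallback in the script are there for.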

Then, in Timelion, it'll be very easy to chart what you need using metric='sum:userCount' to sum the userCount values and the cusum() function to get the cumulative sum of userCount over time. The whole expression would look like this:

.es(index=logstash-*,timefield='dateTime',q='messageType:UserList',metric='sum:userCount').label('Users Online').cusum()
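As a rough illustration of what the expression computes, the Python sketch below first sums userCount per time bucket (the role of metric='sum:userCount') and then takes a running total over the buckets (the role of cusum()). The sample documents, the one-hour bucket size, and the helper names are assumptions made for this example; Timelion chooses the interval itself and also emits empty buckets, which this sketch skips.

```python
from itertools import accumulate

# Hypothetical sample documents; only dateTime and userCount matter here.
docs = [
    {"dateTime": "2017-07-27T03:10:00", "userCount": 2},
    {"dateTime": "2017-07-27T03:40:00", "userCount": 1},
    {"dateTime": "2017-07-27T04:05:00", "userCount": 3},
    {"dateTime": "2017-07-27T06:20:00", "userCount": 1},
]


def hour_bucket(doc):
    """Truncate the ISO timestamp to the hour (assumed bucket size)."""
    return doc["dateTime"][:13]


def bucket_sums(docs, bucket_of):
    """Sum userCount per bucket, roughly what metric='sum:userCount' yields."""
    sums = {}
    for doc in docs:
        key = bucket_of(doc)
        sums[key] = sums.get(key, 0) + doc.get("userCount", 0)
    return sorted(sums.items())


def cusum(series):
    """Running total over the bucket values, like Timelion's cusum()."""
    keys = [key for key, _ in series]
    running = accumulate(value for _, value in series)
    return list(zip(keys, running))


per_hour = bucket_sums(docs, hour_bucket)
cumulative = cusum(per_hour)
print(per_hour)
print(cumulative)
```

With these sample values the per-bucket sums are 3, 3 and 1, so the cumulative series climbs to 3, 6 and 7.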

Using a few sample documents, the time series looks like this, which seems to be what you're looking for.

[Image: Users online]

Val
  • Is it possible to do it without creating the additional field `UserCount`? – Blue Jul 28 '17 at 06:40
  • It would be cool if it was possible to define a scripted field (e.g. `return params['_source'].message.users.size()`) inside Kibana and then use it in Timelion, but unfortunately [it is not supported yet](https://github.com/elastic/kibana/issues/9022). – Val Jul 28 '17 at 07:02
  • What about indexing the user array, and using a function to get the size of the users array? – Blue Jul 31 '17 at 03:20
  • If you're using ES 5, I have another solution. See my updated answer. – Val Jul 31 '17 at 03:36