We allow the client to define custom analyzers when they create an index. We would prefer to specify this in JSON, both for maximum flexibility and so that the underlying ElasticSearch documentation applies directly.

I would like to create an index using an arbitrary description of analyzers, mappers, etc., defined in a JSON string. Using Sense, my command is

PUT /my_index
{
    "settings": 
    {
        "analysis": 
        {
            "char_filter" : 
            {
                "my_mapping" : 
                {
                    "type" : "mapping",
                    "mappings" : [".=>,", "'=>,"]
                }
            },
            "analyzer": 
            {
                "my_analyzer": 
                {
                    "type":         "custom",
                    "tokenizer":    "standard",
                    "filter":       ["lowercase" ],
                    "char_filter" : ["my_mapping"]
                }
            }
        }
    }
}

Ideally, my code would look something like

string json = RetrieveJson();
ElasticSearchClient client = InitializeClient();
client.CreateIndexUsingJson( json ); // this is the syntax I can't figure out

The post here attempts to do this by instantiating an IndexSettings and then calling Add( "analysis", json ), but Add is not a method in the version of the ElasticSearch library I'm using.

The options I can imagine include:

  1. Somehow using the ElasticClient.Raw.IndicesCreatePost or something analogous
  2. Deserializing the json string into an IndexSettings object via IndexSettingsConverter.ReadJson(), and then applying that through ElasticClient.CreateIndex(ICreateIndexRequest)

Both of these mechanisms have very scant documentation.

I'm absolutely trying to avoid the lambda function versions of CreateIndex, since it would be miserable to translate the user's JSON into lambda expressions, only to have NEST immediately translate them back into JSON internally.

Other options or concrete examples of #1 or #2 above are very much appreciated, as is a recommended approach to solving this problem.

mcating

3 Answers


The easiest solution was to implement option #1 from the original question.

public void CreateIndex(string indexName, string json)
{
    ElasticClient client = GetClient();
    var response = client.Raw.IndicesCreatePost(indexName, json);
    if (!response.Success || response.HttpStatusCode != 200)
    {
        throw new ElasticsearchServerException(response.ServerError);
    }
}

After tinkering around with converters and JsonReaders and JsonSerializers, I found that the IndexSettingsConverter didn't seem to properly deserialize arbitrary settings json into a valid IndexSettings object. Sensing a rabbit hole, I took Manolis' suggestion and figured out how to apply the arbitrary json directly against the ElasticClient.IElasticsearchClient to avoid having to reverse-engineer security and connection details.

Reaching this conclusion was painful, and it would have been impossible without working through a whole lot of undocumented NEST code.

mcating

If you want to do something like what you describe above, you could simply use an HttpClient and send the index-creation request directly to your Elasticsearch server. In this case, you can include your JSON in the content of the request.

Try the below:

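// Requires: using System; using System.Net.Http; using System.Text; using System.Threading.Tasks;
public async Task CreateIndex() {
    using (var httpClient = new HttpClient())
    using (var request = new HttpRequestMessage(HttpMethod.Put, new Uri("http://elastic_server_ip/your_index_name"))) {
        var content = @"{ ""settings"" : { ""number_of_shards"" : 1 } }";
        // Declare the body as JSON; Elasticsearch 6.0 and later reject requests without a JSON content type.
        request.Content = new StringContent(content, Encoding.UTF8, "application/json");
        var response = await httpClient.SendAsync(request);
        // Throw if the server did not report success.
        response.EnsureSuccessStatusCode();
    }
}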

This specific snippet creates an index at the specified endpoint with one shard, one replica (the default), and default settings and mappings. Replace the value of the content variable with your JSON.
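The same idea can be sketched with the index name and the JSON passed in as arguments, which is closer to the call shape asked for in the question. This is only a sketch under assumptions: the class name IndexCreator, the method name CreateIndexAsync, and the parameters baseUrl, indexName and json are hypothetical, and authentication or SSL configuration is left to whoever constructs the HttpClient.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class IndexCreator
{
    // Hypothetical helper: baseUrl, indexName and json are illustrative parameters.
    // baseUrl is assumed to be the server root, e.g. "http://elastic_server_ip:9200".
    public static async Task<bool> CreateIndexAsync(
        HttpClient httpClient, string baseUrl, string indexName, string json)
    {
        // Resolve the index name against the server root, escaping it for use in a URL path.
        var uri = new Uri(new Uri(baseUrl), Uri.EscapeDataString(indexName));

        // Send the caller's JSON unchanged as the body of a PUT request.
        using (var content = new StringContent(json, Encoding.UTF8, "application/json"))
        using (var response = await httpClient.PutAsync(uri, content))
        {
            // Treat any 2xx status code as success.
            return response.IsSuccessStatusCode;
        }
    }
}
```

A caller would then write something like `bool created = await IndexCreator.CreateIndexAsync(httpClient, baseUrl, "my_index", json);`. Taking the HttpClient as a parameter keeps connection details in one place instead of rebuilding them on every call.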

Manolis
  • This would work in some cases, but our ElasticSearch instance is secured through SSL and jetty. Rather than creating an HttpClient and then reverse-engineering the permission/connection details, a better answer for us is to use those connection details already embedded into an ElasticClient instance. Thanks for the assist! – mcating Mar 16 '15 at 02:38
  • @mcating There's nothing insecure or that requires reverse-engineering by using direct HTTP calls. This is what we do in all our systems (behind Nginx). It's simply easier to reuse our existing ES knowledge than having to learn a library. There's nothing about this that would only work in "some cases". This is how all the plugins and extensions work, it's how we all interact with ES with curl, and it is what you'll see over the wire with NEST. See my internals doc at https://netfxharmonics.com/2015/11/learningelasticps for more info. – David Betz Oct 13 '16 at 21:51
  • @DavidBetz: My concern is conceptual. By going direct to an HttpClient, you expose yourself to implementation details that could change over time. By sticking with ElasticClient, which conceptually encapsulates all required connection details, the client becomes more resilient to underlying implementation details. – mcating Oct 18 '16 at 22:36
  • @mcating The HTTP interface is exactly the thing that's a core selling point for many people (vs. MongoDB). Also, these are interfaces, not implementation details; these are versioned Web APIs. Your interface is not NEST, your interface is what the HTTP endpoint tells you. This is RESTful architecture 101. I'd love to see NEST deprecated. You talk as if we're talking about "internals" and "private" members. These are public APIs that we should encourage people to use. Seriously, have you never seen the docs? They are about the HTTP API; they aren't "internals" or "implementation details". – David Betz Oct 19 '16 at 00:52
  • @DavidBetz: I think I was unclear in my wording. The "internals" and "reverse-engineering" referred to the custom authentication schemes we configured, implemented and encapsulated in GetClient(), which returned a fully-usable ElasticClient. Switching to a mixed model of ElasticClient/HttpClient would confuse library consumers and increase maintenance costs as connection/configuration/authentication details evolved. We decided to use NEST, so there was strong motivation to have all clients use the same library. The question is answered for me at this point. Thank you for your time! – mcating Oct 19 '16 at 16:29

After updating to Elasticsearch NEST v6.0.2, I had to modify my code, and I wanted to post it here for anyone else who needs it. The changes are passing CreateResponse as the type argument to the function and using the ApiCall property on the response.

public bool CreateIndex(string indexName, string json)
{
    var response = _client.LowLevel.IndicesCreate<CreateResponse>(indexName, json);
    return response.ApiCall.Success;
}  

Hope this saves someone time!

Airn5475