
How can I copy data from one index to another index in Elasticsearch (where the indices may be on the same host or on different hosts)?

Output

Reading config file {:file=>"logstash/agent.rb", :level=>:debug, :line=>"309", :method=>"local_config"}
Compiled pipeline code:
        @inputs = []
        @filters = []
        @outputs = []
        @periodic_flushers = []
        @shutdown_flushers = []

          @input_elasticsearch_1 = plugin("input", "elasticsearch", LogStash::Util.hash_merge_many({ "hosts" => ("input hostname") }, { "port" => ("9200") }, { "index" => (".kibana") }, { "size" => 500 }, { "scroll" => ("5m") }, { "docinfo" => ("true") }))

          @inputs << @input_elasticsearch_1

          @output_elasticsearch_2 = plugin("output", "elasticsearch", LogStash::Util.hash_merge_many({ "host" => ("output hostname") }, { "port" => 9200 }, { "protocol" => ("http") }, { "manage_template" => ("false") }, { "index" => ("order-logs-sample") }, { "document_type" => ("logs") }, { "document_id" => ("%{id}") }, { "workers" => 1 }))

          @outputs << @output_elasticsearch_2

  def filter_func(event)
    events = [event]
    @logger.debug? && @logger.debug("filter received", :event => event.to_hash)
    events
  end
  def output_func(event)
    @logger.debug? && @logger.debug("output received", :event => event.to_hash)
    @output_elasticsearch_2.handle(event)
    
  end {:level=>:debug, :file=>"logstash/pipeline.rb", :line=>"29", :method=>"initialize"}
Plugin not defined in namespace, checking for plugin file {:type=>"input", :name=>"elasticsearch", :path=>"logstash/inputs/elasticsearch", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"133", :method=>"lookup"}
Plugin not defined in namespace, checking for plugin file {:type=>"codec", :name=>"json", :path=>"logstash/codecs/json", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"133", :method=>"lookup"}
config LogStash::Codecs::JSON/@charset = "UTF-8" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@hosts = ["ccwlog-stg1-01"] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@port = 9200 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@index = ".kibana" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@size = 500 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@scroll = "5m" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@docinfo = true {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@debug = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@codec = <LogStash::Codecs::JSON charset=>"UTF-8"> {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@add_field = {} {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@query = "{\"query\": { \"match_all\": {} } }" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@scan = true {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@docinfo_target = "@metadata" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@docinfo_fields = ["_index", "_type", "_id"] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@ssl = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
Plugin not defined in namespace, checking for plugin file {:type=>"output", :name=>"elasticsearch", :path=>"logstash/outputs/elasticsearch", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"133", :method=>"lookup"}
'[DEPRECATED] use `require 'concurrent'` instead of `require 'concurrent_ruby'`
[2016-01-22 03:49:34.451]  WARN -- Concurrent: [DEPRECATED] Java 7 is deprecated, please use Java 8.
Java 7 support is only best effort, it may not work. It will be removed in next release (1.0).
Plugin not defined in namespace, checking for plugin file {:type=>"codec", :name=>"plain", :path=>"logstash/codecs/plain", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"133", :method=>"lookup"}
config LogStash::Codecs::Plain/@charset = "UTF-8" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@host = ["ccwlog-stg1-01"] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@port = 9200 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@protocol = "http" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@manage_template = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@index = "order-logs-sample" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@document_type = "logs" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@document_id = "%{id}" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@workers = 1 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@type = "" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@tags = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@exclude_tags = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@codec = <LogStash::Codecs::Plain charset=>"UTF-8"> {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@template_name = "logstash" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@template_overwrite = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@embedded = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@embedded_http_port = "9200-9300" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@max_inflight_requests = 50 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@flush_size = 5000 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@idle_flush_time = 1 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@action = "index" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@path = "/" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@ssl = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@ssl_certificate_verification = true {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@sniffing = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@max_retries = 3 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@retry_max_items = 5000 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@retry_max_interval = 5 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
Normalizing http path {:path=>"/", :normalized=>"/", :level=>:debug, :file=>"logstash/outputs/elasticsearch.rb", :line=>"342", :method=>"register"}
Create client to elasticsearch server on ccwlog-stg1-01: {:level=>:info, :file=>"logstash/outputs/elasticsearch.rb", :line=>"422", :method=>"register"}
Plugin is finished {:plugin=><LogStash::Inputs::Elasticsearch hosts=>["ccwlog-stg1-01"], port=>9200, index=>".kibana", size=>500, scroll=>"5m", docinfo=>true, debug=>false, codec=><LogStash::Codecs::JSON charset=>"UTF-8">, query=>"{\"query\": { \"match_all\": {} } }", scan=>true, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>, :level=>:info, :file=>"logstash/plugin.rb", :line=>"61", :method=>"finished"}
New Elasticsearch output {:cluster=>nil, :host=>["ccwlog-stg1-01"], :port=>9200, :embedded=>false, :protocol=>"http", :level=>:info, :file=>"logstash/outputs/elasticsearch.rb", :line=>"439", :method=>"register"}
Pipeline started {:level=>:info, :file=>"logstash/pipeline.rb", :line=>"87", :method=>"run"}
Logstash startup completed
output received {:event=>{"title"=>"logindex", "timeFieldName"=>"@timestamp", "fields"=>"[{\"name\":\"caller\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"_source\",\"type\":\"_source\",\"count\":0,\"scripted\":false,\"indexed\":false,\"analyzed\":false,\"doc_values\":false},{\"name\":\"exception\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"type\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"@version\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"serviceName\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"_type\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":false,\"doc_values\":false},{\"name\":\"_id\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":false,\"analyzed\":false,\"doc_values\":false},{\"name\":\"userId\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"path\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"orderId\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"dc\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"tags\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"host\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"_index\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":false,\"analyzed\"
:false,\"doc_values\":false},{\"name\":\"elapsedTime\",\"type\":\"number\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":false,\"doc_values\":false},{\"name\":\"message\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"@timestamp\",\"type\":\"date\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":false,\"doc_values\":false},{\"name\":\"performanceRequest\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false}]", "@version"=>"1", "@timestamp"=>"2016-01-22T11:49:35.268Z"}, :level=>:debug, :file=>"(eval)", :line=>"21", :method=>"output_func"}
sri

3 Answers


You should look at the scan and scroll documentation for this functionality.

You retrieve data from the old index with a given query and size parameter, and then bulk-index the results into the new index. Client libraries in different languages provide wrappers that make reindexing easy.

For example, I use Python, and its client provides a reindex helper that uses the scan-and-scroll approach.
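To make the scan-and-scroll idea concrete, here is a minimal standard-library sketch of what such helpers do under the hood (the official elasticsearch-py client wraps the same steps in `helpers.scan`/`helpers.bulk`/`helpers.reindex`). The host and index names are placeholders, and the `_type` field reflects the pre-7.x mapping-type era this question is from:

```python
# Sketch of reindexing via the scan/scroll and _bulk APIs using only the
# standard library. Host and index names are placeholder assumptions.
import json
from urllib import request


def bulk_lines(hits, target_index):
    """Turn a page of scroll hits into _bulk API action/source line pairs."""
    lines = []
    for hit in hits:
        # Action line: index each document into the target index,
        # preserving its type and id.
        lines.append(json.dumps({"index": {
            "_index": target_index,
            "_type": hit["_type"],
            "_id": hit["_id"],
        }}))
        # Source line: the document body itself.
        lines.append(json.dumps(hit["_source"]))
    return "\n".join(lines) + "\n"


def copy_index(host, source_index, target_index, scroll="5m", size=500):
    """Scan the source index page by page and bulk-index into the target."""
    url = "http://%s/%s/_search?scroll=%s&size=%d" % (host, source_index, scroll, size)
    body = json.dumps({"query": {"match_all": {}}}).encode()
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    page = json.load(request.urlopen(req))
    while page["hits"]["hits"]:
        data = bulk_lines(page["hits"]["hits"], target_index).encode()
        bulk = request.Request("http://%s/_bulk" % host, data=data,
                               headers={"Content-Type": "application/x-ndjson"})
        request.urlopen(bulk)
        # Fetch the next page with the scroll id returned by the last call.
        nxt = request.Request(
            "http://%s/_search/scroll" % host,
            data=json.dumps({"scroll": scroll,
                             "scroll_id": page["_scroll_id"]}).encode(),
            headers={"Content-Type": "application/json"})
        page = json.load(request.urlopen(nxt))
```

In practice you would just call the client's reindex helper, which also handles retries and chunking; this sketch only shows the shape of the requests involved.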

ChintanShah25

A simple way to do this is to use Logstash with an elasticsearch input plugin and an elasticsearch output plugin.

The benefit of this solution is that you don't have to write boilerplate scan/scroll and bulk re-indexing code yourself, which is exactly what Logstash already provides.

After installing Logstash, you can create a configuration file copy.conf that looks like this:

input {
  elasticsearch {
   hosts => ["localhost:9200"]                   <--- source ES host
   index => "source_index"
  }
}
filter {
 mutate {
  remove_field => [ "@version", "@timestamp" ]   <--- remove fields added by Logstash
 }
}
output {
 elasticsearch {
   hosts => ["localhost:9200"]                   <--- target ES host
   manage_template => false
   index => "target_index"
   document_id => "%{id}"                        <--- name of your ID field
   workers => 1
 }
}

Then, after setting the correct values (source/target host and source/target index), you can run it with bin/logstash -f copy.conf

Val
  • Were you able to try this out? – Val Dec 10 '15 at 13:38
  • Hi Val, I tried the above thing but facing below issue : Error: [400] {"error":"SearchPhaseExecutionException[Failed to execute phase [init_scan], all shards failed; shardFailures {[eC3o5Cd7QUiR7NsSjUv10g][index][0]: RemoteTransportException[[ccwlog-stg2-01][inet[/64.102.202.10:9300]][indices:data/read/search[phase/scan]]]; nested: SearchParseException[[ccw-order-index][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [_na_]]]; nested: ElasticsearchParseException[Failed to derive xcontent from org.elasticsearch.common.bytes.ChannelBufferBytesReference@49]; }{[eC3o5Cd7QUiR7NsSjUv10g – sri Jan 22 '16 at 10:37
  • Can u help about this @Val A plugin had an unrecoverable error. Will restart this plugin. Plugin: ["host name:9200"], index=>"index", query=>"*", size=>500, scroll=>"5m", docinfo=>true, debug=>false, codec=>"UTF-8">, port=>9200, scan=>true, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false> – sri Jan 22 '16 at 10:40
  • Are you sure you have the correct hostname in your ES input? i.e. `"host name:9200"` – Val Jan 22 '16 at 10:44
  • I have it like this : host => "host name" port => 9200 I tried this also hosts => "hostname:9200" – sri Jan 22 '16 at 10:55
  • Can you run logstash with `--debug` and update your question with the output you get? – Val Jan 22 '16 at 11:00
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/101398/discussion-between-sri-and-val). – sri Jan 22 '16 at 11:52
  • there is still so much output that I haven't added here – sri Jan 22 '16 at 11:58

Elasticsearch provides a Reindex API that helps copy data from one index to another. Be aware, however, that _reindex does not attempt to set up the destination index and does not copy the settings of the source index. You should set up the destination index before running a _reindex action, including mappings, shard counts, replicas, etc.

For more information about the Reindex API, see https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html

Example:

POST _reindex
{
  "source": { "index": "old index name" },
  "dest": { "index": "new index name" }
}
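Since the question also asks about different hosts: _reindex additionally accepts a "remote" source, provided the remote host is whitelisted in the destination cluster's reindex.remote.whitelist setting. Below is a hedged standard-library sketch; the function names and hosts are illustrative, not part of any official client:

```python
# Build and send a _reindex request body; supports the "remote" source
# form for copying from a different host. Names and hosts are placeholders.
import json
from urllib import request


def reindex_body(source_index, dest_index, remote_host=None):
    """Build the JSON body for a POST /_reindex call."""
    source = {"index": source_index}
    if remote_host:
        # Remote reindex: the destination cluster pulls documents from
        # another cluster (requires reindex.remote.whitelist on the dest).
        source["remote"] = {"host": remote_host}
    return {"source": source, "dest": {"index": dest_index}}


def run_reindex(host, body):
    """POST the body to the destination cluster's _reindex endpoint."""
    req = request.Request("http://%s/_reindex" % host,
                          data=json.dumps(body).encode(),
                          headers={"Content-Type": "application/json"})
    return json.load(request.urlopen(req))
```

For a same-host copy you would call run_reindex("localhost:9200", reindex_body("old index name", "new index name")); for a cross-host copy, pass remote_host="http://source-host:9200" (a placeholder) and run it against the destination cluster.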

Gaurav