1

I want to copy data from one Couchbase cluster to another, either with curl commands or with the SDK libraries. Could someone please let me know whether there are any libraries or APIs that can do this?

We need to copy data selected by a query from one cluster to another, into the same bucket (the source and target bucket names are identical). We are looking for a way to copy the results of a complex query to another cluster, for availability and debugging purposes. We want to do it with the SDK libraries or with a script that we can run in a Jenkins pipeline.

sagar verma

2 Answers

3

You could use the cbq tool together with jq to prepare a JSON file containing an array of documents, which can then be imported with cbimport:

./cbq -u Administrator -p password -e "http://localhost:8091" \
--script="SELECT * FROM \`travel-sample\`.inventory.airline LIMIT 1;" -q | jq '.results' > data.json 

You can then import the generated file using cbimport (cbimport json) with --format=list, which expects the file to contain a JSON array of documents.

  • Is it possible to use cbq and cbimport at the same time? If so, could you please share an example? – sagar verma Nov 24 '21 at 16:26
  • I am getting the result below when running the command: `{ "requestID": "7434da76-80a5-4ffd-8d7c-ba9ec0eb0ee4", "errors": [ { "code": 3000, "msg": "syntax error - at |" } }` – sagar verma Nov 25 '21 at 09:17
  • I think piping into cbimport from cbq should do the trick: `cbq ... | jq ... | cbimport ...`, and if it doesn't work directly you can try specifying "-" (dash, or minus sign) as filename for cbimport -- that is a standard linux convention for commands that tells them to read data from stdin instead of a file. – Dmitrii Chechetkin Dec 21 '21 at 15:10
  • Thanks @Dmitrii. I will try this as well. – sagar verma Dec 21 '21 at 16:51
1

It is possible to use the available SDKs. You will need to write a script or program that follows these steps:

 - Open two connections, one per cluster.
 - Use the first connection to run the query against the source cluster and pull the data.
 - Use the second connection to save the results to the destination cluster.
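
The steps above can be sketched in Python as follows. This is only a minimal sketch, assuming the Couchbase Python SDK 4.x is installed; the host names, credentials, the `WHERE` clause, and the helper names `copy_rows`, `connect`, and `main` are hypothetical. The query selects `META(a).id` alongside the document body so that each document keeps its original key on the destination.

```python
from typing import Any, Iterable, Mapping, Protocol


class SupportsUpsert(Protocol):
    """Anything with an upsert(key, value) method, e.g. an SDK Collection."""

    def upsert(self, key: str, value: Any) -> Any: ...


def copy_rows(rows: Iterable[Mapping[str, Any]], target: SupportsUpsert) -> int:
    """Write each {"id": ..., "doc": ...} row to the target; return the count."""
    copied = 0
    for row in rows:
        # upsert overwrites any existing document with the same key, so this
        # sketch does no conflict resolution and ignores CAS values.
        target.upsert(row["id"], row["doc"])
        copied += 1
    return copied


def connect(host: str, user: str, password: str):
    """Open a cluster connection (assumes Couchbase Python SDK 4.x)."""
    # Imported lazily so copy_rows can be used and tested without the SDK.
    from couchbase.auth import PasswordAuthenticator
    from couchbase.cluster import Cluster
    from couchbase.options import ClusterOptions

    auth = PasswordAuthenticator(user, password)
    return Cluster(f"couchbase://{host}", ClusterOptions(auth))


# Hypothetical selection query; replace the WHERE clause with your own.
QUERY = (
    "SELECT META(a).id AS id, a AS doc "
    "FROM `travel-sample`.inventory.airline AS a "
    "WHERE a.country = 'France'"
)


def main() -> None:
    # Hypothetical hosts and credentials.
    source = connect("source-host", "Administrator", "password")
    dest = connect("dest-host", "Administrator", "password")
    target = dest.bucket("travel-sample").scope("inventory").collection("airline")
    # rows() iterates over the query results from the source cluster.
    count = copy_rows(source.query(QUERY).rows(), target)
    print(count, "documents copied")
```

Calling `main()` from a Jenkins job would run the copy. Keeping `copy_rows` free of SDK imports makes it possible to exercise the copy logic against a stub collection before pointing it at real clusters.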

However, copying data this way brings a number of complications, such as conflict resolution, stale reads, CAS mismatch errors, and memory overflow.

I would suggest an alternative approach: use Couchbase's existing replication mechanism, XDCR (Cross Data Center Replication). It is stable and robust.

You can try out this approach:

 - Run the N1QL query and store the results in a new bucket on the source cluster.
 - Use Couchbase's replication mechanism (XDCR) to replicate the new bucket from the source cluster to the destination cluster.

This way you only need to write a script that runs the N1QL query and writes the documents to a new bucket. Couchbase itself handles all the replication and synchronization.
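
As a rough illustration of this approach, the Python sketch below builds the N1QL statement that fills a staging bucket, plus the two REST requests that register the destination cluster and start the replication. The bucket name `staging`, the reference name `dest`, the hosts, and the helper function names are hypothetical. The requests are only constructed here, not sent, and sending them would additionally need administrator credentials (HTTP basic auth) on the source cluster.

```python
from urllib.parse import urlencode
from urllib.request import Request


def staging_statement(staging_bucket: str, source_keyspace: str, where: str) -> str:
    """N1QL that copies matching documents into the staging bucket, keeping keys."""
    return (
        f"INSERT INTO `{staging_bucket}` (KEY k, VALUE v) "
        f"SELECT META(s).id AS k, s AS v FROM {source_keyspace} AS s WHERE {where}"
    )


def remote_cluster_request(admin_url: str, name: str, target_host: str,
                           user: str, password: str) -> Request:
    """Request that registers the destination cluster as an XDCR reference."""
    body = urlencode({
        "name": name,             # label used later as toCluster
        "hostname": target_host,  # a node of the destination cluster
        "username": user,
        "password": password,
    }).encode()
    return Request(f"{admin_url}/pools/default/remoteClusters", data=body, method="POST")


def replication_request(admin_url: str, from_bucket: str,
                        to_cluster: str, to_bucket: str) -> Request:
    """Request that starts a continuous replication of from_bucket."""
    body = urlencode({
        "fromBucket": from_bucket,
        "toCluster": to_cluster,
        "toBucket": to_bucket,
        "replicationType": "continuous",
    }).encode()
    return Request(f"{admin_url}/controller/createReplication", data=body, method="POST")


# Hypothetical values for illustration only.
stmt = staging_statement("staging", "`travel-sample`.inventory.airline",
                         "s.country = 'France'")
ref = remote_cluster_request("http://localhost:8091", "dest", "dest-host:8091",
                             "Administrator", "password")
rep = replication_request("http://localhost:8091", "staging", "dest", "travel-sample")
print(stmt)
print(ref.full_url)
print(rep.full_url)
```

The same two POST requests can be issued with curl from a pipeline step. The statement can be run with cbq or through an SDK query call.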

Udit Jindal