
I have all keyspaces and tables copied from another Cassandra data folder. How can I restore them on my Cassandra node?

I don't have the snapshots that are normally required for a restore.

Vikas Kumar

2 Answers


You might be able to do this with the Cassandra Bulk Loader.

Assuming a packaged install (with default data and bin locations), try this from one of your nodes:

$ sstableloader -d hostname1,hostname2 /var/lib/cassandra/data/yourKeyspaceName/tableName/

Check out the documentation on the Bulk Loader for more details.
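sstableloader takes the target keyspace and table from the directory path it is given, so the layout of the copied folder matters. Here is a minimal Python sketch of that mapping, assuming the path ends in <keyspace>/<table> and that the table directory may carry a -<32 hex characters> suffix, as in the example1-55639910... directory that appears in the other answer. names_from_path is a hypothetical helper for illustration, not part of Cassandra:

```python
import re
from pathlib import PurePosixPath


def names_from_path(table_dir):
    """Return (keyspace, table) read from the last two path components."""
    path = PurePosixPath(table_dir)  # a trailing slash is ignored
    keyspace, table = path.parent.name, path.name
    # Assumption: some layouts name the directory <table>-<32 hex chars>;
    # strip that suffix to recover the plain table name.
    table = re.sub(r"-[0-9a-f]{32}$", "", table)
    return keyspace, table


print(names_from_path("/var/lib/cassandra/data/yourKeyspaceName/tableName/"))
print(names_from_path(
    "/var/lib/cassandra/data/test/example1-55639910d46a11e4b4335dbb0aaeeb24/"))
```

Checking the derived names before loading makes it easier to spot a keyspace or table that does not yet exist on the target node.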

Aaron
  • Running sstableloader -d 127.0.0.1 "cassandradatapath/myKeyspace/tableName/" gave: Could not retrieve endpoint ranges: InvalidRequestException(why:No such keyspace: mykeyspace) Run with --debug to get full stack trace or --help to get help. I then created the keyspace and tried again, which gave: Established connection to initial hosts; Opening sstables and calculating sections to stream; Streaming session ID: eb09a290-d3eb-11e4-aae5-09ece297a2b4. I restarted the service but still cannot see the tables in the keyspace. – Vikas Kumar Mar 26 '15 at 19:13

You can do this but:

  1. You need to know the schema for all the tables you are restoring. If you don't know it, use sstable2json (example below, but this can be tricky and requires understanding how sstable2json formats things).

  2. You will have to start a new node, create the keyspace and its tables using the schema derived in step 1, and then use the BulkLoader as described in the docs linked by Aaron (BryceAtNetwork23).
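The ordering of those two steps can be sketched as a small Python helper that only builds the commands, without running them. The assumptions here are mine and not from the answer: the keyspace uses SimpleStrategy with a replication factor of 1 (a single test node), the recovered CREATE TABLE statements live in a hypothetical file called schema.cql, and cqlsh is invoked with its -e and -f options. restore_plan is an illustrative name:

```python
def restore_plan(keyspace, schema_file, hosts, table_dirs):
    """Return the commands to run, in order, as argv lists."""
    # Assumption: single-node replication settings; adjust for a real cluster.
    create_keyspace = (
        "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = "
        "{'class': 'SimpleStrategy', 'replication_factor': 1};" % keyspace
    )
    plan = [
        ["cqlsh", "-e", create_keyspace],  # 1. the keyspace must exist first
        ["cqlsh", "-f", schema_file],      # 2. then the tables (schema from step 1)
    ]
    # 3. Only now stream each table directory with the bulk loader.
    for table_dir in table_dirs:
        plan.append(["sstableloader", "-d", ",".join(hosts), table_dir])
    return plan


for command in restore_plan(
        "test", "schema.cql", ["127.0.0.1"],
        ["/var/lib/cassandra/data/test/example1"]):
    print(" ".join(command))
```

Building the list first makes it easy to review the order before anything touches the node; each entry could then be passed to subprocess.run.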

Example of retrieving a schema (an offline process) using sstable2json. This example assumes your keyspace is named test and the table is named example1:

sstable2json /var/lib/cassandra/data/test/example1-55639910d46a11e4b4335dbb0aaeeb24/test-example1-ka-1-Data.db

// output:
WARN  10:25:34 JNA link failure, one or more native method will be unavailable.
[
{"key": "7d700500-d46b-11e4-b433-5dbb0aaeeb24",    <-- key = bytes of what is in the PRIMARY KEY()
 "cells": [["coolguy:","",1427451885901681],       <-- cql3 row marker (empty cell that tells us table was created using cql3)
           ["coolguy:age","29",1427451885901681],  <-- age
           ["coolguy:email:_","coolguy:email:!",1427451885901680,"t",1427451885],        <-- collection cell marker 
           ["coolguy:email:6367406d61696c2e6e6574","",1427451885901681],                 <-- first entry in collection 
           ["coolguy:email:636f6f6c677579383540676d61696c2e636f6d","",1427451885901681], <-- second entry in collection
           ["coolguy:password","xQajKe2fa?af",1427451885901681]]},                       <-- another text field for password
{"key": "52641f40-d46b-11e4-b433-5dbb0aaeeb24",
 "cells": [["lyubent:","",1427451813663728],
           ["lyubent:age","109",1427451813663728],
           ["lyubent:email:_","lyubent:email:!",1427451813663727,"t",1427451813],
           ["lyubent:email:66616b65406162762e6267","",1427451813663728],
           ["lyubent:email:66616b6540676d61696c2e636f6d","",1427451813663728],
           ["lyubent:password","password",1427451813663728]]}
]

The above equates to:

CREATE TABLE test.example1 (
    id timeuuid,
    username text,
    age int,
    email set<text>,
    password text,
    PRIMARY KEY (id, username)
) WITH CLUSTERING ORDER BY (username ASC)
// the below are settings that you have no way of knowing,
// unless you are hardcore enough to start digging through
// system tables with the debug tool, but this is beyond
// the scope of the question.
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

You can see that the names of the key columns (id and username) are lost in the translation, because only their values are stored. You can still tell that there is a compound key, because every cell name starts with a prefix ending in ":"; in the two entries above the prefixes are coolguy: and lyubent:. From this you know that the key has the form PRIMARY KEY(something ?, username text). If you're lucky, your primary key will be simple and working out the schema from it will be straightforward; if not, post it here and we'll see how far we can get.
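To make that reading of the output less manual, here is a rough Python sketch that groups the cells by their prefix and decodes the hex-encoded collection entries. It assumes the cell layout shown in the example (one text clustering value, no ":" inside values, UTF-8 text elements); other schemas will need different handling. SAMPLE is the first entry of the output with the annotations removed, and summarize is a hypothetical helper:

```python
import json

# First partition from the sstable2json output, annotations stripped.
SAMPLE = """[
{"key": "7d700500-d46b-11e4-b433-5dbb0aaeeb24",
 "cells": [["coolguy:","",1427451885901681],
           ["coolguy:age","29",1427451885901681],
           ["coolguy:email:_","coolguy:email:!",1427451885901680,"t",1427451885],
           ["coolguy:email:6367406d61696c2e6e6574","",1427451885901681],
           ["coolguy:email:636f6f6c677579383540676d61696c2e636f6d","",1427451885901681],
           ["coolguy:password","xQajKe2fa?af",1427451885901681]]}
]"""


def summarize(sstable_json):
    """Group cells by clustering prefix and decode collection elements."""
    rows = []
    for partition in json.loads(sstable_json):
        grouped = {}
        for cell in partition["cells"]:
            parts = cell[0].split(":")
            row = grouped.setdefault(parts[0], {"columns": {}, "collections": {}})
            if len(parts) == 2 and parts[1]:
                # Regular column: "<clustering>:<column>" with its value.
                row["columns"][parts[1]] = cell[1]
            elif len(parts) == 3 and parts[2] != "_":
                # Collection entry: the element is hex-encoded in the cell name.
                element = bytes.fromhex(parts[2]).decode("utf-8")
                row["collections"].setdefault(parts[1], []).append(element)
            # Row markers ("<clustering>:") and "_" collection markers are skipped.
        for clustering, row in grouped.items():
            rows.append({"key": partition["key"], "clustering": clustering, **row})
    return rows


for row in summarize(SAMPLE):
    print(row)
```

The summary shows which columns are regular values and which are collections, which is most of what is needed to draft the CREATE TABLE statement; the names of the key columns still have to be guessed.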

Lyuben Todorov
  • This tool requires the keyspace and tables to exist in the Cassandra node, plus a link to the Data.db file, but I only have the *.db files. Even if I create the keyspace, I get the error message: The provided column family is not part of this cassandra keyspace: keyspace = test, column family = example1 – Vikas Kumar Mar 27 '15 at 13:25
  • @VKX "This tool requires keyspace, tables in Cassandra node and link to Data.db file. But I only have *.db files." So you have all the files; what's the problem? You aren't creating the keyspace; you are running the bulk loader against a non-existent keyspace, and that's why you get the error. You need to first create the table using CQL, and then run the bulk loader. – Lyuben Todorov Mar 27 '15 at 13:33