
I am a newbie to Cassandra and Hadoop. While looking into integrating the two products, I came across Brisk. From the description I understand that Brisk replaces HDFS with CassandraFS. Is this replacement a solution to Hadoop's small-files problem? If so, what about large files? I currently need to implement a resource store containing both large binary data files (with their metadata) and small files such as images.

fgakk

1 Answer


It handles both, really (although I think Brisk has since been rolled into a commercial product, DataStax Enterprise, and is no longer actively developed in its own right).

Brisk includes CassandraFS (CFS), which is a drop-in replacement for HDFS, so it supports large files. Under the hood, files are broken into chunks and stored as Cassandra rows/columns.
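To make the chunking idea concrete, here is a minimal Java sketch of the general approach: split a file into fixed-size blocks keyed by file ID and chunk index. An in-memory map stands in for Cassandra here, and the 2 MB block size, class name, and key layout are my own illustrative assumptions, not CFS internals.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: an in-memory map stands in for Cassandra rows.
// CFS does the equivalent work for you, persisting each block as a column.
public class ChunkedFileStore {
    // Assumed block size for the example; the real CFS block size may differ.
    private static final int CHUNK_SIZE = 2 * 1024 * 1024;

    // Key: "<fileId>:<chunkIndex>", value: the chunk bytes.
    private final Map<String, byte[]> rows = new LinkedHashMap<>();

    public void store(String fileId, Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            int index = 0;
            byte[] chunk = in.readNBytes(CHUNK_SIZE);
            while (chunk.length > 0) {
                rows.put(fileId + ":" + index++, chunk);
                chunk = in.readNBytes(CHUNK_SIZE);
            }
        }
    }

    public int chunkCount(String fileId) {
        return (int) rows.keySet().stream()
                         .filter(k -> k.startsWith(fileId + ":"))
                         .count();
    }

    public static void main(String[] args) throws IOException {
        ChunkedFileStore store = new ChunkedFileStore();
        store.store("example-file", Path.of(args[0]));
        System.out.println("chunks stored: " + store.chunkCount("example-file"));
    }
}
```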

For small files, you can store the data directly in native Cassandra rows rather than in CassandraFS, and run Hadoop jobs over those rows instead.
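As a rough sketch of the small-file side, the snippet below writes an image plus a piece of metadata into a single Cassandra row as a blob. It assumes the current DataStax Java driver (java-driver-core 4.x) and a hypothetical resources.small_files table; the Brisk-era clients were Thrift-based, so treat this as the general pattern rather than the API that existed at the time.

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;

import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

// Stores a small file (e.g. an image) plus its metadata in a single Cassandra row.
// The keyspace/table and schema are hypothetical, for example:
//   CREATE TABLE resources.small_files (
//       name text PRIMARY KEY, content_type text, data blob);
public class SmallFileWriter {
    public static void main(String[] args) throws Exception {
        Path image = Path.of(args[0]);
        try (CqlSession session = CqlSession.builder().build()) {
            PreparedStatement insert = session.prepare(
                "INSERT INTO resources.small_files (name, content_type, data) VALUES (?, ?, ?)");
            session.execute(insert.bind(
                image.getFileName().toString(),            // row key: the file name
                "image/png",                               // metadata column
                ByteBuffer.wrap(Files.readAllBytes(image)) // the file itself as a blob
            ));
        }
    }
}
```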

DNA