
I am using Brisk. Cassandra column families are automatically mapped to Hive tables.
However, if a column family uses the TimeUUIDType comparator, the column names are unreadable in the Hive table.

For example, I used the following command to create an external table in Hive that maps to the column family:

hive> CREATE EXTERNAL TABLE A (rowkey string, column_name string, value string)
    > STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
    > WITH SERDEPROPERTIES (
    > "cassandra.columns.mapping" = ":key,:column,:value");

If a column name is a TimeUUIDType in Cassandra, it becomes unreadable in the Hive table.

For example, a row in the Cassandra column family looks like this:

RowKey: 2d36a254bb04272b120aaf79d70a3578  
        => (column=29139210-b6dc-11df-8c64-f315e3a329d6, value={"event_id":101},timestamp=1283464254261)

where the column name is a TimeUUIDType.

In the Hive table, the same row looks like this:

 2d36a254bb04272b120aaf79d70a3578    t��ߒ4��!��   {"event_id":101}

So the column name is unreadable in the Hive table.
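The garbage in the Hive output is consistent with the 16 raw bytes of the TimeUUID being printed as if they were a UTF-8 string. A small Python sketch (outside Hive, purely to illustrate the byte-level cause; the UUID value is taken from the example row above):

```python
import uuid

# The column name from the example row, as a version-1 (time-based) UUID.
u = uuid.UUID("29139210-b6dc-11df-8c64-f315e3a329d6")

# Cassandra stores a TimeUUIDType column name as these 16 raw bytes.
raw = u.bytes
print(len(raw))  # 16 bytes, not a text string

# Decoding the raw bytes as text is what produces mojibake like the
# Hive output above -- the bytes are not valid UTF-8:
print(raw.decode("utf-8", errors="replace"))

# Converting back through uuid.UUID recovers the readable form:
print(str(uuid.UUID(bytes=raw)))
```

This is why the row still exists and can be scanned, but matching the garbled column name against the hyphenated string form finds nothing.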

sbridges
chnet
  • Where are you getting that rendering from? ISTM that even if a Hive tool doesn't know how to turn a timeuuid into a human-readable string, you should be able to query it fine. – jbellis Aug 04 '11 at 16:22
  • For the Cassandra record, I get the rendering from the Cassandra console; for the Hive record, from the Hive console. – chnet Aug 04 '11 at 19:34
  • 1
    I could not query on unreadable column name. For example, In Hive, I use `select * from table a where column_name = '29139210-b6dc-11df-8c64-f315e3a329d6'`. mapper/reduce job starts, however, I do not get the row I want. It returns nothing. – chnet Aug 04 '11 at 19:35

1 Answer


This is a known issue with the automatic table mapping. For best results with TimeUUIDType, turn the auto-mapping feature off via the "cassandra.autoCreateHiveSchema" property in $brisk_home/resources/hive/hive-site.xml, and create the table in Hive manually.
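For reference, a sketch of what that override might look like in $brisk_home/resources/hive/hive-site.xml. The property name comes from the answer; the `<property>` wrapper and the `false` value are assumed from the standard Hive configuration file format:

```xml
<property>
  <name>cassandra.autoCreateHiveSchema</name>
  <value>false</value>
</property>
```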

zznate
  • Right. I turned the auto-mapping feature off. However, I used the command in my question to manually create the external table. It uses `STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'` and `WITH SERDEPROPERTIES` to map. TimeUUIDType is still unreadable. – chnet Aug 05 '11 at 16:01
  • 1
    My apologies - I just verified we had an issue open on the hive display side as well. This has to do with how hive handles display of certain types and how we do conversions from the data coming back out of cassandra. We are actively working on this from both ends - the auto-convert side and hive side, but it has turned out to be trickier than we initially thought. Thanks for bringing it up though. Pig works correctly with these types, fwiw. – zznate Aug 05 '11 at 16:42