
I would like to store JSON data in a column and have it analyzed by the DSE search as a document and not as a text field.

I cannot coerce the JSON docs into a table because they do not follow a common schema (or any reasonably sized set of schemas).

What I have currently is a working wildcard search over a large text field which performs poorly and does not allow for more sophisticated queries.

I have read that Solr supports nested documents, but the documentation is not detailed enough to apply it to DSE. There seems to be no Solr field type for nested docs, and I have no idea how to apply restrictions to object names like _childDocuments_ as seen here

Is it possible to have DSE search handle fields/columns as separate or nested documents and if yes, how do I configure and use it?

Thank you

kostja
  • Have you looked into UDT? There is a really good Datastax writeup on it here. http://www.datastax.com/dev/blog/tuple-and-udt-support-in-dse-search – mando222 Jul 20 '16 at 17:07
  • IIUC, defining UDTs still requires conforming to a schema, which I unfortunately don't have. Many object names/keys in the docs I am working with are just hashes which I cannot anticipate. Still, UDTs might be going in the right direction for a subset of the documents with a more consistent structure, so thanks for the tip. – kostja Jul 21 '16 at 11:55
  • @mando222 to rephrase your answer in the terms of the question: no, indexing a JSON document stored in a record as a nested/separate document is not possible with the DSE 5 release. Correct? – kostja Jul 21 '16 at 12:00

1 Answer


It seems to me that if you can't use UDTs, the other option is field transformers (link below).

To answer the question from the comments: indexing a JSON document stored in a record as a nested/separate document is totally possible. The main issue here is that the data doesn't seem to have any rhyme or reason to its format. That makes it extremely hard to handle, because you would normally derive the schema from the structure of the JSON. If my understanding is correct, there isn't really a structure to work with here.

http://www.datastax.com/dev/blog/dse-field-transformers
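
One way to deal with the missing structure is to flatten each JSON document into path/value pairs before it gets indexed, either inside a field transformer or on the client side. Below is a minimal Python sketch of that idea only. `flatten_json` is a hypothetical helper, not part of DSE or Solr, and the dotted-path naming is an assumption; how the resulting pairs are mapped onto Solr fields (for example through dynamic fields) is left open and would need to be checked against the DSE docs.

```python
import json


def flatten_json(value, prefix=""):
    """Flatten nested JSON into a dict of {dotted.path: scalar value}.

    Hypothetical helper for illustration; the path format is an assumption.
    """
    flat = {}
    if isinstance(value, dict):
        # Descend into objects, extending the path with each key.
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten_json(child, path))
    elif isinstance(value, list):
        # Descend into arrays, using the element index as the path segment.
        for index, child in enumerate(value):
            path = f"{prefix}.{index}" if prefix else str(index)
            flat.update(flatten_json(child, path))
    else:
        # Scalars (str, number, bool, None) end the path.
        flat[prefix] = value
    return flat


# Example document with an unpredictable, hash-like key ("a1b2").
doc = json.loads('{"id": 7, "meta": {"a1b2": {"tags": ["x", "y"]}}}')
fields = flatten_json(doc)
```

Each entry in `fields` can then be indexed as its own field, so a query can target a path such as `meta.a1b2.tags.0` instead of running a wildcard search over the whole text blob.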

mando222
  • Thanks, mando222, this is even closer to what I am after, unfortunately also much more involved. The docs do have common static elements and they also have dynamic elements with generic names, so I cannot commit to a schema. Unfortunately I have been spoiled by elasticsearch which gladly indexes everything you throw at it :) – kostja Jul 21 '16 at 19:06
  • There is an open source elasticsearch for cassandra. I haven't used it but it may do the trick. https://github.com/vroyer/elassandra – mando222 Jul 21 '16 at 20:45
  • Elassandra looks good, did not know it existed. Thanks for the pointer, mando222 – kostja Jul 22 '16 at 11:15