IBM Watson, how to input data of entire books

Question

I'm using the IBM Watson Analytics trial, and it says it only takes data as CSV, Excel, and a few other formats. How can I convert books or other bodies of text into an acceptable format? Thank you.

score 0 · Answer 1 · edited May 23 '17 at 12:26

It seems that the architecture of WCA (Watson Content Analytics) does not support PDF directly. Please refer to the following images from IBM Link.

I think it would be better to convert the PDF to text with a converter such as CONVERTER and push the result into a database or a similar store. Then you can crawl the text data from there.

FYI, the document has to have a KEY column (e.g. the name of the book).
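
As a rough illustration of the above, here is a minimal Python sketch that assumes the book has already been converted to plain text. It splits the text into paragraphs and writes one CSV row per paragraph, with a header row and the book name as the key column. The column names (`book`, `paragraph_no`, `text`) and the helpers `book_to_rows` and `rows_to_csv` are hypothetical choices for this example, not names that any Watson product requires.

```python
import csv
import io
import re


def book_to_rows(title, text):
    """Split plain text into paragraphs and return one dict per paragraph.

    Column names are illustrative assumptions, not a Watson requirement.
    """
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    rows = []
    for number, paragraph in enumerate(paragraphs, start=1):
        # Collapse internal line breaks so each paragraph fits in one cell.
        cell = " ".join(paragraph.split())
        rows.append({"book": title, "paragraph_no": number, "text": cell})
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["book", "paragraph_no", "text"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample text standing in for a converted book.
sample = "First paragraph, line one.\nLine two.\n\nSecond paragraph."
rows = book_to_rows("Example Book", sample)
csv_text = rows_to_csv(rows)
print(csv_text)
```

The `csv` module quotes any cell that contains commas or quotation marks, so ordinary prose should survive the round trip. Whether paragraph-level rows are the right granularity depends on what you want to analyze.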

edited May 23 '17 at 12:26

Community

answered Apr 25 '17 at 22:56

Kwang-Chun Kang

score 0 · Answer 2 · answered May 19 '17 at 20:31

Even if you do convert your book into an acceptable format (.csv, .xls, .xlsx, .sav), Watson Analytics isn't optimized for text analytics. It sounds like Watson Explorer is the offering that would best suit your needs.

Hope this helps.

answered May 19 '17 at 20:31

brennan fox

score 0 · Answer 3 · answered Sep 01 '17 at 05:19

Even though CSV and XLS are acceptable file formats, the dataset needs to follow a specific structure: a header row naming every column, followed by the data rows. I am not sure how the text of a book can fit into that format.
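
To make that structural requirement concrete, here is a small Python sketch that checks whether a CSV file has a usable header row and whether every data row has the same number of fields as the header. The function name `check_tabular` and the exact rules are my own assumptions based on the description above; this is not an official Watson Analytics validator.

```python
import csv
import io


def check_tabular(csv_text):
    """Return a list of problems; an empty list means the shape looks tabular."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    # Every column needs a non-empty, unique name.
    if any(not name.strip() for name in header):
        problems.append("header has an empty column name")
    if len(set(header)) != len(header):
        problems.append("header has duplicate column names")
    # Every data row must have the same number of fields as the header.
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no} has {len(row)} fields, expected {len(header)}"
            )
    return problems


# Made-up examples: one well-formed file and one with a short row.
good = "book,chapter,text\nExample Book,1,Some text\n"
bad = "book,chapter,text\nExample Book,1\n"
print(check_tabular(good))
print(check_tabular(bad))
```

Passing a check like this only shows that the file is tabular. It says nothing about whether free text in a cell is useful to Watson Analytics.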

I have recently published this blog post on how to structure and refine data before importing into Watson Analytics to get the best results.

For your specific requirement, you can look into Watson Explorer, as suggested by Brennan above, or, even better, you can learn to use IBM Content Analytics here.
