0

I have Mysql table something like:

|DocumentID|                 Documents                           
============================================
0                   Penny bought bright blue fishes.
1                   Penny bought bright blue and orange fish.
2                   The cat ate a fish at the store.
3                   Penny went to the store. Penny ate a bug. Penn...
4                   It meowed once at the bug, it is still meowing...
5                   The cat is at the fish store. The cat is orang...
6                   Penny is a fish

Now, all I want to create a new table where columns are the unique words in all documents and rows as value corresponding to each DocumentID which is equals to

(number of times word appears in sentence) / (number of words in sentence).

Something like:-

 DocumentID    ate      blue        bought       bright     bug         cat   fish            meow           once      orang      penni     saw           store            went
0             0.000000  0.200000    0.200000    0.200000    0.000000    0.000   0.200000    0.000000    0.000000    0.000000    0.200000    0.000000    0.000000    0.000000
1             0.000000  0.166667    0.166667    0.166667    0.000000    0.000   0.166667    0.000000    0.000000    0.166667    0.166667    0.000000    0.000000    0.000000
2             0.250000  0.000000    0.000000    0.000000    0.000000    0.250   0.250000    0.000000    0.000000    0.000000    0.000000    0.000000    0.250000    0.000000
3             0.111111  0.000000    0.000000    0.000000    0.111111    0.000   0.111111    0.000000    0.000000    0.000000    0.333333    0.111111    0.111111    0.111111
4             0.000000  0.000000    0.000000    0.000000    0.333333    0.000   0.166667    0.333333    0.166667    0.000000    0.000000    0.000000    0.000000    0.000000
5             0.000000  0.000000    0.000000    0.000000    0.000000    0.375   0.250000    0.125000    0.000000    0.125000    0.000000    0.000000    0.125000    0.000000
6             0.000000  0.000000    0.000000    0.000000    0.000000    0.000   0.500000    0.000000    0.000000    0.000000    0.500000    0.000000    0.000000    0.000000

I have tried alot but not getting expected results.

Learner
  • 800
  • 1
  • 8
  • 23
  • 1
    Could you elaborate on what you tried and where specifically you are stuck? E.g. don't you get the correct values or do you have problems generating those columns (which is btw a bad idea, but you can of course do it, at least until you hit the technical limitations for the column number). – Solarflare Sep 03 '18 at 11:54
  • @Solarflare I'm not getting how to create those columns. – Learner Sep 04 '18 at 05:11
  • This is a pivot. You would need to use dynamic sql for this, see e.g. [here](https://stackoverflow.com/q/12598120/6248528) (with `create table` and `insert`/`update` instead of a `select`). I would *strongly* suggest though that you store your data in the form `(word, documentId, tf)`, and (if at all) use the pivot only when you display them. Otherwise, your next question on stackoverflow might be: "I have this table, how can I query for the word with the highest tf?" with a *very* complicated answer and the tip to simply store them as `(word, documentId, tf)`. – Solarflare Sep 04 '18 at 07:51
  • @Solarflare can you please explain it using code. As, I have already gone through the link which you have mentioned above – Learner Sep 04 '18 at 08:28
  • Well, can *you* show the code you tried using the code in the link, so we can correct it (and see your data structure, e.g. how you extract words from sentences as a prerequirement to create the columns) - and not expect us to write the complete code for you (which is not what stackoverflow is for, especially since it is, as mentioned, actually the wrong approach, which limits the usefulness as a reference for future readers with the same problem)? – Solarflare Sep 04 '18 at 09:26

0 Answers0