2

How can I generate a list of words from a MySQL/MariaDB table using PHP?

I have the following table:

id Title
1  A brief history of time
2  philosophy of ecucation
3  introduction to education
4  philosophy and astrophysics
5  astrophysics: astrophysics for dummies

What I am trying to accomplish is to have MySQL (or PHP) generate a result showing how often a word is used. Preferably each title should increase the counter for a word only by 1 (see id 5). My database contains roughly 10000 titles so fetching them all into one array in php should not be feasible.

count word
1     brief
2     philosophy
2(!)  astrophysics <-- this counts only once
1     introduction
etc.
SagarPPanchal
  • 9,839
  • 6
  • 34
  • 62
kabauter
  • 99
  • 8
  • 3
    What have you tried so far? What code do you have that generates the count table you've posted? – vvanasten May 05 '14 at 13:35
  • I generated the count table manually just as an example. This is how the result should look like. I could not think of any other solution but to fetch each row individually (with PHP) and then have PHP generate a word list. I think that would be time consuming as the whole database must be fetched by the client. Maybe there is a way of accessing the index of the table with MySQL. As I am just beginning to learn MySQL/PHP this is not a trivial task to me. – kabauter May 05 '14 at 14:18
  • Related to (but not duplicate of) [Count the frequency of each word](http://stackoverflow.com/questions/22410458/count-the-frequency-of-each-word). I'll think of a solution that ignores "duplicate" words. – RandomSeed May 05 '14 at 16:44

2 Answers2

0

Try the following query:

SELECT count(*) FROM books WHERE Title LIKE '%astrophysics%';
dotancohen
  • 30,064
  • 36
  • 138
  • 197
  • That would work if I queried for specific words. However I need to explore which words are the most popular as I do not know the words to query beforehand. – kabauter May 05 '14 at 13:47
0

This isn't an easy task. You first need to retrieve all words out of your texts, and to do this, you need a kind of separate index table (lookup table) of id,word with no primary key, probably with "substring_id" field signifying the position of the word in the title of the book id id. Then you can do like this:

SELECT DISTINCT word, count(id) as count from lookup
GROUP BY word
ORDER BY word

To form this table, you need to have a consistency check (a trigger, likely) on insert, update, delete, that will alter the lookup table updating info about text and id. Use this question Can you split/explode a field in a MySQL query? as source on how to retrieve a list of words out of a string (aka split by space).

Community
  • 1
  • 1
Vesper
  • 18,599
  • 6
  • 39
  • 61