2

I'm trying to build a query that measures "trending" hash tags like tweets. I have a table tweets and a column message (varchar)

Basically I just want to group the messages by a hash tag but it breaks down if there is more than 1 hash tag in the message field, heres what I have so far

SELECT count(message) as count, `message` FROM `tweets` WHERE `message` RLIKE '^#[[:alnum:]]' GROUP BY `tweets`.`message` ORDER BY `count` DESC

which will output the following rows

"4","#sushi"
"3","#coffee"
"3","#wine"
"1","#coffee #park"
"1","#drinking"
"1","#steak"
"1","#pizza"
"1","#pasta"
"1","#food"

The fourth row is the problem where I would want both hash tags totaled on their own, not sure if I can achieve this with just mysql.

Starx
  • 77,474
  • 47
  • 185
  • 261
Brian
  • 4,328
  • 13
  • 58
  • 103
  • Can there be more than two hash tags - is it unlimited? – sgeddes May 06 '13 at 00:51
  • You are going to have to split the column [other post that might help](http://stackoverflow.com/questions/1096679/can-mysql-split-a-column) – mconlin May 06 '13 at 02:03
  • yes you can achieve this by referencing your table to its own. First you have to get the ``count`` and ``message``. Then try to reference fetch the 3rd column using the same query. Anyway, what is the 3rd column? – Mark May 06 '13 at 02:31
  • The values are stored as a single value, How can they be differentiated as different hash tags? This design will make things unnecessarily complex. – Starx May 06 '13 at 03:43

1 Answers1

0

Your column is message but you want to group by hashtags. From a database design POV, your table isn't in 1NF because message actually holds a repeating group of hash tags. You won't be able to use the power of SQL on your table until that's corrected.

What you need is another table hashtags that has one row for every tag in every message. Once you have that, the query practically writes itself.

James K. Lowden
  • 7,574
  • 1
  • 16
  • 31