0

I'm pulling data from Twitter API into my DB. There is a column 'hashtags' which stores a list of hashtags used in the tweet.

Table name: brexittweets
Column: hashtags varchar(500)

I want to count the number of hashtags. For example

                 Hashtags
Tweet1: ['EUref', 'Brexit', 'poll']

Tweet2: ['Brexit', 'Blair']

Tweet3: ['Brexit', 'Blair', 'EUref']

Result should be:

hashtag     count(hashtag)
Brexit      3
EUref       2
Blair       2
poll        1

What I was thinking of doing: Tried to take substring between quotes ' ', but it occurs multiple times in the same row.

Tried using strpos to find instances of ' ', but it returns only the first instance.

Is there a way to do this with queries? I was thinking of trying out a procedure, but it gets complicated because I need to print these results on a web page using PHP.

2 Answers2

1

If only you've normalized your table such that each tag in a tweet gets stored on its own row, your problem would be solved easily by using COUNT with GROUP BY.

ultrajohn
  • 2,527
  • 4
  • 31
  • 56
0

Assuming all the tags are separated by ', ', you can do the following:

SELECT 
    hashtags, 
    ROUND (   
        (
            LENGTH(hashtags)
            - LENGTH( REPLACE ( hashtags , "', '", "") ) 
        ) / 4) + 1
     AS count
    from brexittweets

Here's the SQL Fiddle.

Darshan Mehta
  • 30,102
  • 11
  • 68
  • 102
  • Thank you. But does not work, because this just returns the number of hashtags in each record. I need count of each hashtag in the database. So for instance 'poll' should return a count of 1, which this query does not. – Varun B. Krishnan Feb 19 '17 at 11:27