I am stuck counting hashtags with HiveQL. My problem: each row stores several hashtags in a single column, separated by semicolons, like this:
jurassicworld;movie;night;dino
jurassicWorld;book;yourtickets;movie
jurassicWorld;movie
I looked at the Hive UDF documentation (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF), but I could not find a function that lets me choose a delimiter (;) to separate these hashtags so that I can count them.
My result should look like this:
+---------------+-----------+
| Hashtag       | Count     |
+---------------+-----------+
| jurassicworld | 300       |
| movie         | 200       |
| night         | 100       |
| dino          | 250       |
| book          | 50        |
| etc...        | 100       |
+---------------+-----------+
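
To make the intended logic concrete, here is a minimal Python sketch of the counting I am after. It is not HiveQL, and the function name `count_hashtags` is hypothetical. Folding case so that `jurassicWorld` and `jurassicworld` count as the same hashtag is an assumption based on the table above.

```python
from collections import Counter


def count_hashtags(rows):
    """Split each row on ';' and count every hashtag across all rows."""
    counts = Counter()
    for row in rows:
        for tag in row.split(";"):
            # Assumption: hashtags are compared case-insensitively.
            tag = tag.strip().lower()
            if tag:  # skip empty pieces, e.g. from a trailing ';'
                counts[tag] += 1
    return counts


# The three sample rows from the question.
rows = [
    "jurassicworld;movie;night;dino",
    "jurassicWorld;book;yourtickets;movie",
    "jurassicWorld;movie",
]

counts = count_hashtags(rows)
for tag, n in counts.most_common():
    print(tag, n)
```

In other words: one output row per distinct hashtag, with the number of times it occurs over all input rows.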