I'm completely new to SQL, and have read StackOverflow posts on SQL to try and figure this out, and other sources and unable to do this in SQL. Here goes...
I have a table of 3 columns and thousands of rows, with data for first 2 columns. The third column is currently empty and I need to populate the third column based on data already in the first and second columns.
Say I have states in the first column and fruit entries in the second column. I need to write an SQL statement(s) that calculates the number of different states where each fruit comes from, and then inserts this popularity number into the third column for every row. A popularity number of 1 in that row means that fruit only comes from one state, a popularity number of 4 means the fruit comes from 4 states. So my table is currently like:
state fruit popularity
hawaii apple
hawaii apple
hawaii banana
hawaii kiwi
hawaii kiwi
hawaii mango
florida apple
florida apple
florida apple
florida orange
michigan apple
michigan apple
michigan apricot
michigan orange
michigan pear
michigan pear
michigan pear
texas apple
texas banana
texas banana
texas banana
texas grape
And I need to figure out how to calculate and then update the third column, named popularity, which is the number of states that exports that fruit. The goal is to produce (sorry bad pun) the table below, where based on above table, "apple" appears in all 4 states, oranges and banana appear in 2 states, and kiwi, mango, pear, and grape only appear in 1 state, hence their corresponding popularity numbers.
state fruit popularity
hawaii apple 4
hawaii apple 4
hawaii banana 2
hawaii kiwi 1
hawaii kiwi 1
hawaii mango 1
florida apple 4
florida apple 4
florida apple 4
florida orange 2
michigan apple 4
michigan apple 4
michigan apricot 1
michigan orange 2
michigan pear 1
michigan pear 1
michigan pear 1
texas apple 4
texas banana 2
texas banana 2
texas banana 2
texas grape 1
My small programmer brain says to try and figure out a way to loop through the data in some kind of script, but reading up a little on SQL and databases, it seems like you don't write long and slow looping scripts in SQL. I'm not even sure if you can? but instead that there are better/faster ways to do this in SQL.
Anyone know how to, in SQL statement(s), calculate and update the third column for each row, which is here called popularity and corresponds to the number of states that each fruit comes from? Thanks for reading, very grateful for any help.
So far I have tried these SQL statements below, which output but don't quite get me what I need:
--outputs those fruits appearing multiple times in the table
SELECT fruit, COUNT(*)
FROM table
GROUP BY fruit
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC
--outputs those fruits appearing only once in the table
SELECT fruit, COUNT(*)
FROM table
GROUP BY fruit
HAVING COUNT(*) = 1
--outputs list of unique fruits in the table
SELECT COUNT (DISTINCT(fruit))
FROM table