0

I have a table of yearly data for various countries.

Not all countries have the same number of years of data.

If there is no row for a country/year pair, the implication is that the value is 0.

I'm trying to build a query that will use a JOIN to add years and 0 for rows that don't yet exist.

current code:

CREATE TABLE years SELECT DISTINCT acq_year FROM final_data;

SELECT * FROM final_data CROSS JOIN years;
Hugh_Kelley
  • 988
  • 1
  • 9
  • 23
  • 2
    You are on the correct path,you only need to create the missing years,something like JOIN SELECT 1960 UNION SELECT 1961 UNION... It can be done procedurally if you have many years or directly fill a table with all the years you need.An example https://stackoverflow.com/a/10132142/1745672 – Mihai May 15 '19 at 12:22
  • 1
    Do you have a calendar/dates table? How many years are we talking about, from when to when? – P.Salmon May 15 '19 at 12:22
  • 1870 to 2017 so ~150 years. It's years only, not monthly or daily. @Mihai, that's what I thought I could do by creating a table with `DISTINCT` no? – Hugh_Kelley May 15 '19 at 12:27
  • 1
    Yes, but it assumes you have all years.And running a DISTINCT query each time can be expensive – Mihai May 15 '19 at 12:28
  • 1
    The query should look something like `SELECT year,SUM(whatver) FROM original RIGHT JOIN calendar ON calendar.year = orignal.year GROUP BY year` – Mihai May 15 '19 at 12:31
  • oh ok, the years don't need to be continuous, just the same set of years for each country, adding a 0 value where a year/country pair doesn't exist yet – Hugh_Kelley May 15 '19 at 12:32

1 Answers1

1

I am assuming your table of yearly data is named final_data with columns country, acq_year and data:

    CREATE TEMPORARY TABLE years SELECT DISTINCT acq_year FROM final_data;

    CREATE TEMPORARY TABLE countries_years
        SELECT fd1.country, fd1.acq_year FROM
            (SELECT DISTINCT country, years.acq_year FROM final_data join years) fd1
            LEFT JOIN final_data fd2 ON fd1.country = fd2.country AND fd1.acq_year = fd2.acq_year
            WHERE fd2.acq_year IS null;

    INSERT INTO final_data(country, acq_year, data)
        SELECT country, acq_year, 0 FROM countries_years;
Booboo
  • 38,656
  • 3
  • 37
  • 60