4

I have a table in MySQL with the following fields:

id, company_name, year, state

There are multiple rows for the same customer and year, here is an example of the data:

    id | company_name  | year | state
----------------------------------------
    1  | companyA      | 2008 | 1
    2  | companyB      | 2009 | 2
    3  | companyC      | 2010 | 3
    4  | companyB      | 2009 | 1
    5  | companyC      | NULL | 3

I am trying to create a view from this table to show one company per row (i.e. GROUP BY pubco_name) where the state is the highest for a given year.

Here is an example of the view I am trying to create:

    id | cuompany_name | NULL | 2008 | 2009 | 2010
--------------------------------------------------
    1  | companyA      | NULL | 1    | NULL | NULL
    2  | companyB      | NULL | 2    | NULL | NULL
    3  | companyC      | 3    | NULL | NULL | 3

There is a lot more data than this, but you can see what I am trying to accomplish.

I don't know how to select the max state for each year and group by pubco_name. Here is the SQL I have thus far (I think we need to use CASE and/or sub-selects here):

SELECT
id,
company_name,
SUM(CASE WHEN year = 2008 THEN max(state) ELSE 0 END) AS 2008,
SUM(CASE WHEN year = 2009 THEN max(state) ELSE 0 END) AS 2009,
SUM(CASE WHEN year = 2010 THEN max(state) ELSE 0 END) AS 2010,
SUM(CASE WHEN year = 2011 THEN max(state) ELSE 0 END) AS 2011,
SUM(CASE WHEN year = 2012 THEN max(state) ELSE 0 END) AS 2012,
SUM(CASE WHEN year = 2013 THEN max(state) ELSE 0 END) AS 2013
FROM tbl
GROUP BY company_name
ORDER BY id DESC

Appreciate your help and thanks in advance.

Ben
  • 1,013
  • 4
  • 16
  • 34
  • 1
    9 times out of 10 issues of data display are best handled at the application level. – Strawberry Apr 05 '13 at 11:59
  • @Strawberry you recommend PHP? – Ben Apr 05 '13 at 12:08
  • I do - but only because that's the only application level code that I know! – Strawberry Apr 05 '13 at 12:30
  • Please check this page for details of dynamic pivot queries using mysql: http://www.boynux.com/creating-pivot-reports-in-mysql/ – Boynux Nov 29 '13 at 03:12
  • I agree in theory with @Strawberry, but I've often found that the real world exception to this is when you have a pretty light, thin APP layer (because most of your business logic is not complicated) and a very beefy DB layer (because you need the resources for some massive tables.) In that case, letting the DB do the heavy lifting for things like pivots will be faster, simply because PHP has 512M of memory to work with and MySQL has 50G. – Cody Crumrine Apr 10 '15 at 13:57

2 Answers2

1

You need to pivot the table but mysql does not have any such functionality of pivot

so we need to replicate its functionality

EDITED

Select 
  group_concat(
    DISTINCT 
       if(year is null,
          CONCAT('max(if (year is null, state, 0)) as ''NULL'' '),
          CONCAT('max(if (year=''', year, ''', state, 0)) as ''',year, ''' '))
    ) into @sql from tbl join (SELECT @sql:='')a;
set @sql = concat('select company_name, ', @sql, 'from tbl group by company_name;');
PREPARE stmt FROM @sql;
EXECUTE stmt;

Result

| COMPANY_NAME | 2008 | 2009 | 2010 | NULL |
--------------------------------------------
|     companyA |    1 |    0 |    0 |    0 |
|     companyB |    0 |    2 |    0 |    0 |
|     companyC |    0 |    0 |    3 |    3 |

SQL FIDDLE

There are 2 approaches to solve your problem 1. create case for each year, which is not possible in your case as we are dealing with year 2. generate the query dynamically so that we get proper columns as per your need.

I have given solution according to the second solution where I am generating the query and storing it in @sql variable. In the fiddle I have printed the contents of @sql before executing it.

select company_name, max(if (year='2008', state, 0)) as '2008' ,max(if (year='2009', state, 0)) as '2009' ,max(if (year='2010', state, 0)) as '2010' ,max(if (year is null, state, 0)) as 'NULL' from tbl group by company_name; 

For more information regarding group_concat() go through the link GROUP_CONCAT and USER DEFINED VARIABLE

Hope this helps..

Meherzad
  • 8,433
  • 1
  • 30
  • 40
  • this looks greate but I'm afraid it's way over my head. I haven't seen half of these operators before.. like @ and group_concat, set, DEALLOCATE PREPARE, etc. but thank you! will study it. Cheers – Ben Apr 05 '13 at 11:27
  • Its ok... :) `@sql` is user defined variable which I have created which stores our query.. I am editing the answer to add more description... – Meherzad Apr 05 '13 at 11:29
  • thanks for the additional detail. Please correct me if I'm wrong, but I don't think the SQL that is ultimately printed before execution is select the the max/highest `state` for each pubco and year... – Ben Apr 05 '13 at 11:47
  • also.. in your fiddle, I would expect the value of `state` for `CompanyC` to be = 3 for both `2010` and `NULL` but they are both = 1. – Ben Apr 05 '13 at 11:56
  • @BenJones if you need a simpler solution then you need to know the number of years from before hand so that you can add case for all those years... – Meherzad Apr 05 '13 at 14:00
  • Note that `GROUP_CONCAT()` will only return a maximum of 1024 characters which may break the query when you have many `year`'s. To increase this limit, execute `SET SESSION group_concat_max_len = 1000000;`. – Lemmings19 Jun 12 '17 at 22:07
0

Please see the page linked in the answer to this question.

Note that when you do this, you must specify ahead of time how many columns you want in your output.

In response to the comment below, here is a simple/ basic implementation that reproduces the result table above (except for the ID column; having it makes no sense, as each row in the result can summarize more than one row in the input table)

SELECT
   `company_name`,
   NULLIF(SUM(CASE WHEN `t3`.`year` IS NULL THEN `t3`.`state` ELSE 0 END), 0) AS `null`,
   NULLIF(SUM(CASE WHEN `t3`.`year` = 2008 THEN `t3`.`state` ELSE 0 END), 0) AS `2008`,
   NULLIF(SUM(CASE WHEN `t3`.`year` = 2009 THEN `t3`.`state` ELSE 0 END), 0) AS `2009`,
   NULLIF(SUM(CASE WHEN `t3`.`year` = 2010 THEN `t3`.`state` ELSE 0 END), 0) AS `2010`
FROM
(
   SELECT
   `t1`.`id`,
   `t1`.`company_name`,
   `t1`.`year`,
   `t1`.`state`
   FROM `tbl` `t1`
   WHERE `t1`.`state` = (
      SELECT MAX(`state`)
      FROM `tbl` `t2`
      WHERE `t2`.`company_name` = `t1`.`company_name`
      AND (`t2`.`year` IS NULL AND `t1`.`year` IS NULL OR `t2`.`year` = `t1`.`year`)
   )
) `t3`
GROUP BY `t3`.`company_name`;

This uses nested queries: the inner ones (with the t1 and t2 aliases) find the row with the maximum state for each year and company (and which will break unless you can be sure that this is unique!), and the outer one t3 does the pivot.

I would test this thoroughly to ensure performance is acceptable on real data.

Community
  • 1
  • 1
Ken Keenan
  • 9,818
  • 5
  • 32
  • 49
  • 1
    It is possible to get the desired output without knowing the number of column from ahead.... – Meherzad Apr 05 '13 at 11:47
  • @Meherzad: I should've been more explicit. I should have said that you need to use dynamic SQL if you want to change the number of columns, as you have done. – Ken Keenan Apr 05 '13 at 13:16
  • @KenKeenan can you please see my update at the bottom of my question. I am trying to use a simple/basic solution. I don't really understand Meherzad's answer and like I said, I don't think it's selecting the max state (see my comments below Meherzad's answer) Thanks! – Ben Apr 05 '13 at 13:25