
I have these records:

  Employee_Number    Employee_role         Group_Name
  ----------------------------------------------------
  EMP101             C# Developer            Group_1      
  EMP102             ASP Developer           Group_1      
  EMP103             SQL Developer           Group_2      
  EMP104             PLSQL Developer         Group_2      
  EMP101             Java Developer          
  EMP102             Web Developer          
  EMP101             DBA          
  EMP105             DBA          
  EMP106             SQL Developer           Group_3      
  EMP107             Oracle Developer        Group_3      
  EMP101             Oracle Developer        Group_3      
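
To reproduce the examples, the sample data can be loaded with a setup along these lines. This is a minimal sketch: the table name `employee` and the column names come from the queries below, while the `text` types, the `NOT NULL` constraints, and storing "no group" as an empty string (rather than NULL) are assumptions inferred from the `group_name <> ''` tests in those queries.

```sql
-- Assumed schema: column types and constraints are guesses, not from the question.
CREATE TABLE employee (
   employee_number text NOT NULL
 , employee_role   text NOT NULL
 , group_name      text NOT NULL DEFAULT ''  -- '' means "no group" (assumption)
);

INSERT INTO employee (employee_number, employee_role, group_name) VALUES
   ('EMP101', 'C# Developer'    , 'Group_1')
 , ('EMP102', 'ASP Developer'   , 'Group_1')
 , ('EMP103', 'SQL Developer'   , 'Group_2')
 , ('EMP104', 'PLSQL Developer' , 'Group_2')
 , ('EMP101', 'Java Developer'  , '')
 , ('EMP102', 'Web Developer'   , '')
 , ('EMP101', 'DBA'             , '')
 , ('EMP105', 'DBA'             , '')
 , ('EMP106', 'SQL Developer'   , 'Group_3')
 , ('EMP107', 'Oracle Developer', 'Group_3')
 , ('EMP101', 'Oracle Developer', 'Group_3');
```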

I want to show a pivot table for the above records in the following format:

 Employee_Number     TotalRoles      TotalGroups       Available     Others     Group_1     Group_2      Group_3
 -----------------------------------------------------------------------------------------------------------------
 EMP101                   4               3                2            2          1                        1
 EMP102                   2               3                1            1          1 
 EMP103                   1               3                1            0                      1
 EMP104                   1               3                1            0                      1
 EMP105                   1               3                0            1
 EMP106                   1               3                1            0                                   1
 EMP107                   1               3                1            0                                   1

For the above result I am using the following script:

SELECT * FROM crosstab(
      $$SELECT grp.*, e.group_name
         , CASE WHEN e.employee_number IS NULL THEN 0 ELSE 1 END AS val
    FROM  (
       SELECT employee_number
        , count(employee_role)::int            AS total_roles
        , (SELECT count(DISTINCT group_name)::int
           FROM   employee
           WHERE  group_name <> '')            AS total_groups
        , count(group_name <> '' OR NULL)::INT AS available                    
        , count(group_name =  '' OR NULL)::int AS others
       FROM   employee
       GROUP  BY employee_number
       ) grp
    LEFT   JOIN employee e ON e.employee_number = grp.employee_number
                  AND e.group_name <> ''         
    ORDER  BY grp.employee_number, e.group_name$$
     ,$$VALUES ('Group_1'),('Group_2'),('Group_3')$$
   ) AS ct (employee_number text
      , total_roles  int
      , total_groups int
      , available    int
      , others       int
      , "Group_1"    int
      , "Group_2"    int
      , "Group_3"    int);
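
Note that `crosstab()` is not part of core PostgreSQL. It is provided by the additional module `tablefunc`, which has to be installed once per database before the query above can run. Assuming the module is available on the server and the current role has sufficient privileges:

```sql
-- Installs crosstab() and related functions into the current database.
CREATE EXTENSION IF NOT EXISTS tablefunc;
```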

Now I want to show the pivot table for the above records filtered by Group_Name. That is, if I filter on Group_Name = Group_3, the result should show only the employees who belong to Group_3 and to no other group.

If I want to see the employees who belong to Group_3 only, then it should show:

   Employee_Number    total_roles  total_groups   available    others    Group_3
  ------------------------------------------------------------------------------- 
       EMP106             1             3            1            0            1
       EMP107             1             3            1            0            1

Note: As you can see in the first table, the employees EMP106 and EMP107 belong only to Group_Name = Group_3. The employee EMP101 also belongs to Group_3, but he belongs to other groups as well, so he should not appear in this table.

MAK

1 Answer


How to exclude the offending rows:

The crosstab() query adapted:

SELECT * FROM crosstab(
    $$SELECT grp.*, e.group_name
           , CASE WHEN e.employee_number IS NULL THEN 0 ELSE 1 END AS val
      FROM  (
         SELECT employee_number
              , count(employee_role)::int            AS total_roles
              , (SELECT count(DISTINCT group_name)::int
                 FROM   employee
                 WHERE  group_name <> '')            AS total_groups
              , count(group_name <> '' OR NULL)::int AS available
              , count(group_name = '' OR NULL)::int  AS others
         FROM   employee
         GROUP  BY employee_number
         ) grp
      JOIN   employee e USING (employee_number)
      WHERE  e.group_name = 'Group_3'
      AND    NOT EXISTS (
         SELECT 1 FROM employee
         WHERE  employee_number = e.employee_number
         AND    group_name <> e.group_name
         )
      ORDER  BY employee_number$$
   ,$$VALUES ('Group_3')$$
   ) AS ct (employee_number text
      , total_roles  int
      , total_groups int
      , available    int
      , others       int
      , "Group_3"    int);

But as you can see, we don't need crosstab() here at all. Simplify to:

SELECT grp.*, 1 AS "Group_3"
FROM  (
   SELECT employee_number
        , count(employee_role)::int            AS total_roles
        , (SELECT count(DISTINCT group_name)::int
           FROM   employee
           WHERE  group_name <> '')            AS total_groups
        , count(group_name <> '' OR NULL)::int AS available
        , count(group_name = '' OR NULL)::int  AS others
   FROM   employee
   GROUP  BY employee_number
   ) grp
JOIN   employee e USING (employee_number)
WHERE  e.group_name = 'Group_3'
AND    NOT EXISTS (
   SELECT 1 FROM employee
   WHERE  employee_number = e.employee_number
   AND    group_name <> e.group_name
   )
ORDER  BY employee_number;

The column "Group_3" is really just noise here, because it is always 1 by definition.
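
As a side note, the `count(<condition> OR NULL)` expressions in these queries work because `count()` only counts non-null values: a false condition is turned into NULL and skipped. On PostgreSQL 9.4 or later, the aggregate `FILTER` clause expresses the same thing more explicitly. A sketch of the per-employee counts written that way, assuming the `employee` table from the question:

```sql
-- Same counts as the "grp" subquery, using FILTER instead of "OR NULL".
SELECT employee_number
     , count(employee_role)::int                       AS total_roles
     , (count(*) FILTER (WHERE group_name <> ''))::int AS available  -- rows with a named group
     , (count(*) FILTER (WHERE group_name =  ''))::int AS others     -- rows in the "empty" group
FROM   employee
GROUP  BY employee_number
ORDER  BY employee_number;
```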

If only a small percentage of rows is selected this way, this version with a LATERAL join should be substantially faster:

SELECT e.employee_number
     , grp.total_roles
     , total.total_groups
     , grp.available
     , grp.others
     , 1 AS "Group_3"
FROM  (
   SELECT employee_number
   FROM   employee e
   WHERE  group_name = 'Group_3'
   AND    NOT EXISTS (
      SELECT 1 FROM employee
      WHERE  employee_number = e.employee_number
      AND    group_name <> e.group_name
      )
   ) e
, LATERAL (
   SELECT count(employee_role)::int            AS total_roles
        , count(group_name <> '' OR NULL)::int AS available
        , count(group_name = '' OR NULL)::int  AS others
   FROM   employee
   WHERE  employee_number = e.employee_number
   GROUP  BY employee_number
   ) grp
,    (
   SELECT count(DISTINCT group_name)::int AS total_groups
   FROM   employee
   WHERE  group_name <> ''
   ) total
ORDER  BY employee_number;

Details for the LATERAL solution and performance:

Simple, generic solution for any set of groups

Not optimized for performance, but easy to adapt:

<original crosstab query from your question>
WHERE  "Group_3" = 1
AND    "Group_1" IS NULL
AND    "Group_2" IS NULL
AND    "Group_4" IS NULL
AND    others = 0  -- to rule out membership in the "empty" group
-- possibly more ...
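
An alternative sketch for a set of groups that avoids listing every other group column: aggregate per employee and test the whole set in `HAVING`. This is a different technique from the queries above, not a rewrite of them. It assumes that "filter on Group_4 and Group_5" means the employee must belong to every listed group and to no other group, and that the empty group is stored as `''` rather than NULL (`bool_and()` ignores NULL inputs, so NULL groups would slip through).

```sql
SELECT employee_number
     , count(employee_role)::int AS total_roles
FROM   employee
GROUP  BY employee_number
-- every row of the employee must be in the wanted set; the empty group '' fails this test
HAVING bool_and(group_name = ANY ('{Group_4,Group_5}'::text[]))
-- and the employee must be a member of each of the 2 wanted groups
AND    count(DISTINCT group_name) = 2
ORDER  BY employee_number;
```

For a single group, use `'{Group_3}'` and `count(DISTINCT group_name) = 1`. With the sample data from the question, that selects `EMP106` and `EMP107`, while the same test for `Group_1` selects no one.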
Erwin Brandstetter
  • The employee `EMP101` is also displayed along with `EMP106` and `EMP107`, although he should not be, because `EMP101` also belongs to other groups. – MAK Mar 19 '15 at 14:15
  • With the last script I am again getting `EMP101`. – MAK Mar 19 '15 at 14:26
  • If a new employee `EMP108` is added with two different groups, `INSERT INTO employee values('EMP108','JSP','Group_4'),('EMP108','JS','Group_5')`, then I want to filter on `Group_4` and `Group_5`. It has to show me the employee `EMP108`, right? – MAK Mar 19 '15 at 14:36
  • @MAK: The solutions so far are tailored for a single group. I added a simple, generic solution for any set of groups. – Erwin Brandstetter Mar 19 '15 at 14:57
  • There is still some work remaining. If I filter on `Group_name = Group_1`, then according to the table it should not display any employee, because no one belongs **only** to `Group_1`. But your simple, generic solution shows employee `EMP102`, although he has 2 roles, as you can see in the table. – MAK Mar 20 '15 at 05:10
  • @MAK: The second role of `EMP102` belongs to the empty group, which can be ruled out by checking `others = 0`. – Erwin Brandstetter Mar 20 '15 at 05:22