3

I am trying to convert a view from an Oracle RDBMS to SQL Server. The view looks like:

create or replace view user_part_v
as
  select part_region.part_id, users.id as users_id
    from part_region, users
   where part_region.region_id in(select     region_id
                                        from region_relation
                                  start with region_id = users.region_id
                                  connect by parent_region_id = prior region_id)

Having read about recursive CTE's and also about their use in sub-queries, my best guess at translating the above into SQL Server syntax is:

create view user_part_v
as
  with region_structure(region_id, parent_region_id) as (
    select region_id
         , parent_region_id
      from region_relation
     where parent_region_id = users.region_id
    union all
    select r.region_id
         , r.parent_region_id
      from region_relation r
      join region_structure rs on rs.parent_region_id = r.region_id
  )
  select part_region.part_id, users.id as users_id
    from part_region, users
   where part_region.region_id in(select region_id from region_structure)

Obviously this gives me an error about the reference to users.region_id in the CTE definition.

How can I achieve the same result in SQL Server as I get from the Oracle view?

Background

I am working on the conversion of a system from running on an Oracle 11g RDMS to SQL Server 2008. This system is a relatively large Java EE based system, using JPA (Hibernate) to query from the database.

Many of the queries use the above mentioned view to restrict the results returned to those appropriate for the current user. If I cannot convert the view directly then the conversion will be much harder as I will need to change all of the places where we query the database to achieve the same result.

The tables referenced by this view have a structure similar to:

USERS
  ID 
  REGION_ID

REGION
  ID
  NAME

REGION_RELATIONSHIP
  PARENT_REGION_ID
  REGION_ID

PART
  ID
  PARTNO
  DESCRIPTION

PART_REGION
  PART_ID
  REGION_ID

So, we have regions, arranged into a hierarchy. A user may be assigned to a region. A part may be assigned to many regions. A user may only see the parts assigned to their region. The regions reference various geographic regions:

World
  Europe
    Germany
    France
    ...
  North America
    Canada
    USA
      New York
      ...

If a part, #123, is assigned to the region USA, and the user is assigned to the region New York, then the user should be able to see that part.

UPDATE: I was able to work around the error by creating a separate view that contained the necessary data, and then have my main view join to this view. This has the system working, but I have not yet done thorough correctness or performance testing yet. I am still open to suggestions for better solutions.

Community
  • 1
  • 1
David Sykes
  • 7,131
  • 4
  • 36
  • 39
  • for XML.. syntax. or similar to http://stackoverflow.com/questions/12869048/hierarchy-query-sql-server-2008 – xQbert Mar 08 '13 at 03:22
  • 1
    xQbert, thanks for your comment. I know about how to create a hierarchical query in SQL Server using CTE's. I would very much like to convert my view as is, to avoid having to modify large parts of the system when they query the database. I have edited the question to hopefully make this clearer. – David Sykes Mar 08 '13 at 03:34
  • I think you will need to build the whole tree in the cte and then limit the result in the sub-select for the `IN` clause. Btw, the join in the recursive part of the CTE is wrong: it should be `join region_structure` not `join region_relation` –  Mar 08 '13 at 08:19
  • Thanks a_horse_with_no_name, I have fixed the CTE. And yes, I eventually went and created another view containing all of the necessary data and joined to it from my main query. – David Sykes Mar 10 '13 at 23:38

1 Answers1

2

I reformatted your original query to make it easier for me to read.

create or replace view user_part_v
as
select part_region.part_id, users.id as users_id
from part_region, users
where part_region.region_id in(
    select region_id
    from region_relation
    start with region_id = users.region_id
    connect by parent_region_id = prior region_id
);

Let's examine what's going on in this query.

select part_region.part_id, users.id as users_id
from part_region, users

This is an old-style join where the tables are cartesian joined and then the results are reduced by the subsequent where clause(s).

where part_region.region_id in(
    select region_id
    from region_relation
    start with region_id = users.region_id
    connect by parent_region_id = prior region_id
);

The sub-query that's using the connect by statement is using the region_id from the users table in outer query to define the starting point for the recursion. Then the in clause checks to see if the region_id for the part_region is found in the results of the recursive query. This recursion follows the parent-child linkages given in the region_relation table.

So the combination of doing an in clause with a sub-query that references the parent and the old-style join means that you have to consider what the query is meant to accomplish and approach it from that direction (rather than just a tweaked re-arrangement of the old query) to be able to translate it into a single recursive CTE.

This query also will return multiple rows if the part is assigned to multiple regions along the same branch of the region heirarchy. e.g. if the part is assigned to both North America and USA a user assigned to New York will get two rows returned for their users_id with the same part_id number.


Given the Oracle view and the background you gave of what the view is supposed to do, I think what you're looking for is something more like this:

create view user_part_v
as
with user_regions(users_id, region_id, parent_region_id) as (
    select u.users_id, u.region_id, rr.parent_region_id
    from users u 
    left join region_relation rr on u.region_id = rr.region_id
    union all
    select ur.users_id, rr.region_id, rr.parent_region_id
    from user_regions ur
    inner join region_relation rr on ur.parent_region_id = rr.region_id
)
select pr.part_id, ur.users_id
from part_region pr
inner join user_regions ur on pr.region_id = ur.region_id;

Note that I've added the users_id to the output of the recursive CTE, and then just done a simple inner join of the part_region table and the CTE results.

Let me break down the query for you.

select u.users_id, u.region_id, rr.parent_region_id
from users u 
left join region_relation rr on u.region_id = rr.region_id

This is the starting set for our recursion. We're taking the region_relation table and joining it against the users table, to get the starting point for the recursion for every user. That starting point being the region the user is assigned to along with the parent_region_id for that region. A left join is done here and the region_id is pulled from the user table in case the user is assigned to a top-most region (which means there won't be an entry in the region_relation table for that region).

select ur.users_id, rr.region_id, rr.parent_region_id
from user_regions ur
inner join region_relation rr on ur.parent_region_id = rr.region_id

This is the recursive part of the CTE. We take the existing results for each user, then add rows for each user for the parent regions of the existing set. This recursion happens until we run out of parents. (i.e. we hit rows that have no entries for their region_id in the region_relationship table.)

select pr.part_id, ur.users_id
from part_region pr
inner join user_regions ur on pr.region_id = ur.region_id;

This is the part where we grab our final result set. Assuming (as I do from your description) that each region has only one parent (which would mean that there's only one row in region_relationship for each region_id), a simple join will return all the users that should be able to view the part based on the part's region_id. This is because there is exactly one row returned per user for the user's assigned region, and one row per user for each parent region up to the heirarchy root.

NOTE:

Both the original query and this one do have a limitation that I want to make sure you are aware of. If the part is assigned to a region that is lower in the heirarchy than the user (i.e. a region that is a descendent of the user's region like the part being assigned to New York and the user to USA instead of the other way around), the user won't see that part. The part has to be assigned to either the user's assigned region, or one higher in the region heirarchy.

Another thing is that this query still exhibits the case I mentioned above about the original query, where if a part is assigned to multiple regions along the same branch of the heirarchy that multiple rows will be returned for the same combination of users_id and part_id. I did this because I wasn't sure if you wanted that behavior changed or not.

If this is actually an issue and you want to eliminate the duplicates, then you can replace the query below the CTE with this one:

select p.part_id, u.users_id
from part p
cross join users u
where exists (
    select 1
    from part_region pr
    inner join user_regions ur on pr.region_id = ur.region_id;
    where pr.part_id = p.part_id
    and ur.users_id = u.users_id
);

This does a cartesian join between the part table and the users table and then only returns rows where the combination of the two has at least one row in the results of the subquery, which are the results that we are trying to de-duplicate.

Kanmuri
  • 1,105
  • 8
  • 11
  • Thanks for your reply. This project got cancelled not long after I made managed to find a workaround. I am marking your answer as correct, even though I am not in a position to test it, since it looks extremely thorough, and is the only answer. Thanks again. – David Sykes Jul 12 '17 at 14:48