Some things you need to be made aware-of:
- The ancient-style of JOIN, where you do
SELECT ... FROM x, y WHERE x.a = y.b
, should not be used. I wish modern RDBMS would block queries using it (outside of any compatibility mode).
- Always use explicit
JOIN
clauses! for the sake of readability and maintainability (while performance shouldn't be different, using explicit JOINs make it far, far easier to investigate performance issues should they occur).
- Don't use the
float
or real
types for representing exact quantities (by "exact" I don't mean integral: you can have exact fractional quantities), instead the decimal
type should be preferred.
- You should always include the Schema Name (default is
dbo.
) when referencing tables in queries as it solves issues relating to ambiguity.
- Your many-to-many linking table
dbo.Placement
also allows duplicates because it doesn't have a PK defined.
- Don't use short, cryptic column names like
rid
, hid
and sid
. Software should be self-documenting. I would name those columns to RouteId
, HoldId
, and SlotId
respectively.
- Don't fall for the mistake of naming a column just
Id
. Column names should not need the name of their parent table to be understandable (this is because queries can/will/do expose your data, often with their original column names, in contexts without their original table names, such as in CTEs, derived-table queries, VIEW
s, etc).
- It's subjective, but I believe the table-names should be plural, not singular (after-all, a table holds multiple rows - I'd only give a table a singular name if that table will only ever hold a single row).
- The worst argument I've heard so far advocating for singular instead of plural is because (apparently) some ORMs and code-gen tools lack the ability to convert a plural noun to a singular noun. Yeesh. That hasn't been true for 20+ years now.
First, to avoid problems caused by using float
types in JOIN
conditions I'll change your dbo.Slot
table to use decimal
:
CREATE TABLE dbo.Slot2 (
sid int NOT NULL,
wall varchar(200) NOT NULL,
x decimal(19,6) NOT NULL, -- 6 decimal places should be enough.
y decimal(19,6) NOT NULL,
CONSTRAINT PK_Slot PRIMARY KEY ( sid ),
-- CONSTRAINT UK_SlotValues UNIQUE ( wall, x, y ) -- This will prevent duplicate values in future.
);
INSERT INTO dbo.Slot2 ( sid, wall, x, y )
SELECT
sid,
wall,
CONVERT( decimal(19,6), x ) AS x2,
CONVERT( decimal(19,6), y ) AS y2
FROM
dbo.Slot;
DROP TABLE dbo.Slot;
EXEC sp_rename 'dbo.Slot2', 'Slot';
With that taken care-of, let's now get the duplicate values in the set of slots (i.e. find the identical wall, x, y
values without other values):
SELECT
wall,
x,
y
FROM
dbo.Slot
GROUP BY
wall,
x,
y
HAVING
COUNT(*) >= 2
Then we do an INNER JOIN
between the original dbo.Slot
table and this set of duplicate values, as well as adding a ROW_NUMBER
value to make it easier to choose a single row to keep if the other duplicates are removed:
WITH duplicateValues (
SELECT
wall,
x,
y
FROM
dbo.Slot
GROUP BY
wall,
x,
y
HAVING
COUNT(*) >= 2
)
SELECT
ROW_NUMBER() OVER ( PARTITION BY s.wall, s.x, s.y ORDER BY s.sid ) AS n,
s.*
FROM
dbo.Slot AS s
INNER JOIN duplicateValues AS d ON
s.wall = d.wall
AND
s.x = d.x
AND
s.y = d.y
In your post you mentioned wanting to also consider the Placement
table, however we need further details because your post doesn't explain how the Placement
table should work.
However your Placement
table should still have a PK. I'm assuming that the Placement
table's HoldId
column is not a key column, so should look like this:
CREATE TABLE dbo.Placement (
RouteId int NOT NULL,
SlotId int NOT NULL,
HoldId int NOT NULL,
CONSTRAINT PK_Placement PRIMARY KEY ( RouteId, SlotId ),
CONSTRAINT FK_Placement_Route FOREIGN KEY ( RouteId ) REFERENCES dbo.Route ( rid ),
CONSTRAINT FK_Placement_Slot FOREIGN KEY ( SlotId ) REFERENCES dbo.Slot ( sid ),
CONSTRAINT FK_Placement_Hold FOREIGN KEY ( HoldId ) REFERENCES dbo.Hold ( hid )
);