1

I'm using SQL Server 2014. I have a structure like this:

Id BIGINT,
ItemName NVARCHAR(4000),
RecordDate DATETIME2,
Supplier NVARCHAR(450),
Quantity DECIMAL(18, 2),
ItemUnit NVARCHAR(2000),
EntityUnit NVARCHAR(2000),
ItemSize DECIMAL(18, 2),
PackageSize DECIMAL(18, 2),
FamilyCode NVARCHAR(20),
Family NVARCHAR(500),
CategoryCode NVARCHAR(20),
Category NVARCHAR(500),
SubCategoryCode NVARCHAR(20),
SubCategory NVARCHAR(500),
ItemGroupCode NVARCHAR(20),
ItemGroup NVARCHAR(500),
PurchaseValue DECIMAL(18, 2),
UnitPurchaseValue DECIMAL(18, 2),
PackagePurchaseValue DECIMAL(18, 2),
FacilityCode NVARCHAR(450),
CurrencyCode NVARCHAR(5)

I'd like to select distinct ItemNames from BatchRecords table paired with the max Id among the items with the same ItemName as well as Supplier, Quantity and other values of the item with the max Id for each ItemName. So far, I came up with the following SP, definitely it doesn't work yet as GROUP BY throws an error.

I could probably use a subquery, but then how do I satisfy the condition with max Ids for each unique ItemName? Also, any input to the stored procedure quality/obvious bottleneck is highly appreciated as it has to be somewhat quick.

CREATE PROCEDURE dbo.GetRecordsPageFlat 
     (@BatchIds dbo.GenericIntArray READONLY,
      @FileRequestId INT,
      @PageSize INT,
      @PageCount INT,
      @LastId BIGINT,
      @NameMaskValue NVARCHAR(128) = NULL,
      @NameMaskType INT = NULL,
      @FamilyCodeMaskValue NVARCHAR(128),
      @CategoryCodeMaskValue NVARCHAR(128),
      @SubCategoryCodeMaskValue NVARCHAR(128)
     )
AS
    SET NOCOUNT ON;

    DECLARE @Temp dbo.RecordImportStructure
    DECLARE @ErrorCode INT
    DECLARE @Step NVARCHAR(200)
    DECLARE @Rows INT

    --OUTPUT @@ROWCOUNT
    --OUTPUT INSERTED.Id
    INSERT INTO @Temp (
        Id,
        ItemName,
        Supplier,
        Quantity,
        ItemUnit,
        EntityUnit,
        ItemSize,
        PackageSize,
        PurchaseValue,
        UnitPurchaseValue,
        PackagePurchaseValue,
        CurrencyCode
    )
    SELECT
        BR.Id,
        BR.ItemName,
        BR.Supplier,
        BR.Quantity,
        BR.ItemUnit,
        BR.EntityUnit,
        BR.ItemSize,
        BR.PackageSize,
        BR.ItemGroup,
        BR.UnitPurchaseValue,
        BR.PackagePurchaseValue,
        C.IsoCode
    FROM   
        dbo.BatchRecords BR
    LEFT OUTER JOIN 
        dbo.FacilityInstances F ON F.Id = BR.FacilityInstanceId
    LEFT OUTER JOIN 
        dbo.Currencies C ON C.Id = BR.CurrencyId
        --OPTION(RECOMPILE)
    WHERE 
        BR.DataBatchId IN (SELECT * FROM @BatchIds)
        AND BR.Id > @LastId
        AND (@FamilyCodeMaskValue IS NULL OR BR.FamilyCode = @FamilyCodeMaskValue)
        AND (@CategoryCodeMaskValue IS NULL OR BR.CategoryCode = @CategoryCodeMaskValue)
        AND (@SubCategoryCodeMaskValue IS NULL OR BR.SubCategoryCode = @SubCategoryCodeMaskValue)
        AND (@NameMaskType IS NULL AND @NameMaskValue IS NULL 
             OR ((@NameMaskType = 1 AND BR.ItemName LIKE @NameMaskValue + '%')
             OR (@NameMaskType = 2 AND BR.ItemName LIKE '%' + @NameMaskValue)
             OR (@NameMaskType = 3 AND BR.ItemName LIKE '%' + @NameMaskValue + '%')
            ))
    GROUP BY 
        BR.ItemName
    ORDER BY 
        BR.Id
        OFFSET @PageCount * @PageSize ROWS
        FETCH NEXT @PageSize ROWS ONLY;

    UPDATE dbo.BatchActionRequests
    SET PageNumber = @PageCount+1,
    LatestItemId = (SELECT MAX(Id) FROM @Temp)
    WHERE Id = @FileRequestId
Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90
HardLuck
  • 1,497
  • 1
  • 22
  • 43

2 Answers2

1
;WITH CTC 
AS
(
    SELECT MAX(BR.ID) AS Id, BR.ItemName 
    FROM dbo.BatchRecords BR
        LEFT OUTER JOIN dbo.FacilityInstances F ON F.Id = BR.FacilityInstanceId
        WHERE BR.DataBatchId IN (SELECT * FROM @BatchIds)
        AND BR.Id > @LastId
        AND (@FamilyCodeMaskValue IS NULL OR BR.FamilyCode = @FamilyCodeMaskValue)
        AND (@CategoryCodeMaskValue IS NULL OR BR.CategoryCode = @CategoryCodeMaskValue)
        AND (@SubCategoryCodeMaskValue IS NULL OR BR.SubCategoryCode = @SubCategoryCodeMaskValue)
        AND (@NameMaskType IS NULL AND @NameMaskValue IS NULL 
             OR ((@NameMaskType = 1 AND BR.ItemName LIKE @NameMaskValue + '%')
             OR (@NameMaskType = 2 AND BR.ItemName LIKE '%' + @NameMaskValue)
             OR (@NameMaskType = 3 AND BR.ItemName LIKE '%' + @NameMaskValue + '%')
            ))
        GROUP BY 
        BR.ItemName
)
INSERT INTO @Temp (
        Id,
        ItemName,
        Supplier,
        Quantity,
        ItemUnit,
        EntityUnit,
        ItemSize,
        PackageSize,
        PurchaseValue,
        UnitPurchaseValue,
        PackagePurchaseValue,
        CurrencyCode
    )
SELECT  BR.Id,
        BR.ItemName,
        BR.Supplier,
        BR.Quantity,
        BR.ItemUnit,
        BR.EntityUnit,
        BR.ItemSize,
        BR.PackageSize,
        BR.ItemGroup,
        BR.UnitPurchaseValue,
        BR.PackagePurchaseValue,
        C.IsoCode
    FROM CTC t 
    JOIN dbo.BatchRecords BR ON t.Id = BR.Id
    LEFT OUTER JOIN dbo.Currencies C ON C.Id = BR.CurrencyId
    ORDER BY BR.Id
    OFFSET @PageCount * @PageSize ROWS
    FETCH NEXT @PageSize ROWS ONLY;
D Mayuri
  • 456
  • 2
  • 6
1

It looks like a top-n-per-group problem.

There are two common approaches to it: using ROW_NUMBER and CROSS APPLY. Here is the ROW_NUMBER variant. See Get top 1 row of each group for details.

WITH
CTE
AS
(
    SELECT
        BR.Id,
        BR.ItemName,
        BR.Supplier,
        BR.Quantity,
        BR.ItemUnit,
        BR.EntityUnit,
        BR.ItemSize,
        BR.PackageSize,
        -- BR.ItemGroup,???
        BR.UnitPurchaseValue,
        BR.PackagePurchaseValue,
        C.IsoCode AS CurrencyCode,
        ROW_NUMBER() OVER (PARTITION BY BR.ItemName ORDER BY BR.Id DESC) AS rn
    FROM   
        dbo.BatchRecords BR
        LEFT OUTER JOIN dbo.FacilityInstances F ON F.Id = BR.FacilityInstanceId
        LEFT OUTER JOIN dbo.Currencies C ON C.Id = BR.CurrencyId
    WHERE 
        BR.DataBatchId IN (SELECT * FROM @BatchIds)
        AND BR.Id > @LastId
        AND (@FamilyCodeMaskValue IS NULL OR BR.FamilyCode = @FamilyCodeMaskValue)
        AND (@CategoryCodeMaskValue IS NULL OR BR.CategoryCode = @CategoryCodeMaskValue)
        AND (@SubCategoryCodeMaskValue IS NULL OR BR.SubCategoryCode = @SubCategoryCodeMaskValue)
        AND (@NameMaskType IS NULL AND @NameMaskValue IS NULL 
             OR ((@NameMaskType = 1 AND BR.ItemName LIKE @NameMaskValue + '%')
             OR (@NameMaskType = 2 AND BR.ItemName LIKE '%' + @NameMaskValue)
             OR (@NameMaskType = 3 AND BR.ItemName LIKE '%' + @NameMaskValue + '%')
            ))
)
INSERT INTO @Temp (
    Id,
    ItemName,
    Supplier,
    Quantity,
    ItemUnit,
    EntityUnit,
    ItemSize,
    PackageSize,
    -- PurchaseValue,???
    UnitPurchaseValue,
    PackagePurchaseValue,
    CurrencyCode
)
SELECT
    Id,
    ItemName,
    Supplier,
    Quantity,
    ItemUnit,
    EntityUnit,
    ItemSize,
    PackageSize,
    -- PurchaseValue,???
    UnitPurchaseValue,
    PackagePurchaseValue,
    CurrencyCode
FROM CTE
WHERE rn = 1
ORDER BY 
    Id
    OFFSET @PageCount * @PageSize ROWS
    FETCH NEXT @PageSize ROWS ONLY
OPTION(RECOMPILE);

For each ItemName the query will pick the row with the largest Id.

Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90
  • Answer from @d-mayuri is as good as this one, but marking this as an accepted answer because this soluton works ~3% faster – HardLuck Jun 18 '18 at 23:25