4

The ERP system we're migrating to requires csv files with 5,000 or less rows for the GL. The debit and credit transactions within each file must balance to zero. There are multiple debit and credit transaction rows that share a common transaction ID.

Using offset and fetch next I've been able to extract 5000 rows at a time, however the credits and debits do not balance.

Data Example:

TranID  Credit  Debit   Balance Account#
1       250     0       250     ABC
1       0       250     0       DEF
2       0       100     -100    GHI
2       50      0       -50     JKL
2       50      0       0       MNO


declare @batchsize INT = 5000,
    @loopcount INT = 0;

while (@loopcount*@batchsize < (select  count(*) from [Apps2].[dbo].[GLTrans]))
begin
  SELECT *  FROM [Apps2].[dbo].[GLTrans]
  ORDER BY tranID
  offset (@batchsize * @loopcount) rows
  fetch next (@batchsize) rows only

  set @loopcount= @loopcount + 1
end
Dale K
  • 25,246
  • 15
  • 42
  • 71
Tomas789
  • 43
  • 3
  • So each transaction sums to zero? So the problem is how to find as many rows as possible up to 5000 without splitting a transaction? – Rup Feb 19 '19 at 00:40
  • Yes Rup. You're correct, as many rows as possible up to 5000 without splitting a transaction (tranid). – Tomas789 Feb 19 '19 at 00:40
  • Compute a running total into a temp table along with row number and id and then find the highest row number less than 5000 where the running total is zero? – Dale K Feb 19 '19 at 00:47
  • https://stackoverflow.com/questions/860966/calculate-a-running-total-in-sql-server – Dale K Feb 19 '19 at 00:57
  • 3
    Select 5000 rows and then discard those with "last" transaction ID. – SMor Feb 19 '19 at 02:24
  • Use common table expression to get your desired result. – DarkRob Feb 19 '19 at 05:36
  • can we assume the total of all debits and credits per TranID would equal 0? – Tim Mylott Feb 19 '19 at 15:20

2 Answers2

4

A simple solution is pre-process all the transactions and assign a batch no (for each CSV files). The temp table stored the number of lines per TranID.

It is assumed that the Debit & Credit will balance for each TranID.

After that you can generate the CSV based on the temp table.

-- create the temp table
create table #trans
(
    TranID      int identity,
    Cnt         int,
    Batch       int
)

-- populate the temp table
insert into #trans (TranID, Cnt)
select TranID, Cnt = count(*)
from   [Apps2].[dbo].[GLTrans]
group by TranID

declare @batchsize  int = 5000,
        @batch      int = 1

while   exists (select * from #trans where Batch is null)
begin
    update  t
    set     Batch   = @batch 
    from
    (
        select  *, cumm = sum(Cnt) over (order by TranID)
        from    #trans
        where   Batch   is null
    ) t
    where   cumm    <= @batchsize 

    select  @batch = @batch + 1
end

--  Verify
select  *, sum(Cnt) over (partition by Batch order by TranID)
from    #trans 
order by TranID 
Squirrel
  • 23,507
  • 4
  • 34
  • 32
  • Squirrel, This worked for my initial pass. My users would like to have a sub-batch within the batch of 5000 we created that splits the transactions into groups of 500 or less. I've tried reusing the code above within it's self to do this but keep running into errors. Any insight on how you might accomplish this? – Tomas789 Mar 01 '19 at 18:43
  • don't quite understand your requirement for the sub-batch. Perhaps you can start a new question and post your query – Squirrel Mar 02 '19 at 07:30
3

Use a table variable to iterate through your data. Kinda like using a cursor in Oracle...

If I'm understanding your sample data correctly and my assumption that each transID set nets to 0, you change your loop logic to function more like a do...while like this example here where you grab the next transaction set and decide if it keeps the batch under 5k. This should cover populating one batch of 5000 rows or fewer rows that net to $0 assuming that each transaction ID set nets to $0

Declare @batchCursor TABLE (
    TransID INT,
    Credit INT, -- chose int for expediency 
    Debit INT,
    Balance INT,
    AccountNo Varchar(4)

),

@batchsize INT = 5000,
@rowCount INT = 0,
@transID INT = 1,
@transSize INT = 0;

while (@rowcount <= 5000)
BEGIN
    INSERT INTO @batchCursor
    SELECT * FROM [Apps2].[dbo].[GLTrans] -- you might need to enumerate all your column names
    WHERE TransID = @transID;

    SELECT @transSize = COUNT(*) FROM @batchCursor where TransID = @transID);

    IF(@transSize > 0)
        BEGIN
        IF (@transSize + @rowCount < @batchSize)
           BEGIN
           Set @rowCount += transSize;
           Set @transID += 1;
           END;
        END;
    ELSE Set @transID += 1;

IF((Select count(*) FROM [Apps2].[dbo].[GLTrans] WHERE TransID = @transID) + @rowCount > @batch)
BREAK;

END;
cod3-jr
  • 88
  • 6