1

This seems simple but I'm banging my head against the wall to figure it out.

I've got a single table of Loan data that I want to pivot and summarize for each month the payments were due. This is similar to a "static pool analysis" with each pool of loans for rows and the months as columns. You can see the sample data using this SQL Fiddle.

I've created a 90 second screencast to better explain the pivot and summary results I need if that helps. Thanks for your help!

Damon
  • 21
  • 2

1 Answers1

2

If it doesn't need to be dynamic, using a CASE WHEN or IF statement within the SUM should function as a PIVOT in MySQL:

SELECT 
  PoolMonth, 
  SUM(OriginationAmt) AS Origination,
  SUM(IF(PmtDue = 201512, AmtPaid, 0)) AS `201512`,
  SUM(IF(PmtDue BETWEEN 201512 AND 201601, AmtPaid, 0)) AS `201601`,
  SUM(IF(PmtDue BETWEEN 201512 AND 201602, AmtPaid, 0)) AS `201602`
FROM
  Loans
GROUP BY
  PoolMonth;

http://sqlfiddle.com/#!9/47701/16

This is a bit verbose to get the total OriginationAmt, the running total by PoolMonth/PmtDue, and only output the latest running totals, without hardcoding anything, but here we go :-)

SELECT
  t.PoolMonth,
  t.TtlOriginationAmt,
  t.PmtDue,
  t.RtAmtPaid
FROM
  (
    SELECT 
      l.PoolMonth,
      l.OriginationAmt,
      orig.TtlOriginationAmt,
      l.PmtDue,
      /* Row_Number() equivalent for MySQL http://blog.sqlauthority.com/2014/03/09/mysql-reset-row-number-for-each-group-partition-by-row-number/ */
      /* Assign a Row Number for each Payment Due month for the individual Pool month in ascending order (ORDER BY clause important in this subquery) */
      @RowNumber := CASE WHEN @PoolMonth = l.PoolMonth AND @PmtDue = l.PmtDue THEN @RowNumber + 1 ELSE 1 END AS PoolPmtRowNumber,
      /* Use the total count of PmtDue month rows for each PoolMonth so we can limit our results to the final row */
      lr.PoolPmtLastRow,
      l.AmtPaid,
      /* Running total of Amount Paid for the individual Pool month in order of Payment Due month (ORDER BY clause important in this subquery) */
      @RtAmtPaid := CASE WHEN @PoolMonth = l.PoolMonth THEN @RtAmtPaid + l.AmtPaid ELSE l.AmtPaid END AS RtAmtPaid,
      /* Keep track of the Pool month we're totalling */
      @PoolMonth := l.PoolMonth,
      /* Keep track of the Payment Due month we're ordering */
      @PmtDue := l.PmtDue
    FROM
      Loans l
    JOIN
      /* Get the Total Origination Amount */
      (SELECT PoolMonth, SUM(OriginationAmt) AS TtlOriginationAmt FROM Loans GROUP BY PoolMonth) orig ON orig.PoolMonth = l.PoolMonth
    JOIN
      /* Get the total number of records by Pool/Payment due month so we can filter to the last row */
      (SELECT PoolMonth, PmtDue, COUNT(1) AS PoolPmtLastRow FROM Loans GROUP BY PoolMonth, PmtDue) AS lr ON lr.PoolMonth = l.PoolMonth AND lr.PmtDue = l.PmtDue
    CROSS JOIN
      /* Reset the variables we need for tracking */
      (SELECT @RtAmtPaid:=0,@PoolMonth:=0,@PmtDue:=0,@RowNumber:=0) var
    /* Order by Pool/Payment Due month */
    ORDER BY
      l.PoolMonth,
      l.PmtDue
  )t
WHERE
  /* Filter to the last row */
  t.PoolPmtRowNumber = t.PoolPmtLastRow;

http://sqlfiddle.com/#!9/47701/45

From there, it should be really easy to pivot your results in Excel or anywhere else.

vanlee1987
  • 181
  • 1
  • 7
  • Thank you for taking time to answer. Yes, this works, however, as you noted my scenario is dynamic and the months in the PmtDue column are not the same which is why the pivot on that column is required. – Damon Feb 11 '16 at 04:29
  • I'll update my answer in a bit, but you're essentially looking for this: http://stackoverflow.com/questions/12004603/mysql-pivot-row-into-dynamic-number-of-columns I typically avoid dynamic SQL in a query, since it's hard to read and I can typically move that kind of dynamic pivoting into the application layer (Excel, Tableau, Web Application, etc) more clearly. – vanlee1987 Feb 11 '16 at 15:53
  • I'm actually ok with using Excel to pivot the data as long as it can be summarized the same way. Excel will allow me to add filters on the data later. – Damon Feb 11 '16 at 18:18
  • This should get you an output that is much easier to work without having to specify the ranges individually. – vanlee1987 Feb 12 '16 at 01:15