1

I have two tables, one is "transactions" table and one is "Membership " table.

A user can have all available membership active at once (each membership gives different benefits).

I'm trying to handle multiply transaction records with the same user id and the same Membership.

Here's my tables

Transactions table

ID | USERID | MID |      CREATED      | AMOUNT
-------------------------------------------------
 1  |   1   |  2  | 2014-10-01 00:00:00 |   1
 2  |   1   |  2  | 2014-10-16 00:00:00 |   1
 3  |   2   |  1  | 2014-10-30 00:00:00 |   1

Membership tables

ID |    TITLE    |  DURATION
-------------------------
 1 |   Premium   |     365
 2 |  Supporter  |     30
 3 | Beta Access |     30

In the transactions table I have 2 records for the userid 1, one that begins at 2014-10-01 and one at 2014-10-16.

The following script works fine for selecting individual active memberships logs

SELECT t.USERID AS UID, t.CREATED AS CREATED, FROM_UNIXTIME(UNIX_TIMESTAMP(t.CREATED) + t.AMOUNT * m.DURATION) AS ENDS
FROM Transactions AS t
LEFT JOIN Memberships AS m on m.ID = t.MID
LIMIT 5

Output would be this,

UID | MID |       CREATED      | ENDS
-----------------------------------------------------
 1  | 2  | 2014-10-01 00:00:00 | 2014-10-31 00:00:00
 1  | 2  | 2014-10-16 00:00:00 | 2014-11-15 00:00:00
 2  | 1  | 2014-10-30 00:00:00 | 2015-10-30 00:00:00

Now there is two records with the same Membership ID (MID) and user ID (UID), and the first record does expire before the second one.

Basically, what I'm trying to do, 'merge' or combine' the total 'unused' amount of days of a memberishp as long as another membership (same userid and same memebrship id) was added before the current has expired.

(Here's an example to display the data given and the wanted output:)

ID | USERID | MID |      CREATED        | Amount
-------------------------------------------------
 1  |   1   |  2  | 2014-10-01 00:00:00 |   1 #30 days days remains
 2  |   1   |  2  | 2014-10-17 00:00:00 |   1 #14 days of the previous transaction is not fully consumed,43 days remains - (days amount +  previous unused days)
 3  |   1   |  2  | 2014-11-01 00:00:00 |   1 #28 days of the previous transaction (44 days ones) is not fully consumed,59 days remains - (days amount +  previous unused days)
 4  |   2   |  3  | 2014-10-01 00:00:00 |   1 #30 days days remains
 5  |   2   |  3  | 2014-11-08 00:00:00 |   1 #30 days days remains

And output should be this

UID | MID |       CREATED       | ENDS
-----------------------------------------------------
 1  | 2  | 2015-10-01 00:00:00 | 2014-12-29 00:00:00
 2  | 1  | 2014-10-01 00:00:00 | 2014-10-30 00:00:00
 2  | 1  | 2014-11-08 00:00:00 | 2014-12-08 00:00:00

I apologize if not being clear, as English is not my native language, and do not have the words to explain what I'm trying to accomplish.

EDIT: Looking for php solution if not possible via mysql.

Hovo
  • 790
  • 4
  • 21
das
  • 547
  • 3
  • 12
  • in the table with 'wanted output', is it intentional that you use id 2 for the last 4 records? shouldn't that be unique? 2014-10-17=>2014-10-31 is 14 days, so I calculate 44 days for the `2014-10-17` transaction, hence, 59 days for the next one. am I doing something wrong? – lp_ Dec 18 '14 at 17:02
  • @lp_ my bad it should have been unique (the id), and yes it should have been 44 days (calculated by 30 days instead 31 days). Thanks for pointing it out, fixed. Will also try your solution. – das Dec 18 '14 at 17:32

1 Answers1

4

I'd say that there are mainly three different approaches and I try to give examples for those. However, all approaches have advantages/disadvantages too...

MySQL query

It seems that such a query requires some sort of recursion... So a potential approach can be similar to this one, and exploit the expression evaluation of MySQL, by using User-Defined Variables in the select statement.

A query that returns the duration and the (extended) end of each transaction record:

SELECT id, userid, mid, created,
  -- initialize if new userid or membership id
  IF (@lastuser!=userid or @lastmb!=mid, (@prevend:=created)+(@lastuser:=userid)+(@lastmb:=mid), 0) AS tmp,
  -- calculate unused days
  @unused:=IF(@prevend>created, datediff(@prevend, created), 0) AS tmp2, 
  -- calculate the end of current membership (will be used for next record)
  @prevend:=DATE_ADD(created, INTERVAL (amount * duration)+@unused DAY) AS ends,
  -- calculate the days remaining
  @unused+duration AS 'days remain'
FROM (
  SELECT tt.id, tt.userid, tt.mid, tt.created, tt.amount, duration
  FROM transactions tt
  LEFT JOIN memberships as m on m.ID = tt.MID
  ORDER BY tt.userid, tt.created) t
JOIN (SELECT @lastuser:=0)tmp;

The output of this query is:

id  userid mid  created             tmp     tmp2 ends                   days remain
1   1      2    2014-10-01 00:00:00 2017    0    2014-10-31 00:00:00    30
2   1      2    2014-10-17 00:00:00 0       14   2014-11-30 00:00:00    44
3   1      2    2014-11-01 00:00:00 0       29   2014-12-30 00:00:00    59
4   2      3    2014-10-01 00:00:00 2019    0    2014-10-31 00:00:00    30
5   2      3    2014-11-08 00:00:00 0       0    2014-12-08 00:00:00    30

It is still another task, to output just the merged intervals:

SELECT userid, mid, begins, max(ends) as ends FROM (
    SELECT id, userid, mid, created,
      -- initialize if new userid or membership id
      IF (@lastuser!=userid or @lastmb!=mid, (@prevend:=created)+(@lastuser:=userid)+(@lastmb:=mid), 0) AS tmp,
      -- firstcreated stores the very first creation time of overlapping memberships 
      if (@prevend>created, @firstcreated, (@firstcreated:=created)) as begins,
      -- calculate unused days
      @unused:=IF(@prevend>created, datediff(@prevend, created), 0) AS tmp2, 
      -- calculate the end of current membership (will be used for next record)
      @prevend:=DATE_ADD(created, INTERVAL (amount * duration)+@unused DAY) AS ends,
      -- calculate the days remaining
      @unused+duration AS 'days remain'
    FROM (
      SELECT tt.id, tt.userid, tt.mid, tt.created, tt.amount, duration
      FROM transactions tt
      LEFT JOIN memberships as m on m.ID = tt.MID
      ORDER BY tt.userid, tt.created) t
    JOIN (SELECT @lastuser:=0)tmp
) mship 
GROUP BY userid, mid, begins;

Note, however, that this is really not an advisable solution, because the evaluation order of the expressions is not guaranteed. So the query might produce a good result, but with different data set, or with a different MySQL release, it may easily produce a bad result. In the proposed query, there is a subquery with an order by clause, so record order should not become a problem here, but if you want to put this query in a code that you wish to maintain longer, you might get surprised when you migrate to a new version of MySQL, for instance.

At least, it seems to be working on MySQL 5.5 and MySQL 5.6 too.

So caution again, because as the MySQL documentation says:

As a general rule, other than in SET statements, you should never assign a value to a user variable and read the value within the same statement. For example, to increment a variable, this is okay:

SET @a = @a + 1; For other statements, such as SELECT, you might get the results you expect, but this is not guaranteed. In the following statement, you might think that MySQL will evaluate @a first and then do an assignment second:

SELECT @a, @a:=@a+1, ...;

However, the order of evaluation for expressions involving user variables is undefined.

Calculate everything on client (application) side (e.g. in PHP)

The idea is the same. Fetch transactions ordered by userid, mid and creation date. Iterate over the records and for each new transaction extend the duration of the membership with the 'unused' dates (if there are any), which can be calculated from the previous transaction. When we see that there is a break in the membership, we save the actual period.

A sample PHP code to do that:

<?php
$conn = mysqli_connect("localhost", "user", "password", "db");
if (mysqli_connect_errno($conn)) {
    echo "Failed to connect to MySQL: " . mysqli_connect_error();
}

// Query to select membership information
$res = mysqli_query($conn, "select t.id, userid, mid, created, (m.duration * t.amount) as duration
    from transactions t
    left join memberships m
    on t.mid=m.id
    order by userid, mid, created");

echo "Temporary calculation:<br/>";
echo "<table border=\"1\">";
echo "<th>Id</th><th>UserId</th><th>MID</th><th>Created</th><th>Unused</th><th>End</th><th>Duration</th>";

$last_userid=0;
while ($row = $res->fetch_assoc()) {
    // Beginning of a new userid or membership id
    if ($row['userid']!=$last_userid or $row['mid']!=$last_mid) {
        // If we are not at the first record, we save the current period
        if ($last_userid!=0) {
            $mships[$last_userid][$last_mid][$first_created->format('Y-m-d H:i:s')]=$last_end->format('Y-m-d H:i:s');
        }

        // Initialize temporaries
        $last_userid=$row['userid'];
        $last_mid=$row['mid'];
        $first_created=new DateTime($row['created']);
        $last_end=clone $first_created;
    }

    // Calculate duration
    $created=new DateTime($row['created']);
    $unused=date_diff($created, $last_end);
    $ends=clone $created;
    $ends->add(new DateInterval("P".$row['duration']."D"));

    // $unused->invert is 1 if diff is negative 
    if ($unused->invert==0 && $unused->days>=0) {
        // This transaction extends/immediately follows the previous period
        $ends->add(new DateInterval('P'.$unused->days.'D'));
    } else {
        // We split the period -> save it!
        $mships[$row['userid']][$row['mid']][$first_created->format('Y-m-d H:i:s')]=$last_end->format('Y-m-d H:i:s');
        $first_created=new DateTime($row['created']);
    }

    $duration=date_diff($ends, $created);   

    echo "<tr>";
    echo "<td>",$row['id'],"</td>";
    echo "<td>",$row['userid'],"</td>";
    echo "<td>",$row['mid'],"</td>";
    echo "<td>",$row['created'],"</td>";
    echo "<td>",($unused->invert==0 ? $unused->format('%a') : 0),"</td>";
    echo "<td>",$ends->format('Y-m-d H:i:s'),"</td>";
    echo "<td>",$duration->format('%a'),"</td>";
    echo "</tr>";

    $last_end=$ends;
}
// Last period should be saved
if ($last_userid!=0) {
    $mships[$last_userid][$last_mid][$first_created->format('Y-m-d H:i:s')]=$last_end->format('Y-m-d H:i:s');
}

echo "</table><br/>";

echo "Final array:<br/>";
echo "<table border=\"1\">";
echo "<th>UserId</th><th>MID</th><th>Created</th><th>End</th>";

foreach ($mships as $uid => &$mids) {
    foreach ($mids as $mid => &$periods) {
        foreach ($periods as $begin => $end) {
            echo "<tr>";
            echo "<td>",$uid,"</td>";
            echo "<td>",$mid,"</td>";
            echo "<td>",$begin,"</td>";
            echo "<td>",$end,"</td>";
            echo "</tr>";               
        }
    }
}

$conn->close();

?>

(Honestly, it has been a few years since I last wrote anything in php :) So feel free to reformat or use some nicer solution.)

Output should look like:

Temporary calculation:
Id  UserId  MID Created             Unused  End                 Duration
1   1       2   2014-10-01 00:00:00 0       2014-10-31 00:00:00 30
2   1       2   2014-10-17 00:00:00 14      2014-11-30 00:00:00 44
3   1       2   2014-11-01 00:00:00 29      2014-12-30 00:00:00 59
4   2       3   2014-10-01 00:00:00 0       2014-10-31 00:00:00 30
5   2       3   2014-11-08 00:00:00 0       2014-12-08 00:00:00 30

Final results:
UserId  MID Created             End
1       2   2014-10-01 00:00:00 2014-12-30 00:00:00
2       3   2014-10-01 00:00:00 2014-10-31 00:00:00
2       3   2014-11-08 00:00:00 2014-12-08 00:00:00

MySQL stored procedure

It is also possible to calculate the result set with a stored procedure. E.g. a sample procedure with the same algorithm as the previous PHP code.

DROP PROCEDURE IF EXISTS get_memberships;

delimiter //

CREATE PROCEDURE get_memberships()
BEGIN
  DECLARE done INT DEFAULT FALSE;
  DECLARE uid, mid, duration INT;
  DECLARE created, unused, first_created, ends TIMESTAMP;
  -- make sure that there is no user with 0 id
  DECLARE last_uid, last_mid INT DEFAULT 0;
  DECLARE last_end TIMESTAMP;
  DECLARE cur CURSOR FOR SELECT t.userid, t.mid, t.created, (m.duration * t.amount) as duration
    FROM transactions t
    LEFT JOIN memberships m
    ON t.mid=m.id
    ORDER BY userid, mid, created;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

  OPEN cur;

  REPEAT
    FETCH cur INTO uid, mid, created, duration;
    IF (!done) THEN
        IF (uid!=last_uid OR last_mid!=mid) THEN
            IF (last_uid!=0) THEN
                INSERT INTO results (userid, mid, created, ends) VALUES (last_uid, last_mid, first_created, last_end);
            END IF;

            SET last_uid = uid;
            SET last_mid = mid;
            SET last_end = created;
            SET first_created = created;
        END IF;

        SET ends = DATE_ADD(created, INTERVAL duration DAY);

        IF (last_end>=created) THEN
            SET ends = DATE_ADD(ends, INTERVAL datediff(last_end, created) DAY);
        ELSE
            INSERT INTO results (userid, mid, created, ends) VALUES (uid, mid, first_created, last_end);
            SET first_created = created;
        END IF;

        SET last_end = ends;
    END IF;
  UNTIL done
  END REPEAT;

  IF (last_uid!=0) THEN
    INSERT INTO results (userid, mid, created, ends) VALUES (uid, last_mid, first_created, last_end);
  END IF;

  CLOSE cur;
END
//

DROP TABLE IF EXISTS results //
CREATE TEMPORARY TABLE results AS SELECT userid, mid, created, created as ends FROM transactions WHERE 0 //
call get_memberships //
SELECT * FROM results //
DROP TABLE results;

However, one disadvantage of this technique is the usage of a temporary table.

Community
  • 1
  • 1
lp_
  • 1,158
  • 1
  • 14
  • 21
  • This looks promising, never knew of User-Defined Variables or though mysql has it (I'm not using mysql on a regular base, just to control users and membeships). – das Dec 18 '14 at 17:37
  • The scripts almost achieves what I want to, but the calculation is incorrect, and cannot understand what goes wrong. – das Dec 18 '14 at 19:19
  • just one thing to clarify. a new membership can extend just the same type of membership, or it does not matter? e.g. if I have a supporter membership which will expire at 12.31. and I buy a beta access today, will my membership last until January 31? or that's two different thing and I still have my supporter ms until 12.31 and a beta access until 01.18? (because the query above will merge these together) – lp_ Dec 18 '14 at 19:55
  • Two different things. – das Dec 18 '14 at 21:51
  • well, that might be a reason, why the query failed. I updated the answer which handles different memberships, and gave it a try to implement the same in php (the idea is the same...). even if it is not perfect, I hope that it helps to guide you towards a working solution. – lp_ Dec 19 '14 at 00:59
  • `**Calculate everything on client side (PHP)**` This is totally wrong! PHP is serverside. – GuyT Dec 19 '14 at 13:49
  • @GuyT is correct. The php solution seems like it would work, just worried about performance (some users has 10 different membership with many record). – das Dec 19 '14 at 16:49
  • @GuyT: from the database point of view, PHP is client side. and the question (at least the original one) is rather about the database side... if it was Java, JS, ..., the point is the same. – lp_ Dec 19 '14 at 16:52
  • I updated the answer with a sample stored procedure, and played a little bit the query. I do not claim that these are perfect solutions, each of them have disadvantages :-( – lp_ Dec 19 '14 at 17:26
  • @lp_ what you provided is more than I needed!. Thank you so much for your efforts, the php part gets what I want, gotta tweak it around, because I fetch lot's of data at once from database every request (about 70k queries per hour just from my application). – das Dec 20 '14 at 02:00
  • The stored procedure sounds great, but I'm worried about the usage of temporary table, when doing 1150 queries per minute (mainly to fetch membership), how likely is it to cause problems? – das Dec 20 '14 at 02:03
  • Nice load... I am sure that the PHP code can be improved, I rather wanted to make the algorithm more understandable, instead of optimizing it. I just used a simple associative array, for example. If you have such a load I'd ask: do you always need to query the membership of all the users? There is a trick, to avoid the temporary table in the SP by preparing a string (e.g. formatted as csv) and return it. But it works, if the result set fits into a string. If you query only one user at once, it might work. – lp_ Dec 20 '14 at 20:55
  • I am not sure which one to recommend, honestly. You can win something with the stored procedure as it minimizes traffic between your db and its clients, or it probably better optimizes the query, cache usage, etc. But you give extra load to your db, which seems to be already heavily loaded... If I were you, I'd definitely perform some tests to support the decision, but you probably already have plans for that. – lp_ Dec 20 '14 at 21:11
  • @lp_ I really can't agree with that. PHP is executed on the server and returns a parsed HTML page. I understand your point(database pov), but this is simply not true. I haven't down-voted your answer, though. OP: FYI you might want to use load balancers to improve stability and performance. – GuyT Dec 22 '14 at 08:41
  • I slightly modified the "client side" header as it was probably confusing. Thanks @GuyT for pointing that out. – lp_ Dec 22 '14 at 09:31