5

This question is for my pastebin app written in PHP.

I did a bit of a research, although I wasn't able to find a solution that matches my needs. I have a table with this structure:

+-----------+------------------+------+-----+---------+----------------+
| Field     | Type             | Null | Key | Default | Extra          |
+-----------+------------------+------+-----+---------+----------------+
| id        | int(12) unsigned | NO   | PRI | NULL    | auto_increment |
| author    | varchar(50)      | YES  |     |         |                |
| authorid  | int(12) unsigned | YES  |     | NULL    |                |
| project   | varchar(50)      | YES  |     |         |                |
| timestamp | int(11) unsigned | NO   |     | NULL    |                |
| expire    | int(11) unsigned | NO   |     | NULL    |                |
| title     | varchar(25)      | YES  |     |         |                |
| data      | longtext         | NO   |     | NULL    |                |
| language  | varchar(50)      | NO   |     | php     |                |
| password  | varchar(60)      | NO   |     | NULL    |                |
| salt      | varchar(5)       | NO   |     | NULL    |                |
| private   | tinyint(1)       | NO   |     | 0       |                |
| hash      | varchar(12)      | NO   |     | NULL    |                |
| ip        | varchar(50)      | NO   |     | NULL    |                |
| urlkey    | varchar(8)       | YES  | MUL |         |                |
| hits      | int(11)          | NO   |     | 0       |                |
+-----------+------------------+------+-----+---------+----------------+

This is for a pastebin application. I basically want paste revisions so that if you open paste #1234, it shows all past revisions of that paste.

I thought of three ways:

Method 1

Have a revisions table with id and old_id or something and for each ID, I would insert all old revisions, so if my structure looks like this:

rev3: 1234
rev2: 1233
rev1: 1232

The table will contain this data:

+-------+----------+
| id    | old_id   |
+-------+----------+
| 1234  | 1233     |
| 1234  | 1232     |
| 1233  | 1232     |
+-------+----------+

The problem which I have with this is that it introduces a lot of duplicate data. And the more the revisions get, it has not only more data but I need to do N inserts for each new paste to the revisions table which is not great for a large N.

Method 2

I can add a child_id to the paste table at the top and just update that. And then, when fetching the paste, I will keep querying the db for each child_id and their child_id and so on... But the problem is, that will introduce too many DB reads each time a paste with many revisions is opened.

Method 3

Also involves a separate revisions table, but for the same scenario as method 1, it will store the data like this:

+-------+-----------------+
| id    | old_id          |
+-------+-----------------+
| 1234  | 1233,1232       |
| 1233  | 1232            |
+-------+-----------------+

And when someone opens paste 1234, I'll use an IN clause to fetch all child paste data there.

Which is the best approach? Or is there a better approach? I am using Laravel 4 framework that has Eloquent ORM.

EDIT: Can I do method 1 with a oneToMany relationship? I understand that I can use Eager Loading to fetch all the revisions, but how can I insert them without having to do a dirty hack?

EDIT: I figured out how to handle the above. I'll add an answer to close this question.

duellsy
  • 8,497
  • 2
  • 36
  • 60
Sayak Banerjee
  • 1,954
  • 3
  • 25
  • 58
  • modification of method 1 i will recommend. ID should be `unique`. and add the revisions in other table. – itachi Aug 29 '13 at 05:18
  • Is the revision stored in tree-like structure or linear structure? – invisal Aug 29 '13 at 05:28
  • Lets say that 1232 is the original, 1233 add some changes to 1232 and 1234 adds some changes to 1232. Then, someone directly add change to 1232 without goes through 1234 (called it 1235). What do you want to deal with this case? – invisal Aug 29 '13 at 05:30
  • Which Laravel version is this for? – Abishek Aug 29 '13 at 05:46
  • @invisal: The same way. If 1235 is a new revision for 1232, the id will be 1235 and old_id will be 1232. So when I fetch 1235, it will only show 1232 as a revision. – Sayak Banerjee Aug 30 '13 at 01:54

3 Answers3

8

If you are on Laravel 4, give Revisionable a try. This might suite your needs

Abishek
  • 11,191
  • 19
  • 72
  • 111
  • Thanks, yes I'm on L4, let me try this out. :) – Sayak Banerjee Aug 30 '13 at 01:55
  • 3
    Well, while this is a great solution, it stores the data as old_value and new_value, so I'll always have two copies of new_value, one in my main table and one in the revisions table. I think I might end up writing my own versioning system :( – Sayak Banerjee Aug 30 '13 at 03:08
2

So here is what I am doing:

Say this is the revision flow:

1232 -> 1233 -> 1234
1232 -> 1235

So here is what my revision table will look like:

+----+--------+--------+
| id | new_id | old_id |
+----+--------+--------+
| 1  | 1233   | 1232   |
| 2  | 1234   | 1233   |
| 3  | 1234   | 1232   |
| 4  | 1235   | 1232   |
+----+--------+--------+

IDs 2 and 3 show that when I open 1234, it should show both 1233 and 1232 as revisions on the list.

Now the implementation bit: I will have the Paste model have a one to many relationship with the Revision model.

  • When I create a new revision for an existing paste, I will run a batch insert to add not only the current new_id and old_id pair, but pair the current new_id with all revisions that were associated with old_id.
  • When I open a paste - which I will do by querying new_id, I will essentially get all associated rows in the revisions table (using a function in the Paste model that defines hasMany('Revision', 'new_id')) and will display to the user.

I am also thinking about displaying the author of each revision in the "Revision history" section on the "view paste" page, so I think I'll also add an author column to the revision table so that I don't need to go back and query the main paste table to get the author.

So that's about it!

Sayak Banerjee
  • 1,954
  • 3
  • 25
  • 58
  • When you are simply querying the main table, how do you tell the difference between the current version and older versions of the same data? – Seb Barre Jul 13 '14 at 13:46
  • @SebBarre I never have a scenario where the revisions table will point to two identical entries. That's because I compare the crc32 hash for the new (POSTed) data with the last revision and create a new revision only if the data is different. See: https://github.com/sayakb/sticky-notes/blob/master/app/controllers/CreateController.php#L245 – Sayak Banerjee Jul 13 '14 at 22:13
1

There are some great packages to help you keeping model revisions:

  • If you only want to keep the models revisions you can use Revisionable

  • If you also want to log any other actions, whenever you want, with custom data, you can use Laravel Activity Logger

Honorable mentions:

Ben Everard
  • 13,652
  • 14
  • 67
  • 96