Comment system design

Question

Here is my current comment system design:

enter image description here

I'm developing it for a website that has lots of areas, blogs, tutorials, manuals etc etc. As supposed to developing a separate comment table for each (tblBlogComments, tblTutorialComments) etc etc, I'm trying to go for a one structure fits all approach.

This way, I can turn the comment system into a web control, and just drop it on any page that I want comments for. It means I only have one set of rules, one set of code files to maintain.

The only problem is, is coming up with a 'nice' way to determine which section (blog/tutorial/manual) belongs to.

For example, one solution would be:

tblComment
-------------
Section (int)
SectionIdentifier (int)

Where 'Section' maps to a unique to each part of the site, EG:

Blog = 1
Articles = 2
Tutorials = 3
...

A SectionIdentifier is some sort of unique ID for that page, eg:

ViewBlog.aspx?ID=5

This would be section 1, identifier 5. So now, a comment with Section = 1, SectionIdentifier = 5 means it's a comment for blog entry number 5.

This works great, but at the cost of maintainability, and a solid structure, as the SectionIdentifier is anonymous and no relationships can be built.

Is this design OK, or is there a better solution (IE some sort of parent table for a comment?)

@Martin, what do you mean? I'm not really sure how else to approach this problem :-( Section identifier might actually be more semantic, I'll update question — Tom Gullen, Apr 15 '11 at 14:30
It is a joke... IdentifierIdentifier - just mind-boggling. :-) — Martin Liversage, Apr 15 '11 at 14:33
@Martin oh ok :) Yeah this is one of those instances also where naming can get a bit ambiguous which I why I'd rather an alternative solution — Tom Gullen, Apr 15 '11 at 14:34

Jeffrey L Whitledge · Accepted Answer · 2011-04-15T15:04:38.203

In Codd's original designed for the Relational Model, a foreign key could reference multiple primary keys in different tables, and the referential integrity was valid if any one table contained the value.

Unfortunately, SQL is a pale reflection of that original vision, since it does not provide this ability, as you have noted.

One standard work-around is to create a new relation that holds the keys to all of the others. But that's not a very good solution in this case, since it creates a point of contention if lots of inserts are happening at once.

The way I would handle this is to create a value—let’s call it a Comment-Anchor—that you can put into every table that is to have comments. This value (unlike all the other keys in a well-designed database) should be a GUID. Then each comment can have a Comment-Anchor that indicates which value it is in reference to.

By making it a GUID, you can always insert unique values in your blog or tutorial or whatever, without contention. You do not have to maintain a master-list of Comment-Anchors anywhere, and no section contends with or is blocked by any other section.

This will work well for the normal use-case of finding all the comments for a single blog entry, for example. To go the other way, from comment to the thing that is being commented on, you could put a flag in the comment table identifying which table is being refrenced, but I wouldn't do that. I would just search all the tables, maybe with a view or something. The reverse query would be rare enough, that I don't see much point in maintaining infrastructure for it, and the flag would be redundant data, which is the bane of RDBMSs.

One additional benifit of this system is that it is easily extensible. If you create a new type of data, or decide to add comments to an existing type of data, then you need only add the Comment-Anchor column to the table. No additional work must be done on the database side. And even the middleware portion that handles the comments does not need to be modified in any way, since it has no knowledge of what sorts of things take comments.

This would be a nice compromise between what I'm suggesting and willy-nilly undefined foreign key references. If you don't want to/can't/doesnt-make-sense to create an extensible framework for your sections, this would be a reasonable hack. It's kind of hokey, but probably the best workable hack to choose from. — Pete M, Apr 15 '11 at 14:58
This looks like a good solution, thanks for the suggestion! Anchoring them this way looks like the best solution definitely. — Tom Gullen, Apr 15 '11 at 15:08

score 2 · Answer 2 · answered Apr 15 '11 at 14:39

2

For a table design, I would model it as closely as possible to what the class structure seems to be in this case. From what you have said, this is what it looks like (roughly):

Section <- Post <- Comment

So, you'd have:

a Section table (eg. blog, articles, tutorials, etc.)
a Post table (for the individual posts in each section)
a Comment table (for the comments on each post)

Each post would have a reference to it's section, and each comment would have a reference to it's post. The DB could have the references as nice, clean foreign keys, and the classes could have lists on one or both sides of the relationships as your app needs them.

To me, that seems like a nice, simple, flexible structure that doesn't complicate things and still allows you to hang extra bits like edits and votes off of it.

answered Apr 15 '11 at 14:39

cdeszaq

30,869
25
117
173

Agreed -- section of code that the comment applies to can be derived from the "owning" post. Adding `Section` to the table would be adding redundant data, which might be a useful denormalization step IF you often go to the Comments tables to process data for comments from a given Section (as opposed to a single post). – Philip Kelley Apr 15 '11 at 14:45
1

@Philip Kelley - The benefit in denormalizing by pulling the section id into the table would be so small in all but the most massive of scales that the integrity and maintenance issues are far more costly than the performance benefit of that denormalization. That said, you are correct in that it _could_ be done :) – cdeszaq Apr 15 '11 at 14:48
Ok, so would you say something like having a table `tblBlogComment` with fields `Comment ID, Blog Entry ID` be a better solution? Is this sort of what you meant? Each section has a table that joins it to comments? – Tom Gullen Apr 15 '11 at 14:53

Pete M · Answer 3 · 2011-04-15T15:04:58.080

1

I would steer clear of creating an id column that defines a different relationship depending on another column in the same table. For instance, in your example SectionIdentifier could represent any number of foreign key references depending on the value of Section. That skeeves me out on general principle. It also leaves several benefits of modern RDBMS platforms on the table since it's not supported.

How is your general architecture for these different sections layed out? I've worked with a few CMS's that would require each of your sections to share a common base entity, calling it a "module" or "plug in". Each instance of a given module then has it's own id, which is used to map to any content required by that specific instance.

If this is a viable architecture direction for you, you could also use that ModuleInstanceID as the foreign key for your comments. You'd just have to decide how you register a given type of module/plug in as being a valid target for comments.

Anyway, can you shed a little light on how your sections are put together under the hood?

edited Apr 15 '11 at 15:04

answered Apr 15 '11 at 14:49

Pete M

2,008
11
17

Thanks Pete, it's all custom built, most sections are similar structure, tblBlogEntry.ID, tblTutorial.ID but then possible we'll want comments on user profiles, so tblUser.ID, etc. I can't think of any instance where there isn't basically just an ID field, all the sections are in different tables though, they don't inherit anything from each other. – Tom Gullen Apr 15 '11 at 14:51
Who is the target audience? Would users regularly be creating their own blogs/tutorials/etc? How will this system be used? – Pete M Apr 15 '11 at 14:56
Users will eventually be generating a lot of content yes, tutorials definitely, blogs possibly. System is going to be very open and used by everyone. – Tom Gullen Apr 15 '11 at 14:58
Will there ever be anything other than tutorials/blogs/articles? – Pete M Apr 15 '11 at 14:59
Yeah, new sections will be added at some point down the line, IE, manual, possibly separate user blog table, arcade games etc. It's going to be a pretty large site (current one already has a lot of visitors but is very basic). – Tom Gullen Apr 15 '11 at 15:00
Ok, I would strongly, STRONGLY recommend taking a look at creating a concept of "modules" and "module instances" in that case. That will be your extensibility point when it comes time to add new sections. Now for the $LargeSum question... Why roll your own? Any reason not to use a pre-existing .NET CMS that has already solved these problems for you? – Pete M Apr 15 '11 at 15:03
Modules could be done but I find they actually interfere with development as they aren't as flexibile. Doing my own CMS for 2 reasons, 1. It gives me total control over content, every single element on the site needs to interact in sometimes quite complex and custom ways (also I want total control and maintainability over output for SEO etc) and 2. I'm relatively inexperienced with .net (but not in webdev years experience in classic ASP) and want to learn ASP.net which I do best when I put it into practise. – Tom Gullen Apr 15 '11 at 15:07
In what ways have you encountered flexibility issues? It may be a terminology thing... In this context, "module" merely means "something which can be put on a page and do arbitrary things". SEO output can be a bit thorny, I'll give you that one. I would argue you'd see more benefit learning ASP.NET by being ***forced*** into best practices against an existing, proven framework. If nothing else, I'd say download a couple and see how they solved these issues. What I"m suggesting is probably the most common design cue you'll come across... for good reason. ;o) – Pete M Apr 15 '11 at 15:10
Inflexibility as in a lot of module tables that attempt to enclose something arbitrary actually hinder development, and make changing some higher level stuff next to impossible. I'm really quite anti CMS to be honest, most applications of them seem to look 'prebuilt' if that's a fair comment I don't know... For a lot of people I'm sure they are right, but for me on this one they aren't, time isn't a constraint luckily. This question is really about a database design issue, I feel I have a solid understanding of database design but this is one problem I haven't really come across before. – Tom Gullen Apr 15 '11 at 15:14
That was a very vague description of inflexibility. You do realize you're creating a CMS, right? Forget I used that acronym... Logically, you have a lot of different "things" that users can choose to create and populate on a whim. There may be some more types of "things" that you create in the future. This is an entity that exists, but you have not modeled. That's why you have a database design issue. If you don't want to accurately model what's really going on, you have to resort to a hack like @cdeszaq provided. It will do the trick but it's going to hurt when it comes time to extend. – Pete M Apr 15 '11 at 15:23
I'm confused, what exactly have I done wrong in regards to the rest of the design? I have a blog, so I have a table tblBlogEntry. I have an article, I have a tblArticle. I don't really need anything else. I know I'm making a CMS, I meant in regards to prebuilt ones is all :) – Tom Gullen Apr 15 '11 at 15:33
Forgot to mention, the key concept here is ALL of your "things" share a common feature: They are all discrete instances of a bit of functionality that exist on a specific page within your site. Everything else is arbitrary. "enclose something arbitrary" in this context is a synonym for encapsulation, which is a core tenant of object oriented design. One of the biggest reasons to do this is to make changing higher level stuff possible and painless. I am terribly confused by your objection without examples. – Pete M Apr 15 '11 at 15:34
BlogEntry, Article, Tutorial, Manual, Game, etc all share a common set of features: A specific instance of them exists on a specific page, with a specific set of attributes/content. You haven't modeled this, but now want to avail yourself of this commonality, and it hurts... – Pete M Apr 15 '11 at 15:38
We're going off on a tangent here, but can you give an example of how better I could design a database which holds blog entries and tutorials then? I've been doing it the tblBlogEntry, tblArticle, tblTutorial way for a long time without many scalability problems or headaches. Anything I think of overcomplicates something fundamentally very simple, and attempts to encapsulate future modules where the data structure isn't even known. – Tom Gullen Apr 15 '11 at 15:40
@pete, if you don't know what structure the game will take, how can you say it shares similar attributes/content with the other tables? What sort of attributes are we talking about? There may be one or two, emphasis on *maybe*, and they are such simple elements most of the time it serves me better to simply include them in the table as supposed to introducing new tables to the design. – Tom Gullen Apr 15 '11 at 15:42
That's because you haven't tried to tie them together before. Encapsulating future models so you have a common way of referencing "things on a page" where data structure isn't known is the whole point. That's the benefit of your chosen medium. I'll grab a sample in sec here. – Pete M Apr 15 '11 at 15:42
@Pete, thanks, I am interested in hearing and learning :) I just don't see how it's possible to predict future structures is all when I think they are going to vary a lot. – Tom Gullen Apr 15 '11 at 15:43
Here you go: http://i55.tinypic.com/2r4lyww.jpg It's a quick mock up but should illustrate the idea. ModuleInstance becomes the key tie-in for all of the stuff you use to compose teh site. Module would contain entries for Blog, Article, Comments, etc. In your case, it just provides a common way to deal with "stuff" that has arbitrary data structures. You could view this as a supertype/subtype scenario if you want to think of it as a purely data modeling issue. Things like security, page, etc could be the supertype if you'd like. The subtype is the actual meat of the module. – Pete M Apr 15 '11 at 16:32
@Pete M - I am not sure that "thing" is a proper target of an object model is-a relationship, which seems to be what you are advocating (blog is-a thing, comment is-a thing, etc.). Why would a series of interfaces (ISecurable, ICommentable, IUserCreatable, etc.) not be a better approach? Why burdon pieces with functionality they don't need? It is not clear to me that Modules containing ModuleInstances is general enough for this type of system. For example, comments could be applied to a Blog or a blog entry. They could apply to an Article, an Article Series, a Publication, etc. – Jeffrey L Whitledge Apr 15 '11 at 17:01
Basically, couldn't a Module contain heterogenous content, some of which could, themselves, be modules to any level of nesting? And couldn't some ModuleInstances be instances of multiple Modules? And if so, then how could a common supertype contain enough functionality to make it worthwhile, and yet be general enough to support these scenerios? – Jeffrey L Whitledge Apr 15 '11 at 17:04
The supertype in this case is the attributes required to uniquely identify or secure an instance, no more. A module instance would never be an instance of multiple modules. If we needed to reference other instances for a particular module type (for example, what's going on with comments), we can dip into ModuleInstances again. You make a strong point on comments applying to either a blog or a blog entry. If that was a requirement then I wouldn't recommend this model as there is no common reference point. Each module would need to know how to deal with the comments tables on their own. – Pete M Apr 15 '11 at 17:12
As to the multiple interface question, why would I want to be continually forced to implement those details? Why establish just a contract when we can perform the exact same logic on the exact same data store across the board? I don't think security and placement can be escaped by something that would be a candidate for a module. If there's no security, then "public" gets full access. If it isn't UserCreatable, then we can secure creation rights with a role, or perhaps it isn't a candidate for a "module" in the first place. – Pete M Apr 15 '11 at 17:18
An interface doesn't force the implementation of the details. A security object could be a private member that handles the details of the ISecurable interface, with a minimum of plumbing. This is code sharing through aggregation rather than inheritence, and it solves a lot of the basic problems with inheritence (especially in a single-inheritence language), by allowing the maximum flexibility. – Jeffrey L Whitledge Apr 15 '11 at 18:04
That's exactly what an interface does, by definition. It forces an object to implement the details of the interface. That is its sole purpose for existing. How would you use ISecurable here? I would use a standard membership provider. I don't want you to completely circumvent it. What benefit did you provide in exchange for blowing site-security completely out of the water? – Pete M Apr 15 '11 at 18:33
@Pete, thanks for your time and diagram, I understand the model, but I think it's overly complicated for my requirements, and too difficult to maintain and write code for (lots of joins etc). I think my model is a lot simpler, and easier to maintain and scale, but I guess that's where we differ, but it might just come down to technique maybe. – Tom Gullen Apr 15 '11 at 19:25
No problem. There is a lot of merit in @Jeffrey's solution for sheer simplicity. He and I differ on which solution is more durable/flexible in the long haul, but they're both quite viable. – Pete M Apr 15 '11 at 19:27
@Pete M - I'm afraid I don't know what you are taking about. "I don't want you to completely circumvent it." "blowing site-security completely out of the water" What are you saying? – Jeffrey L Whitledge Apr 15 '11 at 19:38
If you put the onus for security on the objects that are to be secured as opposed to using something like the Membership provider, you provide a way for a class to decide it doesn't want to use the provider at all. Maybe it just decides to return "yes, I'm good to go", even though it shouldn't be displayed to the user according to role based security used everywhere else. It's creating an open hole for each class to decide what "secure" means, which usually doesn't work in the context of a web application. – Pete M Apr 15 '11 at 19:42
Regarding interfaces, we are using the word "details" to mean two different things. But in any event, proper separation of concerns in the object model means that such interfaces would be few and their implementations simple. If an Article or a Tutorial have much functionality in them at all—whether it's farmed out to a private member (as I advocate) or reimplemented independantly to satisfy an interface or whether it's inherited from a supertype—then they are probably doing too much anyway. – Jeffrey L Whitledge Apr 15 '11 at 19:42
I see. When I proposed an ISecurable interface, I imagined it as the type which the security subsystem would accept, and it's members would probably include nothing more than an identity. – Jeffrey L Whitledge Apr 15 '11 at 19:44
If the heavy-lifting of the security system is put into a supertype, then one must be very careful about subtypes overriding some part of the functionality in dangerous ways. But, even worse, an Blog entry shouldn't be doing anything about security. That is not its core compentancy. – Jeffrey L Whitledge Apr 15 '11 at 19:46
Another benefit of interfaces over inheritence is the fact that your security subsystem can define the ISecurable interface itself. If the security system is expecting a Module or ModuleInstance, then it takes a dependancy on those types. That is an inflexible, and mixes concerns, and it is not a good design. – Jeffrey L Whitledge Apr 15 '11 at 19:49
A blog entry wouldn't be a module in the actual object model. Modules are more like containers than base classes. DNN/Sitefinity present some good examples of the architecture I'm (possibly poorly) trying to explain. Every "container" on the page has some shared stuff, who can see it, who can change permissions, etc. Those attributes are assigned to a module instance. – Pete M Apr 15 '11 at 19:52
Think of it like a file system... The files on a file system are WILDLY different in composition and capability, but they can all be "opened" and all have some common attributes managed in a well defined way. They have a location, a name, a set of permissions, etc. We modify those attributes the same no matter what kind of file it is or how we want to use it. – Pete M Apr 15 '11 at 19:53
Also, I would expect whatever is responsible for loading modules to know how to interact with a security provider, not the other way around. That's the basic pattern behind the ASP.NET Membership Provider. Also: Holy smokes, 30 comments! ;o) – Pete M Apr 15 '11 at 19:54
Well, a Blog shouldn't be dealing with things outside of its core competency any more than a blog entry should. I trust that the frameworks that you are advocating have implemented everything correctly for the model that they use. But if the OP wants to distinguish their system with something else—not based on the containers/instance model—then I think that would be pretty cool too. – Jeffrey L Whitledge Apr 15 '11 at 19:58
The file system metaphore is a good illustration. According to the CreateFile method, a "file" is a file, file stream, directory, physical disk, volume, console buffer, tape drive, communications resource, mailslot, or pipe. The implementations of those things would be completely different. If I were to design an OO file system like this, I would not inherit from a File object. The CreateFile method would take no dependancy on any class, and its assembly would publish the IFile interface for each type to use. And those types can be in whatever other inheritence hierarchy they want. – Jeffrey L Whitledge Apr 15 '11 at 20:12

score 0 · Answer 4 · answered Apr 15 '11 at 15:24

It seems that. Your comment system consists of many kind of comment (tblBlogComments, tblTutorialComments..... etc). I would like to suggest you to adopt strategy design pattern.

Let's say. You have a IComment interface. And All kind of comment class implements IComment interface.

interface IComment
{
    int ID {get; set; }
    int Section  {get; set; }
    ....
}

class BlogComment : IComment
{
    ....
}

class TutorialComment : IComment
{
    ....
}

And a WebControl which only knows how to deal with IComment

class WebControl
{
    IComment _comment = null;

    public WebControl(IComment comment)
    {
        _comment = comment;
    }
}

Of course you need a CommentCreater to loads comment data from database and builds the comment object.

public static void main()
{
    var creater = new CommentCreater();
    IComment comment1 = creater.CreateBlogComment()
    WebControl webcontrol = new WebControl(comment1);
    ......

    IComment comment2 = creater.CreateTutorialComment()
    webcontrol = new WebControl(comment2);
    ........
}

That way your web control can just treats all kind of comment in the same way. No matter exactly what kind of comment it is. and you can also just maintain the CommentCreater to build each kind of comment class correctly.

Comment system design

4 Answers4

Linked