39

For a website having users. Each user having the ability to create any amount of, we'll call it "posts":

Efficiency-wise - is it better to create one table for all of the posts, saving the user-id of the user which created the post, for each post - OR creating a different separate table for each user and putting there just the posts created by that user?

Yuval A.
  • 5,849
  • 11
  • 51
  • 63
  • 6
    Search this site with the terms `[database] "table for each"` for a long list of different takes on this questions. – Mat Sep 25 '11 at 08:59
  • 1
    if the issue is getting the posts for a given user quickly, use an index on `posts(user_id)` or the like. if you have an schema and some sample queries that you need to run quickly, it would be better to tell us that schema and those queries and ask us what indexes should exist. – Dan D. Sep 25 '11 at 09:05

6 Answers6

48

The database layout should not change when you add more data to it, so the user data should definitely be in one table.

Also:

  • Having multiple tables means that you have to create queries dynamically.

  • The cached query plan for one table won't be used for any other of the tables.

  • Having a lot of data in one table doesn't affect performance much, but having a lot of tables does.

  • If you want to add an index to the table to make queries faster, it's a lot easier to do on a single table.

Guffa
  • 687,336
  • 108
  • 737
  • 1,005
13

Well to answer the specific question: In terms of efficiency of querying, it will always be better to have small tables, hence a table per user is likely to be the most efficient.

However, unless you have a lot of posts and users, this is not likely to matter. Even with millions of rows, you will get good performance with a well-placed index.

I would strongly advise against the table-per-user strategy, because it adds a lot of complexity to your solution. How would you query when you need to find, say, users that have posted on a subject within the year ?

Optimize when you need to. Not because you think/are afraid something will be slow. (And even if you need to optimize, there will be easier options than table-per-user)

driis
  • 161,458
  • 45
  • 265
  • 341
  • 2
    I disagree with "always" - give me a count of all posts per user. Writing a UNION query that will aggregate that for you will not be fun, nor will it be efficient. – Aaron Bertrand Oct 11 '14 at 21:16
  • You could use a View for this – malhal Jan 10 '20 at 23:09
  • 1
    "Optimize when you need to. Not because you think/are afraid something will be slow." loved it! – OhhhThatVarun Aug 29 '20 at 11:02
  • Is there a such thing as over optimizing? I'd think optimizing when things are small would be a good idea, instead of waiting for things to get large and then decide you need to optimize and have to refactor a lot of code/queries. I think starting out with an optimized approach would be a better idea. – c0dezer019 May 21 '22 at 03:27
  • @c0dezer019, there is such a thing as *premature* optimization. Most people don't need a shed for their bicycles, and they won't get the remodeling of their house done if they're busy [bike-shedding](https://en.wikipedia.org/wiki/Law_of_triviality). Always measure first and go after proven bottlenecks. – MeetTitan Feb 09 '23 at 19:08
8

Schemas with a varying number of tables are, generally, bad. Use one single table for your posts.

Mat
  • 202,337
  • 40
  • 393
  • 406
5

If performance is a concern, you should learn about database indexes. While indexes is not part of the SQL standard, nearly all databases support them to help improve performance.

I recommend that you create a single table for all users' posts and then add an indexes to this table to improve the performance of searching. For example you can add an index on the user column so that you can quickly find all posts for a given user. You may also want to consider adding other indexes, depending on your application's requirements.

Mark Byers
  • 811,555
  • 193
  • 1,581
  • 1,452
4

Your first proposal of having a single user and a single post table is the standard approach to take.

At the moment posts may be the only user-specific feature on your site, but imagine that it might need to grow in the future to support users having messages, preferences, etc. Now your separate table-per-user approach leads to an explosion in the number of tables you'd need to create.

avik
  • 2,708
  • 17
  • 20
0

I have a similar but different issue with your answer because both @guffa and @driis are assuming that the "posts" need to be shared among users.

In my particular situation: not a single user datapoint can be shared for privacy reason with any other user not even for analytics.

We plan on using mysql or postgres and here are the three options our team is warring about:

N schema and 5 tables - some of our devs feel that this is the best direction to make to keep the data completely segregated. Pros - less complexity if you think of schema as a folder and tables as files. We'll have one schema per user Cons - most ORMs do connection pooling per schema

1 schema and nx5 tables - some devs like this because it allows for connection pooling but appears to make the issue more complex. Pros - connection pooling in the ORM is possible Cons - cannot find an ORM where Models are set up for this

1 schema and 5 tables - some devs like this because they think we benefit from caching.

Pros: ORMs are happy because this is what they are designed to do Cons: every query requires the username table

I, personally, land in camp 1: n schemas. My lead dev lands in camp 3: 1 schema 5 tables.

Caching: If data is always 1:1, I cannot see how caching will ever help regardless of the solution we use because each user will be searching for different info.

Any thoughts?