107

I'm building a chat app and I want a full history off all messages ever sent in the chat conversation. At the moment I am storing each message as a single row in a table called 'messages'. I am aware that this table could grow huge as even small messages like 'Hi' would have their own database record.

Can anyone recommend a more scalable mysql solution? I don't require the individual messages to be searchable, editable or deletable. Could the whole conversation be stored in one huge field?

Would love to hear your ideas!

wilsonpage
  • 17,341
  • 23
  • 103
  • 147
  • 14
    if these messages don't need to be searchable or editable, there is no point to keep inside the database – ajreal Aug 15 '11 at 08:54
  • 35
    I would advise to start easy, think simple, use a relational database, and if scaling becomes an issue, deal with it! Too many people care about scenarios which will never occur because they spend too much time building the perfect infrastructure and they will not have the time to focus on what's important. – whirlwin Sep 09 '16 at 18:39

3 Answers3

62

There's nothing wrong with saving the whole history in the database, they are prepared for that kind of tasks.

Actually you can find here in Stack Overflow a link to an example schema for a chat: example

If you are still worried for the size, you could apply some optimizations to group messages, like adding a buffer to your application that you only push after some time (like 1 minute or so); that way you would avoid having only 1 line messages

Community
  • 1
  • 1
jasalguero
  • 4,142
  • 2
  • 31
  • 52
5

If we assume that you do not read the data too.

This sounds to me like an audit\logging requirement, if it is, you do not need a database to store the chat messages.

Just append the conversation to a text file (1 file per day?). The file could look like this:

chat-1-bob 201101011029, hi
chat-1-jen 201101011030, how are you?
chat-1-bob 201101011030, fine thanks.    
chat-1-jen 201101011035, have you spoken to bill recently?    
chat-2-bob 201101021200, hi
chat-2-bill 201101021201, Hey Bob,
chat-2-bill 201101021203, what time do you call this?
chat-2-bob 201101021222, about 12:22

I think you will find it hard to get a more simple scaleable audit solution.

If your requirements change and you need to search\edit\delete then a database would be more appropriate.

Kevin Burton
  • 11,676
  • 2
  • 24
  • 37
  • 2
    This sounds brilliant. Can anyone counter this argument? – Trip Jan 22 '16 at 22:14
  • 135
    Writing to a file is a terrible idea. In most server-side environments or cluster you wouldn't even be to guarantee your 2nd request even ended up on the same server as the file. Writing the file system is extremely slow and I/O bound. Sorry, I can't believe this got so many up votes. – Andy Fusniak Jun 20 '17 at 22:16
  • 10
    Writing and reading to a file is resource intensive. I think using a database of any kind should help reduce that resource delay. At the end of the day, databases store those information to file as well (just a bit differently). I think the given idea is fantastic to store archived chats, or chats older than 1 year or so. But nothing beats a simple db here. – Jay Patel - PayPal May 14 '18 at 17:43
  • 1
    The OP's vague requirement was to store ' a full history of all messages ever sent'.... not 'searchable, editable or deletable' this still sounds like simple logging\audit to me. It sounds like you are making assumptions as to how the chat app works, you have invented a need to reduce the reads by indexing the messages so you can justify a database, you can see why such vague questions get closed ;) – Kevin Burton May 15 '18 at 10:38
  • 6
    OP clearly says in database, apart from this terrible idea, this not answer the question – Lyoneel Dec 28 '18 at 17:35
  • In the first line I do show i understand the OP's question. Your response is just an opinion, and does not help anyone, unless you can back it up with a well reasoned argument. If you can clearly explain why it is so terrible I will remove (or improve). – Kevin Burton Dec 31 '18 at 10:23
  • If you grow up in the early days of learning coding, the first thing you would have learned that fine operations are expensive and using them for chat messages is the worst idea anyone can give to you. Never use files for storing chat messages. Storing them in relational database is the ultimate solution. It will grow up. I started a chatting app and it has grown too big with over million messages. MySQL is still holding up so no worries for me atm. You would think of optimizing database once it starts growing. – Dan Jul 01 '20 at 00:29
  • This answer does not scale well. If it was a good solution, then we wouldn't need databases for anything. – EzPizza Jun 28 '21 at 18:42
  • if the requirement is for simple audit and logging, the file system is an appropriate solution. Logging via the file system is standard practice. – Kevin Burton Sep 22 '21 at 10:17
  • Quick reminder to flaggers LQA reviewers: answers providing really bad or even wrong advice do no warrant a "very low quality" flag or deletion on their own - [From Review](https://stackoverflow.com/review/low-quality-posts/34149090) – Michael M. Apr 03 '23 at 00:05
4

You could create a database for x conversations which contains all messages of these conversations. This would allow you to add a new Database (or server) each time x exceeds. X is the number conversations your infrastructure supports (depending on your hardware,...).

The problem is still, that there may be big conversations (with a lot of messages) on the same database. e.g. you have database A and database B an each stores e.g. 1000 conversations. It my be possible that there are far more "big" conversations on server A than on server B (since this is user created content). You could add a "master" database that contains a lookup, on which database/server the single conversations can be found (or you have a schema to assign a database from hash/modulo or something).

Maybe you can find real world architectures that deal with the same problems (you may not be the first one), and that have already been solved.

Bernhard Kircher
  • 4,132
  • 3
  • 32
  • 38