
I'm creating a small service that polls around 100 accounts on a Twitter-like service frequently (every 5 seconds or so) to check for new messages, since the service doesn't yet provide a streaming API (as Twitter itself does).

In my head, I have the architecture planned as a Ticker firing every 5 seconds for every user. Once a tick fires, I make an API call to the service to check their messages, run a SELECT against my Postgres database to get that user's details and the date of their most recent message, and, if there are messages newer than that, UPDATE the entry and notify the user. Repeat ad nauseam.
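
Concretely, here's a rough sketch of what I'm picturing (the users table, the lib/pq driver, and the fetchMessages/notifyUser helpers are placeholders I've invented):

    package main

    import (
        "database/sql"
        "log"
        "time"

        _ "github.com/lib/pq" // Postgres driver (an assumption; any driver works)
    )

    // Message is a stand-in for whatever the service's API returns.
    type Message struct {
        ID        int64
        CreatedAt time.Time
    }

    // fetchMessages and notifyUser are placeholders for the real API call
    // and notification mechanism.
    func fetchMessages(userID int64) ([]Message, error) { return nil, nil }
    func notifyUser(userID int64, m Message)            {}

    // pollUser runs in its own goroutine: tick every 5 seconds, fetch the
    // user's messages, compare against the newest date recorded in Postgres,
    // and UPDATE + notify when something newer shows up.
    func pollUser(db *sql.DB, userID int64) {
        ticker := time.NewTicker(5 * time.Second)
        defer ticker.Stop()
        for range ticker.C {
            msgs, err := fetchMessages(userID)
            if err != nil {
                log.Printf("user %d: fetch failed: %v", userID, err)
                continue
            }
            var lastSeen time.Time
            if err := db.QueryRow(
                `SELECT last_message_at FROM users WHERE id = $1`, userID,
            ).Scan(&lastSeen); err != nil {
                log.Printf("user %d: select failed: %v", userID, err)
                continue
            }
            for _, m := range msgs {
                if m.CreatedAt.After(lastSeen) {
                    if _, err := db.Exec(
                        `UPDATE users SET last_message_at = $1 WHERE id = $2`,
                        m.CreatedAt, userID,
                    ); err != nil {
                        log.Printf("user %d: update failed: %v", userID, err)
                        continue
                    }
                    notifyUser(userID, m)
                }
            }
        }
    }

    func main() {
        db, err := sql.Open("postgres", "postgres://localhost/mydb?sslmode=disable")
        if err != nil {
            log.Fatal(err)
        }
        for id := int64(1); id <= 100; id++ {
            go pollUser(db, id)
        }
        select {} // block forever while the pollers run
    }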

I'm not very experienced in backend things and architecture, so I want to make sure this isn't an absolutely absurd setup. Is the amount of calls to the database sensible? Am I abusing goroutines?

Doug Smith

3 Answers

1

Let me answer based on what you describe.

I want to make sure this isn't an absolutely absurd setup.

I understand the following: for each user, one goroutine produces a tick every 5 seconds, and another goroutine consumes those ticks, performing the polling and comparing the date of the last message with the date you have recorded in your PostgreSQL database.
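
In other words, something along these lines (a minimal sketch; poll stands in for the API call and date comparison you described):

    package main

    import "time"

    // poll stands in for the actual work: call the service's API for this
    // user and compare message dates against the PostgreSQL record.
    func poll(userID int64) {}

    func main() {
        userIDs := []int64{1, 2, 3} // your ~100 users

        ticks := make(chan int64)

        // Producer goroutines: one ticker per user, firing every 5 seconds.
        for _, id := range userIDs {
            go func(id int64) {
                t := time.NewTicker(5 * time.Second)
                defer t.Stop()
                for range t.C {
                    ticks <- id
                }
            }(id)
        }

        // Consumer: performs the polling work for each tick it receives.
        for id := range ticks {
            poll(id)
        }
    }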

The answer is: it depends. How many users do you have, and how many can your application support? In my experience, the best way to answer this question is to measure the performance of your application.

Is the amount of calls to the database sensible?

It depends. To give you some reassurance, I have seen a single PostgreSQL database handle hundreds of SELECTs per second. I don't see a design mistake, so benchmarking your application is the way to go.
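
For example, a crude way to measure the SELECT on its own is Go's built-in benchmarking; the connection string and schema below are assumptions, and the file must end in _test.go:

    package main

    import (
        "database/sql"
        "testing"
        "time"

        _ "github.com/lib/pq" // or any other PostgreSQL driver
    )

    // BenchmarkSelect measures the latency of the per-user lookup in
    // isolation. Run with: go test -bench=Select
    func BenchmarkSelect(b *testing.B) {
        db, err := sql.Open("postgres", "postgres://localhost/mydb?sslmode=disable")
        if err != nil {
            b.Fatal(err)
        }
        defer db.Close()

        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            var lastSeen time.Time
            if err := db.QueryRow(
                `SELECT last_message_at FROM users WHERE id = $1`,
                int64(i%100)+1, // cycle through the ~100 users
            ).Scan(&lastSeen); err != nil {
                b.Fatal(err)
            }
        }
    }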

Am I abusing goroutines?

Do you mean like executing too many of them? I think it is unlikely that you are abusing goroutines that way. If there is a particular reason you think this could be the case, posting the corresponding code snippet could make your question more precise.

mrrusof
  • Okay, phenomenal, thank you. I'll actually go finish building it and if I encounter any performance woes I'll post accordingly. Just wanted to make sure that I wasn't making any incredibly foolish mistakes. – Doug Smith Apr 24 '16 at 14:50
  • Regarding your comment on Clément's answer, 'So there's not anything that I said that makes you wince in horror though?' I do not see anything like that. What I do not see is a stated reason for applying goroutines. The code in goroutines will not execute in parallel unless your machine can do so and you tell Go to execute goroutines in parallel. Even if goroutines do not execute in parallel, executing them concurrently opens the possibility of waiting on multiple SQL queries at any given time. Sorry, I cannot comment on other answers, I don't have enough points yet. – mrrusof Apr 24 '16 at 17:06
  • For more on concurrency vs parallelism in Go, I suggest reading the material indicated by https://blog.golang.org/concurrency-is-not-parallelism – mrrusof Apr 29 '16 at 18:05
  • Do not disregard Clément's or Icza's answers. They touch complementary aspects of Doug's problem. – mrrusof Apr 29 '16 at 18:07
1
  • Is your architecture the most efficient way to go? No.
  • Should you do something about it now? No, you should test your solution.

You can always go deeper with optimisations. In your case you need client throughput, so you can use a number of well-known optimisations: switching to a reactive model, adding a cache server, spreading the load over multiple DB read replicas, and so on.

You should test your solution at scale: if it fits your needs in terms of user throughput and server cost, then your solution is the right one.

Clément Prévost
  • Sounds good! So there's not anything that I said that makes you wince in horror though? – Doug Smith Apr 24 '16 at 16:24
  • Nope, interval polling is a totally valid option for a starting project IMHO. It's simple to set up and maintain. Also, as @mrrusof said, SQL databases are good at handling many concurrent read/write requests, and golang can withstand a LOT of goroutines. – Clément Prévost Apr 24 '16 at 16:31
1

Your proposed solution: 1 query every 5 seconds for every user. With 100 users this is:

1 * 100 / 5 seconds = 20 queries / second

This is not considered a big load if the queries are fast.

But why do you need to do this for every user separately? If you need to pick up updates at a granularity of 5 seconds, you could just execute 1 query every 5 seconds which does not filter by user but checks for updates from all users.
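
For example, a sketch of such a query, assuming a users table with a last_message_at column (adjust to your actual schema):

    package poller

    import "database/sql"

    // checkUpdates runs once per 5-second tick and returns the users that
    // had updates since the previous tick, using a single query for all users.
    func checkUpdates(db *sql.DB) ([]int64, error) {
        rows, err := db.Query(
            `SELECT id FROM users
             WHERE last_message_at > now() - interval '5 seconds'`)
        if err != nil {
            return nil, err
        }
        defer rows.Close()

        var updated []int64
        for rows.Next() {
            var id int64
            if err := rows.Scan(&id); err != nil {
                return nil, err
            }
            updated = append(updated, id)
        }
        return updated, rows.Err()
    }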

If the above query gives results, you can iterate over them and do the necessary work for each user that had updates in the last 5 seconds. This results in:

1 query / 5 seconds = 0.2 query / second

Which is a hundred times fewer queries, while still getting you all the updates at the same time granularity.

If the task to be performed for the updates is long or depends on external systems (e.g. a call to another server), you may perform those tasks in separate goroutines. You may either launch a new goroutine for each task, or keep a pool of worker goroutines that consume queued tasks, queuing each task via a channel.
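
A minimal sketch of the worker-pool variant (the pool size and task type are just examples):

    package main

    import "sync"

    // task describes the work to perform for one user that had updates.
    type task struct {
        userID int64
    }

    // handleUpdate is a placeholder for the long-running part
    // (e.g. a call to another server).
    func handleUpdate(t task) {}

    func main() {
        tasks := make(chan task, 100)

        // Fixed pool of worker goroutines consuming queued tasks.
        var wg sync.WaitGroup
        for i := 0; i < 10; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for t := range tasks {
                    handleUpdate(t)
                }
            }()
        }

        // Producer side: whenever the 5-second query finds users with
        // updates, queue one task per user.
        for _, id := range []int64{1, 7, 42} {
            tasks <- task{userID: id}
        }

        close(tasks) // in the real service the producer would loop forever
        wg.Wait()
    }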

icza