16 votes

Any other developers also strongly resistant to adding secondary data stores to their software?

I'm currently building an MVP for a startup, solo. We've got Postgres pulling triple duty: the go-to database for all normal relational data, a vector database with pgvector, and a job queue (with the magic of SELECT ... FROM "Jobs" WHERE ... FOR UPDATE SKIP LOCKED LIMIT 1). Every time I go out looking for solutions to problems, it feels like the world really wants me to get a dedicated vector store or to use Redis as a job queue.
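For reference, the full loop around that query looks something like the sketch below. Postgres isn't available in a forum post, so sqlite3 stands in: sqlite has no SKIP LOCKED, so the claim becomes a guarded UPDATE, with the Postgres form shown in a comment. The Jobs schema and payloads here are invented for illustration.

```python
import sqlite3

# In Postgres the claim is one statement inside a transaction:
#   SELECT id, payload FROM "Jobs" WHERE status = 'queued'
#   ORDER BY id FOR UPDATE SKIP LOCKED LIMIT 1;
# sqlite3 lacks SKIP LOCKED, so this sketch claims with a guarded UPDATE
# that only succeeds if no other worker got there first.

def claim_job(db):
    """Claim the oldest queued job; returns (id, payload) or None."""
    row = db.execute(
        "SELECT id, payload FROM Jobs WHERE status = 'queued' "
        "ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return None
    claimed = db.execute(
        "UPDATE Jobs SET status = 'running' WHERE id = ? AND status = 'queued'",
        (row[0],))
    db.commit()
    if claimed.rowcount == 0:  # lost the race to another worker; retry
        return claim_job(db)
    return row

def complete_job(db, job_id):
    db.execute("UPDATE Jobs SET status = 'done' WHERE id = ?", (job_id,))
    db.commit()

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Jobs (id INTEGER PRIMARY KEY, payload TEXT, status TEXT)")
db.executemany("INSERT INTO Jobs (payload, status) VALUES (?, 'queued')",
               [("send-welcome-email",), ("generate-embeddings",)])
```

Workers just call `claim_job` in a polling loop; failed jobs can be reset to 'queued' for retry.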

Back when I was a Rails developer a good majority of the ActiveJob implementers used Redis. Now that I'm doing NodeJS the go-to is Bull which can only serialize jobs to Redis. They back this with claims that I can scale to thousands of jobs per second! I have to assume this theoretical throughput benefit from using Redis is utilized by 0.01% of apps running Bull.

So I ended up implementing a very simple system. Bull wouldn't have been a good fit anyway as we have both Python and Typescript async workers, so a simple system that I fully understand is more useful at the moment. I'm curious who else shares my philosophy.

Edit: I'll try to remember to update everyone in a year with the real world consequences of my design choices.

34 comments

  1. [14]
    devilized
    Link
    I'm a big fan of using the right tools for the right job as opposed to shoehorning something into a place that it doesn't belong. The reason that Redis is used for job queueing so often is because...

    I'm a big fan of using the right tools for the right job as opposed to shoehorning something into a place that it doesn't belong. The reason that Redis is used for job queueing so often is because it's good at it. It has sorted sets with blocking retrieval built in.
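    That sorted-set-plus-blocking-pop combination is easy to picture without a Redis server handy. A rough plain-Python stand-in (ZADD and BZPOPMIN are real Redis commands; the class itself is just an illustration of their semantics):

```python
import heapq
import threading

class TinySortedQueue:
    """Plain-Python sketch of the Redis pattern: a sorted set ordered by
    score (ZADD) plus a blocking pop of the lowest-scored member (BZPOPMIN)."""

    def __init__(self):
        self._heap = []                # (score, member), lowest score first
        self._cond = threading.Condition()

    def add(self, score, member):      # roughly ZADD queue score member
        with self._cond:
            heapq.heappush(self._heap, (score, member))
            self._cond.notify()

    def bpop(self, timeout=None):      # roughly BZPOPMIN queue [timeout]
        with self._cond:
            while not self._heap:
                if not self._cond.wait(timeout):
                    return None        # timed out with nothing queued
            return heapq.heappop(self._heap)

q = TinySortedQueue()
q.add(2.0, "low-priority")
q.add(1.0, "high-priority")
```

Scores double as priorities or run-at timestamps, which is why the structure maps so naturally onto job queueing.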

    Postgres is a relational database. It's good for storing and retrieving a variety of data that is at least somewhat relational and/or structured (you can certainly use JSON fields to store unstructured data as well). It is not all that effective as a job queue, especially when you consider that it's usually set up with a single master node, so it often becomes a performance bottleneck if you're using it for everything, including functionality it isn't optimized for.

    At the end of the day, I have too much on my plate to be writing my own job queueing system. I can deliver more value from my projects by concentrating on properly implementing the product-differentiating features and leaving the boilerplate stuff to well-established components.

    16 votes
    1. [5]
      FluffyKittens
      (edited )
      Link Parent
      Postgres can pretty easily handle tens of thousands of writes per second. If a job queue is only ever going to require a few hundred* writes per second max, Postgres is a pretty great tool for...

      Postgres can pretty easily handle tens of thousands of writes per second. If a job queue is only ever going to require a few hundred* writes per second max, Postgres is a pretty great tool for that.

      If you truly require a heavyweight job queue, sure, jump for redis - but tea’s point is spot on: most job queues aren’t bottlenecks and will never be heavy enough to compete for IO.

      If you’ve done it before, writing a postgres queue as described takes <15 minutes - which generally is much less than the time needed to install and config redis or the like.

      17 votes
      1. teaearlgraycold
        Link Parent
        I'm doing it for the first time and it's just a few hours. I'm currently somewhat blocked on mission critical aspects of the app so I have extra time available anyway. I think we should always...

        I'm doing it for the first time and it's just a few hours. I'm currently somewhat blocked on mission critical aspects of the app so I have extra time available anyway.

        I think we should always have time to spare to at least evaluate ideas. Only using the default solution will lead you down the wrong path some of the time. If it will take 3-4 hours to evaluate the right solution I'm going to spend that time now rather than hate my decision for years to come.

        Edit:

        “If you aren’t sure which way to do something, do it both ways and see which works better.”
        — John Carmack

        5 votes
      2. bkimmel
        Link Parent
        After a couple decades of going through various tools and ecosystems for backend/data I've more or less settled on the mantra of "use Postgres until you can't".

        After a couple decades of going through various tools and ecosystems for backend/data I've more or less settled on the mantra of "use Postgres until you can't".

        5 votes
      3. [2]
        devilized
        Link Parent
        In a queueing system, it's more than just IO. Each queue worker is likely using its own connection as well. It's not just about 'writes per second', it's the fact that Postgres is transaction...

        In a queueing system, it's more than just IO. Each queue worker is likely using its own connection as well. It's not just about 'writes per second', it's the fact that Postgres is transaction based and there is a cost to each and every commit, especially if your queue is jammed into a single table.

        I guess I don't see Redis as a big deal to set up since I end up needing to include it into almost every project I do whether or not the project demands queueing. Redis is used for more than just queueing, it's great for caching as well. It's especially nice for things like autocompletion where doing a "select where like" on every keystroke can get expensive.
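        The autocomplete comparison deserves a sketch. Redis typically answers prefix queries with ZRANGEBYLEX on a sorted set; the same idea in plain Python is a binary search over a sorted list, so each keystroke costs O(log n) rather than a LIKE scan (the city list is invented):

```python
import bisect

# Redis answers prefix queries with ZRANGEBYLEX on a sorted set, e.g.
#   ZRANGEBYLEX cities "[port" "[port\xff"
# The same idea in plain Python: binary search on a sorted list.
# Assumes completions are ASCII/low-Unicode so "\xff" works as an
# upper bound on the prefix range.

cities = sorted(["paris", "portland", "porto", "portsmouth", "prague"])

def complete(prefix, limit=5):
    lo = bisect.bisect_left(cities, prefix)
    hi = bisect.bisect_right(cities, prefix + "\xff")
    return cities[lo:hi][:limit]
```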

        Obviously there's a lot of project scope and personal preference that goes into decisions like this. I work for a large corp as opposed to startup, so most of my projects end up needing to be global-scale and Postgres often becomes a bottleneck.

        4 votes
        1. FluffyKittens
          Link Parent
          Yeah, it’s all a matter of magnitude. I do startup work where each client org has their own dedicated hardware and never more than a few dozen concurrent users, max. In context, table-based queues...

          Yeah, it’s all a matter of magnitude.

          I do startup work where each client org has their own dedicated hardware and never more than a few dozen concurrent users, max. In context, table-based queues and caching with unlogged tables are a no-brainer that doesn’t move the needle on resource usage. Transactions on those DBOs are basically never competing for locks and have near-zero MVCC penalty.

          Redis is absolutely the right call once you’re hitting a few thousand concurrent users, but I see it cargo-culted in contexts that will never meet that threshold all the time.

          4 votes
    2. [8]
      ewintr
      Link Parent
      As with everything in software engineering, it really depends. I am a big fan of using the right tool for the right job too, but often my conclusion is that it is not worth it to buy a new tool,...

      I'm a big fan of using the right tools for the right job as opposed to shoehorning something into a place that it doesn't belong.

      As with everything in software engineering, it really depends. I am a big fan of using the right tool for the right job too, but often my conclusion is that it is not worth it to buy a new tool just for this bit of functionality. Some developers really underestimate the cost of setting up yet another service, connecting it, and maintaining it, and underestimate how simple writing a few lines of code can be.

      Having multiple data stores means multiple sources of truth. You now have a job queue, congratulations! You now have a whole new range of potential bugs and opportunities for complicated debugging sessions too.

      It could be worth it, of course, if this queuing functionality is important and integral to the system then I am all for it. Otherwise, maybe think again. Comparing myself to my peers, I am more on the frugal side with taking on dependencies. But honestly, I never really regretted that.

      10 votes
      1. [5]
        chundissimo
        Link Parent
        I disagree with the multiple sources of truth unless you’re using it as a cache. That’s a whole other can of worms. If it’s used to back jobs then that’s the only place the truth is stored. Unless...

        I disagree with the multiple sources of truth unless you’re using it as a cache. That’s a whole other can of worms. If it’s used to back jobs then that’s the only place the truth is stored. Unless you’re arguing against using SQL and Redis simultaneously to back jobs, which I agree with.

        Have you set up and managed a Redis instance? I agree with your general mindset, but I feel like it’s being misapplied here. Redis is one of the simplest stores out there.

        Since this system apparently is not mission critical, sure, the implementation doesn’t really matter. Implement a job queue and processor using cron and SMTP, who cares. But at the end of the day it’s not the right tool for the job, and if job processing ever becomes more important or complex I wouldn’t want to be stuck with a homebrew SQL job processor.

        If you’re building an MVP for a startup then time is critical and cutting corners can make sense. But those tradeoffs have to be taken carefully. I’ve spent a lot of time at early startups cleaning up terrible technical decisions made during the MVP phase that anchored the companies during time in which they needed to be fast. I’m not saying this is necessarily that bad of a decision, but just that that line of thinking has risk too.

        4 votes
        1. [3]
          devilized
          Link Parent
          This was a viewpoint I was coming from as well in my original response. I've had to inherit enough crappy custom code where someone just didn't want to introduce a simple out-of-the-box component...

          I’ve spent a lot of time at early startups cleaning up terrible technical decisions made during the MVP phase that anchored the companies during time in which they needed to be fast.

          This was a viewpoint I was coming from as well in my original response. I've had to inherit enough crappy custom code where someone just didn't want to introduce a simple out-of-the-box component and thought that their own implementation was "cool" that I generally try to avoid doing that myself.

          3 votes
          1. [2]
            winther
            Link Parent
            Ugh this made me remember when I worked with someone's framework that implemented a SQL like database with Redis as a backend. It made no sense and the performance was terrible until we rewrote...

            Ugh, this made me remember when I worked with someone's framework that implemented an SQL-like database with Redis as a backend. It made no sense and the performance was terrible until we rewrote everything to use PostgreSQL instead. Conservative, tried-and-tested tech stack decisions win out in most cases.

            4 votes
            1. Micycle_the_Bichael
              Link Parent
              I will never forget trying to do a PoC on docker + k8s back in 2017/2018 and this guy looking at me and saying “why would we ever want to use kubernetes , I can replicate all that using Apache...

              I will never forget trying to do a PoC on docker + k8s back in 2017/2018 and this guy looking at me and saying “why would we ever want to use Kubernetes, I can replicate all that using Apache Zookeeper” and feeling my soul die.

        2. ewintr
          Link Parent
          You are not necessarily storing the jobs in two places, but they will have to agree on some things. If Redis has a job that says 'Foo the Bar' and your SQL database says 'There is no Bar', then...

          I disagree with the multiple sources of truth unless you’re using it as a cache. That’s a whole other can of worms. If it’s used to back jobs then that’s the only place the truth is stored. Unless you’re arguing against using SQL and Redis simultaneously to back jobs, which I agree with.

          You are not necessarily storing the jobs in two places, but they will have to agree on some things. If Redis has a job that says 'Foo the Bar' and your SQL database says 'There is no Bar', then you still have an issue.

          I’ve spent a lot of time at early startups cleaning up terrible technical decisions made during the MVP phase that anchored the companies during time in which they needed to be fast. I’m not saying this is necessarily that bad of a decision, but just that that line of thinking has risk too.

          I agree, but this applies to every decision you can make. You can cut corners if necessary. But if you forget to let the system mature with the rest of the company, you are in for unpleasant surprises.

          1 vote
      2. [2]
        unkz
        Link Parent
        Speaking from experience in writing many job queues in the days before redis and rabbitmq, I can assure you that writing your bespoke job queuing system is not a way to avoid a whole new range of...

        You now have a job queue, congratulations! You now have a whole new range of potential bugs and opportunities for complicated debugging sessions too.

        Speaking from experience in writing many job queues in the days before redis and rabbitmq, I can assure you that writing your bespoke job queuing system is not a way to avoid a whole new range of potential bugs and opportunities for complicated debugging sessions.

        4 votes
        1. ewintr
          Link Parent
          You can't make a statement like that without specifying the scope of the queuing functionality. I have written queues that are as simple as an array and some functions around it. Or a Go channel...

          You can't make a statement like that without specifying the scope of the queuing functionality. I have written queues that are as simple as an array and some functions around it. Or a Go channel and a type. I have also worked, for instance, on a system where RabbitMQ was the main form of communication between over a hundred microservices. You really cannot compare the two.

          2 votes
  2. archevel
    (edited )
    Link
    While there is such a thing as preferring to use the right tool for the job, I don't think it's as simple as "I need a queue so I must use kafka/redis/rabbitmq". There is also the consideration of...

    While there is such a thing as preferring to use the right tool for the job, I don't think it's as simple as "I need a queue so I must use kafka/redis/rabbitmq". There is also the consideration of your competency with various tools. Maybe you know Postgres very well. You already have backups set up for it. You know how to tune it and scale it in case you need to. In comparison, bringing on a new technology has a bunch of costs that are not obvious at first glance. Onboarding new people will be harder. You'll have a more complex setup of your dev environment. The operational cost of managing the infrastructure is higher etc etc. Some of this can of course be mitigated, but complexity will increase (arguably more than when building on an already established tool). In addition you have to consider how reliable you need your queue to be. Does it need at least/at most/exactly once delivery guarantees?

    In the end, my rule of thumb is to not introduce new tech until it is needed. Mostly these types of things can be fairly easily isolated, so swapping it out later isn't such a big issue.

    9 votes
  3. [5]
    em-dash
    Link
    I grumbled a bit adding Bull to my current personal project for the same reasons, but decided to just go with it for now because I also needed Redis for caching. The alternatives I considered were...

    I grumbled a bit adding Bull to my current personal project for the same reasons, but decided to just go with it for now because I also needed Redis for caching. The alternatives I considered were dedicated queueing systems like RabbitMQ, which would be comically heavy-duty for what I want to use it for.

    The difference, perhaps, is that I'm writing a thing targeted toward self-hosting, with little reason to ever run it in an enterprise setting. I feel a bit more desire to reduce dependencies so people can run the thing without learning to sysadmin five other things. In a real company, with people paid to do that, I think it's more okay to depend on external services.

    I actually had not considered just throwing the jobs in a database table. That could work, and would mean I can treat Redis as a completely volatile cache, which is nice. The polling hurts my sensibilities though.

    6 votes
    1. [4]
      admicos
      Link Parent
      Yeah, a lot of "best practices" I see thrown around seem to completely ignore selfhosting and other "non enterprise" use cases in favor of 🚀scaling🚀 where "dev ops" is an entire department of an...

      Yeah, a lot of "best practices" I see thrown around seem to completely ignore selfhosting and other "non enterprise" use cases in favor of 🚀scaling🚀, where "dev ops" is an entire department of an organization dedicated to setting up and maintaining all of these.

      I myself try my hardest to keep external service dependencies as small as possible, which in my case ends up boiling down to "I need to have a really good reason to add anything beyond postgres and redis", with redis strictly being for data that wouldn't be catastrophic to lose / temporary.

      9 votes
      1. [3]
        em-dash
        Link Parent
        In my case, I even skipped postgres and went straight for sqlite for administrative simplicity: the thing I'm making is effectively a fancy domain-specific file store, so I have the config file...

        In my case, I even skipped postgres and went straight for sqlite for administrative simplicity: the thing I'm making is effectively a fancy domain-specific file store, so I have the config file point at a directory and just store both the files and the metadata database in there. You can back up the whole thing at once with anything that backs up directories of files, and it'll be internally consistent.
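        A minimal sketch of that layout, with invented names (`files/`, `meta.db`): the blobs and the metadata database live under one root, so anything that backs up the directory captures both.

```python
import sqlite3
import tempfile
from pathlib import Path

# One directory holds both the raw files and a sqlite metadata DB,
# so a plain directory backup captures everything at once.

def open_store(root):
    root = Path(root)
    (root / "files").mkdir(parents=True, exist_ok=True)
    db = sqlite3.connect(root / "meta.db")
    db.execute("""CREATE TABLE IF NOT EXISTS files
                  (name TEXT PRIMARY KEY, size INTEGER, tag TEXT)""")
    return root, db

def put(root, db, name, data, tag=""):
    (root / "files" / name).write_bytes(data)
    db.execute("INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
               (name, len(data), tag))
    db.commit()

root, db = open_store(tempfile.mkdtemp())
put(root, db, "report.pdf", b"%PDF-...", tag="reports")
```

Note that file writes and metadata inserts aren't atomic together; the metadata row should be treated as the source of truth for what exists.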

        Now I'm wondering if I even need sqlite, or should just store a metadata.txt next to each file and do searching by spawning ripgrep processes.

        2 votes
        1. [2]
          teaearlgraycold
          Link Parent
          Why even have files? You can store BLOBs in sqlite

          Why even have files? You can store BLOBs in sqlite

          2 votes
          1. em-dash
            Link Parent
            Because when you download the (potentially large) file I can just sendfile() it :) (the thought did occur to me though)

            Because when you download the (potentially large) file I can just sendfile() it :)

            (the thought did occur to me though)

  4. [7]
    chundissimo
    Link
    It’s a good instinct to hesitate to add complexity to your system. If your SQL solution works and you like it then by all means go for it, but personally I’ll take a Redis backed async job...

    It’s a good instinct to hesitate to add complexity to your system. If your SQL solution works and you like it then by all means go for it, but personally I’ll take a Redis backed async job processor any day. I find whenever I start adding async jobs to an app, my usage and complexity tend to increase over time. It’s nice to have a battle tested paradigm you know will be up to the challenge (even if it stays well below its maximum capabilities). Besides, spinning up a secondary data store like Redis and wiring it into your app is trivial, and depending on your provider so are the costs.

    On your point of needing to enqueue workers from different runtimes, that’s actually not a big hurdle in most cases. You just have to figure out the internal job structure the job processor uses. I implemented this for Sidekiq to work in Python and Rails and I’ve had no issues with it.

    My main gripe with your philosophy is it can lead to reinventing the wheel for no good reason. Sure you should avoid “default” solutions and be careful about bringing in complexity, but you should also strive to recognize when more robust software exists for your use case. Maybe in your particular use case here rolling your own job system makes sense, but I have a hard time rationalizing it as a good short or long term decision without knowing the specifics.

    5 votes
    1. [6]
      teaearlgraycold
      Link Parent
      I think the “battle tested” benefits for something so simple are outweighed by the issue of having two sources of truth, needing two backups, etc. etc. Going from 1 to 2 is when you lose all of...

      I think the “battle tested” benefits for something so simple are outweighed by the issue of having two sources of truth, needing two backups, etc. etc. Going from 1 to 2 is when you lose all of the elegant benefits and outright cheats you can implement with just one of something.

      5 votes
      1. [5]
        chundissimo
        Link Parent
        It’s not two sources of truth unless you’re storing the same thing in both, which you shouldn’t be. Backups should be handled by your hosting provider. Why use Postgres at all? Just write to a...

        It’s not two sources of truth unless you’re storing the same thing in both, which you shouldn’t be. Backups should be handled by your hosting provider.

        Why use Postgres at all? Just write to a file and roll your own DB, it can be trivially implemented. Of course, a job processor is easier to implement, but my point is that it’s really not elegant and I guarantee if this startup ever gets traction it will be replaced. Which is fine, but I’m confused as to why this path is attractive when it’s barely easier than installing something like Sidekiq (or whatever library) and spinning up a Redis instance. It’s not Kubernetes.

        2 votes
        1. [3]
          teaearlgraycold
          Link Parent
          Imagine there's a critical issue with Postgres. No problem - you've got PITR set up with your hosting provider. You roll back. But now your Redis, which stores persisted job data, is out of sync...

          Imagine there's a critical issue with Postgres. No problem - you've got PITR set up with your hosting provider. You roll back. But now your Redis, which stores persisted job data, is out of sync with Postgres. Awkward to deal with. The same difficulty could happen in the reverse scenario.

          Because job data is derived from your relational data it duplicates some information. So maybe not quite a second source of truth, but a source of inconsistencies.

          Also funny you should mention kubernetes. We ended up needing to use k8s for this project due to the immature PaaS solutions available for GPU compute.

          2 votes
          1. [2]
            chundissimo
            Link Parent
            Okay maybe this is the point I was missing. Why would it need to persist any job data other than briefly while the job is queued. Why would it need to be “in sync” with Postgres? These are aspects...

            which stores persisted job data

            Okay, maybe this is the point I was missing. Why would it need to persist any job data other than briefly while the job is queued? Why would it need to be “in sync” with Postgres? These are aspects I wouldn’t consider to be part of a Redis-backed job system. If it’s just processing stats then you can deal with that, but otherwise I’m skeptical of what’s going on here.

            because job data is derived … it duplicates

            Does it really though? It should be approached as a view of the system at that point in time. It shouldn’t be persisted long term in Redis.

            The fact that you’re balking at Redis while happily incorporating the infamous complexity and operational overhead of Kubernetes is baffling to me. Maybe both decisions make sense due to the specific problem domain, but I’m at a loss for how to persuade you here.

            1 vote
            1. teaearlgraycold
              Link Parent
              Oh I'm not happy to include it. I am at least using managed kubernetes (EKS), and EKS tooling (eksctl) to make it as easy on me as possible. If our primary host added GPU instances I'd drop EKS...

              while happily incorporating the infamous complexity and operational overhead of Kubernetes

              Oh I'm not happy to include it. I am at least using managed kubernetes (EKS), and EKS tooling (eksctl) to make it as easy on me as possible. If our primary host added GPU instances I'd drop EKS right away (side note - all of the instances across EKS and render.com are in the same AZ which is nice). So far k8s has been pretty nice actually. I just needed to learn how to design a service for running in that environment. Beyond that I hope to need very little insight into actually running a cluster. I suppose much of the famed k8s complexity comes from both building your own cluster from scratch and designing a microservice-first application which heavily depends on it. Our app is monolithic with the exception of functionality that needs to run on GPUs.

              Our async jobs are a bit longer lived than normal. I try to break up jobs into smaller chunks, but a job could still theoretically get queued for many hours and take a few hours to execute. I do enjoy knowing that all of the jobs' writes are idempotent and fully synchronized with the app db.

              Do you have experience with k8s? I'd love to get some of your wisdom on it if so.

        2. FluffyKittens
          Link Parent
          Siloing can be problematic, no duplication required. Here’s a concrete example: say you store your embeddings in a dedicated vector DB instead of postgres. Later on, you want to set up a batch job to...

          Siloing can be problematic, no duplication required.

          Here’s a concrete example: say you store your embeddings in a dedicated vector DB instead of postgres. Later on, you want to set up a batch job to capture some summary stats on those embeddings and serve them somewhere.

          If it were all in postgres, you could schedule a stored procedure to refresh a matview and you’re done. No application code required, just DDL.

          If you’ve got the embeddings siloed though, you’ve gotta configure a worker to run your summary stats on the vector DB and then move them over to your primary DB. You don’t have robust integrity constraints like a foreign key tying the embeddings to the user table, so you’ve gotta worry about handling orphan records. You’re gonna be passing the data through a round of serialization and deserialization using drivers that aren’t necessarily consistent with one another, opening yourself to all sorts of subtle and funky bugs in the vein of null handling, text-encoding, and case-sensitivity. Much more can go wrong.

          2 votes
  5. winther
    Link
    It depends on where in the product lifecycle you are. If it is a startup with a very small team that still needs to get a functioning MVP out the door to test their business plan, then now is...

    It depends on where in the product lifecycle you are. If it is a startup with a very small team that still needs to get a functioning MVP out the door to test their business plan, then now is probably not the time to add complexity to the tech stack. However, the risk is that if the business actually gets successful, it can become much harder to convince the decision makers that the engineers need time to refactor and build the MVP platform for better scalability and stability.

    Where I work we have over more than a decade transitioned from SQL based queue, to Redis and now to RabbitMQ. But it probably wouldn't have been a good idea to try and build our current setup from the beginning. And I would argue many applications don't need the scalability they think they need, and end up over-engineering prematurely. So your approach is totally sensible, as long as one is aware of potential future pitfalls and need for refactoring if the business actually scales up.

    5 votes
  6. shrike
    Link
    Adding more moving parts adds complexity, complexity adds to the risk of something going wrong. My usual projects start with Sqlite or Redis, depending on the type of data I want to store. Then I...

    Adding more moving parts adds complexity, complexity adds to the risk of something going wrong.

    My usual projects start with Sqlite or Redis, depending on the type of data I want to store.

    Then I upgrade Sqlite to Postgres when I need queries that are more complex than Sqlite can handle. I've yet to find the limits of Redis performance-wise, but at some point it's easier to just use a relational database rather than try to force your data into the Redis key-value format with weird key-naming tricks.

    4 votes
  7. blitz
    Link
    The huge benefit of using a single database for async queues is that you get transactional consistency without putting in any extra effort. If you've got data in your SQL database and jobs in your...

    The huge benefit of using a single database for async queues is that you get transactional consistency without putting in any extra effort. If you've got data in your SQL database and jobs in your MQ, you've got to think about what happens if you take a task but the transaction to update the data fails in SQL. What happens if there's an error in the task processing after you've committed your SQL changes but before the task finishes? You're essentially running a distributed database system, and you need to develop some sort of two-phase commit protocol to keep everything in sync.

    If you use your SQL database as a message queue, you commit the data that is changed by the task AND the fact that the task was completed in a single SQL transaction. This simplifies error handling logic so much. Was there an error in the task? The task doesn't get marked as complete and no data has changed, and another runner can pick it up later to retry.
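    That single-transaction property is easy to demonstrate. A sketch with sqlite3 standing in for Postgres and an invented schema: the job's data write and its completion marker commit, or roll back, together.

```python
import sqlite3

# The job's data change and its completion marker share one transaction,
# so there is never a window where one happened without the other.
# Schema and job semantics are invented for illustration.

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER);
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT);
    INSERT INTO accounts VALUES (1, 100);
    INSERT INTO jobs VALUES (1, 'queued');
""")

def run_job(db, job_id, fail=False):
    try:
        with db:  # one transaction: both writes commit, or neither does
            db.execute("UPDATE accounts SET balance = balance + 50 WHERE id = 1")
            if fail:
                raise RuntimeError("task blew up mid-way")
            db.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
    except RuntimeError:
        pass  # job stays 'queued', data change rolled back; safe to retry
```

With a separate message queue, the `fail` branch is exactly where you'd need two-phase-commit-style bookkeeping to avoid a completed task with no data change (or vice versa).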

    I've used Postgres for handling tasks of many hundreds of requests per second and it's handled it fine. I haven't needed to stand up a message queue service yet and I wouldn't want to for as long as I can get away with handling messages in Postgres.

    4 votes
  8. stu2b50
    Link
    Yep. I try not to add any dependencies I don’t need to. All I see when I look at an additional datastore is something else that can page me at 3am. Once benchmarks, load tests, or empirical...

    Yep. I try not to add any dependencies I don’t need to. All I see when I look at an additional datastore is something else that can page me at 3am. Once benchmarks, load tests, or empirical traffic shows that we need something more specialized, then I begrudgingly add it.

    3 votes
  9. borntyping
    Link
    I've always found myself very hesitant to add a second data store to anything based on my past experience in operations teams. Understanding a single database well enough to run it well at any...

    I've always found myself very hesitant to add a second data store to anything, based on my past experience in operations teams. Understanding a single database well enough to run it well at any kind of scale isn't easy, and the end result of adding more and more types of datastore to a platform seems to be that you end up with several that no one in the company understands well enough to fix serious issues with. It also seems to lower the barrier to adding new types of datastore, and soon there are 5+ different types and very few of them are doing anything that actually benefits from the new datastore.

    For a few years, I've joked that my answer to the question "what database should I use?" is always PostgreSQL, on the basis that a) it's good enough for most uses, and b) anyone that really has a use case that needs something more specific probably doesn't need to ask for help picking a database.

    3 votes
  10. GOTO10
    Link
    I use postgres for everything unless I really, really can't anymore. Fewer moving parts is the most important thing (dev installs, kubernetes busywork, general software updates (both server and...

    I use postgres for everything unless I really, really can't anymore. Fewer moving parts is the most important thing (dev installs, kubernetes busywork, general software updates (both server and client libraries!)).

    2nd is familiarity with tools. Having your queue in postgres means you can also inspect it with simple sql commands. Something gone wrong? No problem, we DELETE whatever is wrong, or run an update fixing the broken parts. Want to check just the stuff from one customer? No problem either.
    Want to extend it with some more fine grained priority system? No problem, you can add that since you're 100% in control.
    You can add monitoring to your existing prometheus stuff, since all that infra is in place already anyway. And again, you can add whatever /you/ need to know about your queue performance.

    Is there a limit? Sure. Will you reach it? That I can't say.

    2 votes
  11. whbboyd
    Link
    Adding an additional data store is the same as adding any other dependency; there are costs to adding it, benefits to adding it, costs to writing the functionality you need yourself, and benefits...

    Adding an additional data store is the same as adding any other dependency; there are costs to adding it, benefits to adding it, costs to writing the functionality you need yourself, and benefits to writing it yourself. There's no magic rule for evaluating these tradeoffs, but I've always found thinking about it explicitly in terms of tradeoffs to be helpful.

    In your case, the costs of adding additional infrastructure are pretty steep. There's operational cost to keeping the service running, knowledge cost to understanding how it works (well above e.g. introducing a new library to your application), and hardware cost for the service to actually run on. From the situation you've described, I would definitely use Postgres and not introduce new infrastructure for message queueing, but I would make very, very sure to totally encapsulate the message queue behind an interface in the application, so if there's a need to migrate later, it's as painless as possible.

    (Postgres is a surprisingly great message queue, though. Just sayin'.)

    Just to flagrantly advocate for the devil for a second, though… it's pretty likely that your application already has secondary data stores. Every nontrivial server application I've ever worked on does. Does your application store objects in S3 or the like? Use temp files? (This can really easily happen without your knowledge.) Talk to other services or a stateful UI? Think for a second about whether de facto you've already given up on internal consistency in your application. That can definitely influence the weighting of adding another. (For instance, if you've already forfeited consistency between services, that immediately becomes much less of a concern.)