35 votes

Cloud exit - cloud is NOT cheap

34 comments

  1. [9]
    winther

    Every company has different needs and different skillsets in its employees. There are never any absolutes here.

    At my company we have the unfortunate luck of having the need to have our entire platform deployed twice. One is in Google Cloud, the other is self managed VMs in a different provider. The raw server cost is massively higher in Google, but the man hours needed to keep the other setup even close to the level of stability of the Google Cloud setup makes the total way more expensive. Especially if you are a smaller company, it is often easier and cheaper to just pay for managed redundant services with automatic failover than manpower to do it yourself. Our cost is also many times less than what is in this blogpost though.

    28 votes
    1. [4]
      crius

      If Google Cloud is costing more than the other alternative (if the other is AWS or Azure), you might want to review your infrastructure design or check in detail what kind of discounts you are maybe not taking advantage of.

      5 votes
      1. [2]
        winther

        It is about the same. Basic VMs have about the same price everywhere. The expensive parts are having a managed Kubernetes cluster and managed databases with high availability.

        3 votes
        1. g33kphr33k

          Multi-geo HA with redundancy is the really expensive bit, however, that makes perfect sense. That's multiple servers in multiple countries in multiple buildings with multiple power bills and multiple pipes to pay for, plus the staff to manage at each site.

          If you have a large international organisation and the staff, you can do it cheaper. If you are a 6-man dev shop based in Glasgow with a market-dominating application that has worldwide reach, you pay the cloud providers.

          3 votes
      2. flowerdance

        I was mindblown when I realized Google Cloud Platform (GCP) turned out to be really expensive (I was comparing basic services like Compute and Storage). Then I went, "Omg then what's their edge??"

    2. [4]
      vord

      Why do you need Google Cloud if you already have the staff for self-managed VMs?

      The tooling is much better these days. If you already have the staff to build out and maintain an HA solution on VMs, scaling it out from there is relatively trivial.

      Doing what was described in the article's cloud exit (Buying hardware and sending direct to datacenter for setup), it becomes even cheaper.

      1 vote
      1. [3]
        winther

        We are like four people in total working on the technical platform. Cutting our Google cost would only pay half a salary. Our devops are currently spending like 90% of their time on the other setups, and achieving the same level of automatic failover on a redundant setup, to the same quality as Google (which hasn’t failed us for more than 1 hour in the last 5 years), is somewhat less than trivial. Especially because we are not allowed to use American companies due to Schrems II fears for our customers, it is not easy to achieve the same level of network stability with data centers less sophisticated than Google's or Amazon's infrastructure.

        5 votes
        1. [2]
          vord

          Oh, I wasn't saying building or maintaining it is trivial. I was saying if you've already got 10 VMs in the mix, doubling to 20 is comparatively trivial.

          2 votes
          1. TumblingTurquoise

            Maybe they need the Cloud VMs as a failover in the case of a DR situation, or some outage.

            2 votes
  2. [17]
    g33kphr33k

    I'm posting this as it's interesting to me. So many companies were sold the cloud and went there, eggs in the basket. Then they gave Devs and teams a credit card to spend wisely on their new virtual infrastructure with absolutely crazy consequences.

    I'm a CapEx guy for the company I am at, not because I don't like OpEx, but because we're a limited company and CapEx depreciates over time. I've never been sold on cloud as we deal with petabytes of video and cloud storage isn't cheap.

    I've been watching the HEY/Basecamp posts for a while now and the money they're saving by going on-prem is nuts.

    12 votes
    1. [8]
      Pioneer

      I'm a data guy. Cloud works really well for us.

      But my god does it need sensible engineers building things in order to not end up expensive.

      Back in the day, you had a DBA or architect come and slap you round the back of the head if you made a crap query. These days there's no such monitoring; it's just "throw more power at it!"

      Engineers aren't really trained or educated in the area of performance management.

      15 votes
      1. [7]
        shrike

        > But my god does it need sensible engineers building things in order to not end up expensive.

        This is why I get paid the big bucks :)

        It's WAY too easy to whip up a "Cloud service" on any of the big providers without actually knowing what it'll cost.

        AWS provides excellent tools with tagging and everything you can use to see what bits are too expensive compared to their usefulness both before you implement and while running.

        Using someone else's servers is a skill and not everyone has it.

        6 votes
        1. [6]
          Pioneer

          Ha. Same mate. I lead departments and my first thing anywhere is always to review what's been built, then see where we can cut down on costs on cloud compute.

          The amount of engineers who will do complex data transforms in Python... inside a data warehouse drives me insane.

          GCP pricing is obscene, AWS feels dated and grumpy and Azure feels like you're working for your dad. But they all have this substantial problem with sudden overspend.

          5 votes
          1. [5]
            shrike

            > The amount of engineers who will do complex data transforms in Python... inside a data warehouse drives me insane.

            Just thinking about the amount of coders who use ORMs to pull a gigabyte database to the business logic and their first query drops 90% of the data gives me heartburn every time :D

            GCP is expensive, but it does have its uses (big data stuff mostly). Azure is just bonkers, the only ones who use it are somehow either getting a sweetheart deal from Microsoft or linked to them in some other way they can't break.

            AWS is the only one of the three where you can easily get an actual human to advise you on how to do stuff - they even tell you how you can cut costs.

            2 votes
            1. D_E_Solomon

              At least for Azure, folks going to them tend to be enterprises who already have a Microsoft commitment. So they're not just negotiating the Azure, but their entire usage. Moreover, often Microsoft is giving breaks on software licensing for workloads running on Azure. So it's less the price of the service and more of the entire deal.

              6 votes
            2. Pioneer

              Don't even get me started on ingestion pathways that stream... into a microbatch service.

              Honestly, AWS are okay. I've been using them for ten years, but they just feel dated now. It's clunky, there are no standard patterns, and it feels like the tooling is so multipurpose that you can end up in architectural hell spinning your wheels on "the right way" to do things.

              Azure seem to have got the idea that not all tools should do everything, but they're yet to truly distill that down.

              3 votes
            3. [2]
              tauon

              > Just thinking about the amount of coders who use ORMs to pull a gigabyte database to the business logic and their first query drops 90% of the data gives me heartburn every time :D

              As an “outsider” – naively asked – what would be a (high-level) way of preventing this problem? Would you send an initial request to “drop” the majority of the data while still server-side, before sending it out?

              (P.S., what is ORM?)

              1 vote
              1. shrike

                ORM is an Object-Relational Mapper.

                Basically the programmer only sees an object that says Inventory, then they call Inventory.get_items(), because that's how you get all the items and start filtering from there.

                ...But there are 500000 items in the database along with a bunch of metadata, order history and all that jazz. All that gets fetched from the database server to the application server, causing a huge amount of useless traffic. But inexperienced coders don't know this because they're working with a copy of the database that has maybe 10 items for development and it "Works on their machine(tm)" =)

                The old-school way is not to bother with ORMs and use direct SQL where you can pick just the Inventory items you want to fetch on the database itself and only those will get transferred to the application.

                (And yes, you can use ORM to filter the stuff on the DB side too, but I've seen that fail so many times I don't bother counting anymore. One misconfiguration and you've fetched the whole damn DB accidentally.)
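
To make the difference concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `inventory` table, its columns, and the 1,000 rows are all made up for illustration, and an in-memory SQLite database has no network hop, so it only shows how many rows each approach hands to the application, not the actual traffic cost.

```python
import sqlite3

# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (id INTEGER PRIMARY KEY, name TEXT, in_stock INTEGER)"
)
conn.executemany(
    "INSERT INTO inventory (name, in_stock) VALUES (?, ?)",
    [(f"item-{i}", 1 if i % 10 == 0 else 0) for i in range(1000)],
)

# Anti-pattern: fetch every row, then throw most of them away in application code.
all_rows = conn.execute("SELECT id, name, in_stock FROM inventory").fetchall()
filtered_in_app = [row for row in all_rows if row[2] == 1]

# Filtering on the database side: only the matching rows reach the application.
filtered_in_db = conn.execute(
    "SELECT id, name, in_stock FROM inventory WHERE in_stock = ?", (1,)
).fetchall()

print(f"rows fetched without WHERE: {len(all_rows)}")
print(f"rows fetched with WHERE:    {len(filtered_in_db)}")
```

Both approaches end up with the same items, but the first one moves ten times as many rows to get there. With a real database server on the other side of a network, that gap is what shows up on the bill.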

                1 vote
    2. [8]
      flowerdance

      With petabytes, you definitely need your own storage solution. Even doing redundancy on the drives by buying 2-3x more drives than necessary for the actual storage is much cheaper. Should pay for itself in just a year or two at most, and given the lifespan and failure rate of these drives, I'd say 2 years is alright.

      7 votes
      1. [7]
        supergauntlet

        as someone that does this on my own time you don't need 300% spend for redundancy like that, generally 160%-200% is enough.

        6 votes
        1. flowerdance

          I mean yeah true, it was a conservative estimate erring on the side of caution.

          2 votes
        2. [5]
          cykhic

          I'm curious how 160% would be achieved? Naively I would expect that backups would have to be at least 1 to 1.

          Unless perhaps some compression is happening, which doesn't seem possible for data that changes more often than "rarely".

          1 vote
          1. psi

            Yes, 160% is possible (even without compression). As a conceptually simpler example, suppose we have 3 hard drives of equal size and wish to use one as a backup. To accomplish this, we can use the first two as usual; but for the third (backup) drive, we take the first bit from drive 1 and the first bit from drive 2, add them together (i.e., calculate the exclusive-or between the pair of bits), and save the result to the first bit of drive 3. Repeat this process for every bit. You should get something like

            D1  0 1 1 0 1 ...
            D2  1 1 0 0 1 ...
            D3  1 0 1 0 0 ...
            

            Now if either drive 1 or drive 2 fails (but not both), we can perfectly recover the lost drive by subtracting the bits of the working drive from the bits of the backup drive (check it!).

            In fact, one can generalize to an arbitrary number of in-use drives and an arbitrary number of backup drives (the "code rate") using Reed–Solomon error correction, so long as the backup drives are at least as large as the in-use drives.
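
The three-drive example can be checked with a few lines of Python. This is only a sketch of single-parity XOR on tiny byte strings (the helper name `xor_blocks` is made up here), not how any particular RAID implementation lays out data.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the byte-wise XOR of equal-length blocks."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# The same bits as in the table above, one bit per byte for readability.
d1 = bytes([0, 1, 1, 0, 1])
d2 = bytes([1, 1, 0, 0, 1])

# Drive 3 stores the XOR of drives 1 and 2.
d3 = xor_blocks(d1, d2)

# If one data drive is lost, XOR the survivor with the parity drive.
recovered_d1 = xor_blocks(d2, d3)
recovered_d2 = xor_blocks(d1, d3)

print(list(d3))
print(recovered_d1 == d1, recovered_d2 == d2)
```

Three drives hold two drives' worth of data, so the spend is 150% of the usable capacity rather than 200%.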

            6 votes
          2. [2]
            supergauntlet

            an 8 drive raidz3 pool is ~55% usable which means you need to allocate 180% of your original size budget to get that level of redundancy, but raidz3 for 8 drives is bordering on unnecessary redundancy. raidz2 is a lot more reasonable for 8 drives, which is almost 70% usable, or closer to 150% budget.

            zfs raid does both striping and parity which is why it's more complicated.
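
The budget percentages follow from the usable fraction: if a fraction `u` of the raw capacity is usable, you need to buy `1 / u` times the data size. A rough sketch of that arithmetic, where the 55% and 70% figures are the ones quoted above and the "nominal" figure is the simple `(drives - parity) / drives` ratio, which is an assumption that ignores padding and metadata overhead:

```python
def spend_multiplier(usable_fraction: float) -> float:
    """Raw capacity to buy per unit of usable capacity."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return 1 / usable_fraction


def nominal_usable_fraction(drives: int, parity_drives: int) -> float:
    """Usable fraction if overhead beyond the parity drives is ignored."""
    if not 0 <= parity_drives < drives:
        raise ValueError("need 0 <= parity_drives < drives")
    return (drives - parity_drives) / drives


# (label, drives, parity drives, usable fraction quoted in the comment)
layouts = [("raidz3", 8, 3, 0.55), ("raidz2", 8, 2, 0.70)]

for label, drives, parity, quoted in layouts:
    nominal = nominal_usable_fraction(drives, parity)
    print(
        f"{label}: quoted {quoted:.1%} usable -> {spend_multiplier(quoted):.2f}x spend, "
        f"nominal {nominal:.1%} usable -> {spend_multiplier(nominal):.2f}x spend"
    )
```

The gap between the quoted and nominal numbers is the extra overhead on top of pure parity.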

            4 votes
            1. cykhic

              This is pretty fascinating, I hadn't known how RAID works until searching it up based on your comment. Thanks!

              2 votes
          3. flowerdance

            Keyword is "spend". Just do bulk order and you get drives cheaper. So instead of 200% spend for a two-replication set up, it could just be 160% spend.

            2 votes
  3. Handshape

    I'm an old grump, but I can do math. I've watched the tide roll in and out on a bunch of tech panaceas, and cloud has distinguished itself in how long it's taken for the industry to realise that the vendors (once again) don't have altruistic motives.

    I priced out a very small GPU based service for my shop this year on their cloud service provider of choice, and it was a spit-take moment when the quote came back. I then priced out an equivalent on-prem deployment using capex over five years and rack/power capacity that's been hollowed out by cloud migration. The crossover point on the curve was at between 2.5 and 3.5 months, depending on usage.

    Say it with me: any vendor that tells you the story of Jack the Giant-Killer and then offers to sell you a bag of magic beans is not your friend.
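
For anyone who wants to run the same kind of comparison, the crossover point is just the upfront hardware cost divided by the monthly saving. A minimal sketch follows; every number in it is hypothetical and chosen only to land in the same ballpark as the 2.5 to 3.5 months mentioned above, not taken from the actual quote.

```python
from typing import Optional


def breakeven_months(
    upfront_capex: float, onprem_monthly: float, cloud_monthly: float
) -> Optional[float]:
    """Months until cumulative cloud spend exceeds on-prem spend.

    Returns None if on-prem running costs are not lower than cloud,
    in which case the hardware never pays for itself.
    """
    monthly_saving = cloud_monthly - onprem_monthly
    if monthly_saving <= 0:
        return None
    return upfront_capex / monthly_saving


# Hypothetical figures: hardware bought outright, a small monthly
# power/rack cost on-prem, and a much larger monthly cloud bill.
months = breakeven_months(
    upfront_capex=30_000, onprem_monthly=1_000, cloud_monthly=11_000
)
print(f"crossover after {months:.1f} months")
```

This ignores staff time, depreciation schedules, and usage-dependent cloud pricing, all of which shift the crossover in practice.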

    12 votes
  4. [6]
    devilized

    Our company did the math on this with the help of a major consulting firm a couple of years back. It would've cost us 50% more to host our services on AWS, even with their volume discounts, than we currently pay for our own hardware and datacenters. And we're talking about a nine-figure budget here. So we're taking the hybrid approach - if there's some advantage of moving to cloud, like extreme elasticity or some sort of technology that would be difficult to set up on-prem, then we'll go there. But for most of our internal applications and services, it just doesn't make sense.

    11 votes
    1. [3]
      crius

      and that is the right approach.

      Cloud is not meant to be a panacea to every problem.

      6 votes
      1. [2]
        g33kphr33k

        That's where a lot of CIOs, DoTs and DITs go wrong, they get sold the idea and go "oooo" because too many high level execs aren't from an IT background, they are glorified bean counters that understand some of the words used in IT-land.

        SharePoint for docs in cloud makes sense. Teams for global Collab, awesome stuff. Storing raw shot 8k video to edit on a workstation, no, definitely not.

        3 votes
        1. [2]
          Comment deleted by author
          1. g33kphr33k

            Chief Information Officers
            Director of Technology
            Director of IT

            I didn't say CEOs.

            2 votes
    2. [2]
      shrike
      Link Parent
      For Really Big Companies doing on-prem might be doable. They can bear the cost of hiring enough people to wrangle the servers and services 24/7. That's the bit that keeps small and medium...

      For Really Big Companies doing on-prem might be doable. They can bear the cost of hiring enough people to wrangle the servers and services 24/7.

      That's the bit that keeps small and medium companies using the cloud.

      We run a top-10 mobile game backend with 2-3 people on the infra team. They would all be burned out and looking for new jobs if most of the services weren't managed by AWS. Their job is mostly watching that Terraform deployments and infra changes go through during office hours.

      After that, very few things can break where they can do anything other than look at the AWS status page and see stuff get fixed.

      Just having a 24/7 rotation of "DevOps" people handling actual physical servers would cost more than what we spend on AWS monthly. Never mind the cost of having to actually buy the physical servers, keep them maintained and upgraded etc.

      5 votes
      1. devilized

        Yeah, I agree that company size has a lot to do with it. We're a global company with an IT staff in the thousands, and benefit from economies of scale just like AWS (not to the same degree, but when you manage over 100k servers, your server:staff ratio can be very high). That makes the staffing and CapEx investments worth it compared to paying for another company to make a profit to do it for you.

        3 votes
  5. whs

    My company's major shareholder offers us at-cost (and sometimes undercharged) managed compute on their private cloud: a cloud-like experience without the hassle of managing the hardware or network, and it's like 1/4 of the AWS cost.

    But "securing" it is nigh impossible. The auditor would ask about database privileged access, which we do not have. On AWS you also don't have root, and the auditor gives a pass if the partner is SOC2 certified, which our parent company is not. The PCI auditor asks for PCI cloud provider compliance (we do not transmit cardholder data, only Stripe-like tokens). I asked them what they do in this case, and they said that normally it's an audit finding of "do not use this cloud", but since management knows and strongly prefers that, they'll have to think of something.

    So I believe if your environment needs as much certification as AWS does, it might cost similar. But if you don't, well, they don't sell an uncertified version for less.

    9 votes