Our cloud exit has already yielded $1m/year in savings : technology

[–] [email protected] 51 points 1 year ago (4 children)

Interesting how they have kept their ops team the same but now run an entire datacentre.

Overworked teams? I just can’t see how this is possible.

Not defending cloud hosting/costs etc. You generally pay more for cloud so that you don’t have to deal with hardware maintenance and datacentre management. I didn’t see this addressed directly in their post, other than keeping the same size Ops team.

[–] [email protected] 59 points 1 year ago (1 children)

I'm running both physical hardware and cloud stuff for different customers. The problem with maintaining physical hardware is getting a team of people with relevant skills together, not the actual work - the effort is small enough that you can't justify hiring a dedicated network guy, for example, and same applies for other specialities, so you need people capable of debugging and maintaining a wide variety of things.

Getting those always was difficult - and (partially thanks to the cloud stuff) it has become even more difficult by now.

The actual overhead - even when you're racking the stuff yourself - is minimal. "Put the server in the rack and cable it up" is not hard - my last rack was filled by a high school student in a part of an afternoon, after explaining once how to cable and label everything. I didn't need to correct anything - which is a better result than many highly paid people I've worked with...

So paying for remote hands in the DC, or - if you're big enough - just ordering complete racks with pre-racked and pre-cabled servers, gets rid of the "put the hardware in" part.

Next step is firmware patching and bootstrapping - that happens automatically via network boot. After that it's provisioning the containers/VMs to run on there - which at this stage isn't different from how you'd provision it in the cloud.

You do have some minor overhead for hardware monitoring - but you hopefully have some monitoring solution anyway, so adding hardware to it, and maybe having the DC guys walk past and inform you of any red LEDs, isn't much of an overhead. If hardware fails you can just fail over to a different system - the cost difference to cloud is so big that just having those spare systems is worth it.

I'm not at all surprised by those numbers - about two years ago somebody was considering moving our stuff into the cloud, and asked us to do some math. We'd have ended up paying roughly our yearly hardware budget (including the hours spent working with hardware, which we wouldn't have with a cloud) to host just one of our largest servers in the cloud - and we'd have to pay that again every year, while with our own hardware and properly planned maintenance we can let old servers we paid for years ago slowly age out naturally.

[–] [email protected] 7 points 1 year ago

Thank you for the very detailed response!

[–] [email protected] 17 points 1 year ago (1 children)

They're using a third party called deft to manage the hardware. Which is a reasonable middleground between cloud and self-operated, the more I think about it.

I haven't seen a lot of info on what that management costs, but it's likely to be leagues less than AWS/GCP.

[–] [email protected] 11 points 1 year ago (1 children)

It’s not just the hardware. “The cloud is expensive” is usually touted by people who don't understand why managed services (like Aurora RDS and OpenSearch as suggested in the article) ‘cost more than running it themselves’, because they don't account for the management costs.

A database service needs management not only in hardware (i.e. replacing dead drives) but also in software (i.e. monitoring cluster performance, tweaking system settings to fit the usage pattern, managing cluster health, etc.). This management requires time from the ops team, often in multiple roles like SysAdmin, DBA, and Ops engineer. The fact that they claim to have moved to their own hardware without bringing new talent onto their ops team makes it questionable whether they actually understand the cost, and whether they’re overworking their existing ops team.

[–] [email protected] 4 points 1 year ago (1 children)

Or it could be that they haven't run into problems yet. If you overbuild your hardware or your software is efficient enough, you don't need as much tweaking.

It's questionable, but I don't think implausible.

[–] [email protected] 4 points 1 year ago (1 children)

“Yet” is the keyword there for sure. It’s not a matter of if, but a matter of when. Even if they’re flush with cash and grossly over-provision their systems, sooner or later a huge vulnerability will roll around and someone will need to set up / update the OS, ensure quorum is available for their cluster, fail over traffic during update windows, etc.

The stacks are getting so insurmountably huge, it’s not possible to just drop a new cluster at their described scale without significantly increasing the workload for an existing team.

[–] [email protected] 3 points 1 year ago

Yup. By moving out, they already let go of a lot of security services that came with their cloud subscription like CASB, automated patching, DB maintenance, security/network monitoring, etc. You have to replace all of that with people and on-prem tools/systems.

[–] [email protected] 6 points 1 year ago

"An entire data center" is 8 rented racks in two enterprise data centers (4 racks in each). They're paying $60K/month for racks, cooling, and location.

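For scale, here is the arithmetic on those figures as a quick sketch. It uses only the numbers quoted above; splitting the bill evenly across racks is my own assumption, since the real contract may price things differently.

```python
# Figures quoted in the comment above: 8 rented racks, $60K/month all-in.
RACKS = 8
MONTHLY_COST_USD = 60_000


def yearly_cost(monthly: int) -> int:
    """Annualize a flat monthly bill."""
    return monthly * 12


def cost_per_rack(monthly: int, racks: int) -> float:
    """Monthly cost per rack, assuming an even split (an assumption)."""
    return monthly / racks


if __name__ == "__main__":
    print(yearly_cost(MONTHLY_COST_USD))           # 720000
    print(cost_per_rack(MONTHLY_COST_USD, RACKS))  # 7500.0
```

So roughly $720K/year, or about $7.5K per rack per month if the split were even.
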
[–] [email protected] 6 points 1 year ago (2 children)

Warning. This site claims you've been blocked and asks for your email to verify you. Do not provide it. Reloaded and it worked. Just be safe out there

[–] [email protected] 5 points 1 year ago (1 children)

That isn't happening for me, nor has it ever when I've visited DHH's blog. It's possible your browser is compromised.

[–] [email protected] 2 points 1 year ago (1 children)

I have strong privacy settings enabled. I believe it might be because they can't fingerprint me or similar, so are checking for bot activity

[–] [email protected] 5 points 1 year ago

That seems extremely unlikely, and almost unheard of. If I wget the page in a container, I get the same as in the browser, so that would suggest this isn't the case.

[–] [email protected] 5 points 1 year ago (1 children)

This didn’t happen for me

[–] [email protected] 3 points 1 year ago

Nor for me

[–] [email protected] 31 points 1 year ago (2 children)

That's the thing, 'cloud' is just another tool in your toolbox. It's the right tool for some workloads and the wrong one for others. The fact they've shifted the work to their own servers and kept the ops team suggests it was the wrong sort of workload to be in the cloud in the first place.

For a while there was an obsession with moving everything to the cloud, and that was always going to be an expensive mistake in a number of different ways. Hopefully, as the hype dies down more nuanced decisions will be made. There's a whole gamut of options between all in the cloud and all in the data centre, and when people jump straight from one end to the other I'm put in mind of Hamlet's quote "There are more things in heaven and earth, Horatio, / Than are dreamt of in your philosophy." Understand your workload, understand your business' future plans and their needs, and then make a plan, considering all the tools at your disposal.

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago)

If there's anything that 3 decades in Tech have taught me, it's that fad-following commonly rules it, even with the supposedly logical (but not really) techies.

Cloud storage and cloud computing became a fad about a decade ago (I still remember the hype repeated by people who had never actually designed distributed systems), so there were tons of people jumping into it headfirst without a plan, for the hype and the seemingly cheaper price (if you didn't think your needs and future evolution through), even though it wasn't the best choice for them.

No doubt we'll see the same kind of fad-following over making-sense-for-us thing with the latest hype-train: AI.

[–] [email protected] 5 points 1 year ago (2 children)

I hate the obsession to move to the cloud and the obsession towards serverless or functions.

Functions are stupid and crazy for anything that is actually used often.

For small utilities, they make a ton of sense, but next time I see an app with millions of requests per day using functions, I'm going to lose my mind.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

Years ago I was the senior techie in designing and implementing distributed high performance server systems and what you reminded me of just made my blood start to boil... :/

[–] [email protected] 2 points 1 year ago* (last edited 9 months ago)

[This comment has been deleted by an automated system]

[–] [email protected] 26 points 1 year ago* (last edited 1 year ago) (2 children)

What always kept me off the "cloud" (other people's computers) is not only giving up my data but giving up control over what I spend. Corporations lure you in with flashy promises and low prices, then usually over time the service gets worse and the prices go higher and higher. I'm sure the cloud hosting corporations are good at pricing their services very high but not quite high enough to make most customers cancel.

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago) (1 children)

Lock-in is quite an old strategy in Tech (back in the day Microsoft's dominance was built on it) and apparently every new generation needs to learn their lesson...

[–] [email protected] 2 points 1 year ago

That's true, back in the 1970s and 1980s IBM locked companies in with mainframes and PCs were their way out.

[–] [email protected] 23 points 1 year ago (2 children)

Exiting cloud being useful seems to be a very narrow use case.

For one, you have to be at a large enough scale where buying and hosting your own infra is feasible and cheaper.

Second, you have to give up the ability to almost instantly scale up or provision hardware in response to traffic or other events. (which is very common at scale)

Maybe his use case happens to be that very narrow case, but this isn't something I would take as general advice.

[–] [email protected] 10 points 1 year ago* (last edited 9 months ago) (1 children)

[This comment has been deleted by an automated system]

[–] [email protected] 3 points 1 year ago (1 children)

Your last paragraph is why we've heavily used the cloud here in rural Canada for years.

Monitoring data is much easier to push into the cloud and read from there than it is to hope for a reliable connection to a farm or rural plant.

Self-hosted services need to be cloud hosted for uptime and because it was getting ever harder to get a routed IPv4 address from any provider. IPv6 is nice to finally have, but Starlink is the only provider at all supporting it and it's only been a few months at that. Their prefixes change constantly too, come on guys get your shit together.

Even basic remote access systems require a VPS or VPN cloud service, as you always need both ends to punch out through layers of CGNAT. Now we can finally have one end available through IPv6, but the remote user is often trying to use an IPv4 CGNAT network to connect... So you still need something in the cloud to punch holes.

Can't believe it's been over 20 years for the IPv6 rollout

[–] [email protected] 4 points 1 year ago* (last edited 9 months ago) (1 children)

[This comment has been deleted by an automated system]

[–] [email protected] 1 points 1 year ago (1 children)

I'm still trying to figure out how to use Docker with an unstable prefix (hey Docker, this is as much your problem as the ISPs, honestly) as any of the v6NAT solutions I've found that enable the same full containerization available on IPv4 all require you feed the Docker daemon a fixed prefix on startup. Frustrating.

I'm also tired of reading posts about v6NAT being irrelevant because half of the point of containers is the interchangeability, Docker containers aren't supposed to be routable unless you intentionally put them on the host network! Docker just needs to work the same on v4 and v6!

Tor as a hole puncher is an intriguing idea but I don't think I would use it for something customer facing... Too many moving parts. We like to use Wireguard and a tiny cloud VPS instance when someone needs to punch into an unreliable network around here.

[–] [email protected] 1 points 1 year ago* (last edited 9 months ago)

[This comment has been deleted by an automated system]

[–] [email protected] 8 points 1 year ago

DHH is a contrarian. Any benefits of the cloud he might get are overridden by the fact that he needs to be different (and blog about it).

See his stances on Typescript, workplace inclusion, TDD, etc.

[–] [email protected] 22 points 1 year ago (2 children)

This is quite intriguing. But DHH has left so many details out (at least in that post) as pointed out by @[email protected] - it makes it difficult to relate to.

On the other hand, like DHH said, one's mileage may vary: it's, in many ways, a case-by-case analysis that companies should do.

I know many businesses shrink the Ops team and hire less experienced Ops people to save $$$, only to forward those saved $$$ to cloud providers. I can only assume DHH's team is comprised of a bunch of experienced, well-paid Ops people who can pull such feats off.

Nonetheless, looking forward to, hopefully, a follow up post that lays out some more details. Pray share if you come across it 🙏

[–] [email protected] 6 points 1 year ago

This is part of a series of posts he has done about finding out his cloud bill was stupid high because they run computationally heavy software, and switching over to colocation. But the whole going from 100% cloud to colo and saving that much money is not to be scoffed at.

He does say this is an outlier and others won't get as much roi as they have.

[–] [email protected] 2 points 1 year ago

there are a number of blog posts that have different details about the how/why, etc. i just followed the links in the article to other parts of the series.

I expect that the use case is more prevalent than you think, wherever you are spending a decent chunk on cloud infra. I have been convinced for some time now that the costs are high compared to our on-prem. I really like the idea of the "deft" type hardware management service, so that they look after the DCs, hardware and connectivity, and we look after the software.

[–] [email protected] 9 points 1 year ago (3 children)

Hopefully, they place their servers at 2x the historical peak floodpoint. Or set up standby zones in different geographies in case there's a power or network outage.

Came upon several projects where folks hadn't...

[–] [email protected] 7 points 1 year ago (2 children)

Having your compute in "the cloud" doesn't remove the need for a good backup strategy, it just changes how it works. Yes, disaster recovery for natural disasters should be easier (OVH's fire showed that this may not always be true). But that doesn't cover cases like ransomware, insider threats, data mistakes or any other case where data is corrupted/modified by mistake. You still need a plan for these cases. And cloud based backups actually make a lot of sense.

But, just because you put your backups in the cloud, doesn't mean that your compute should be there as well. There is an advantage that your Time to Recovery is likely lower with both backups and compute in the same cloud. But, is that worth the ongoing cost of running your compute in the cloud? That needs to be considered separately. You also need to consider the cost of running on-prem versus in the cloud. If you have fairly predictable, static loads, it may be cheaper to buy and run servers yourself. For hard to predict, elastic loads, cloud may make more financial sense.
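To make the static vs. elastic point concrete, here is a rough sketch. Every number in it is made up for illustration (server counts, the $300/month owned-server cost, the $1 per server-hour cloud rate), and the function names are hypothetical; real pricing has many more terms.

```python
# Rough sketch: owned servers sized for peak vs. cloud billed per server-hour.
# Every number below is made up for illustration.

def on_prem_monthly(peak_servers: int, cost_per_server: float) -> float:
    """Owned hardware has to cover peak load, whether it is busy or idle."""
    return peak_servers * cost_per_server


def cloud_monthly(hourly_demand: list[int], cost_per_server_hour: float) -> float:
    """Cloud capacity is paid per server-hour actually provisioned."""
    return sum(hourly_demand) * cost_per_server_hour


HOURS = 720                       # one 30-day month
steady = [10] * HOURS             # predictable load: 10 servers all month
spiky = [2] * 700 + [10] * 20     # elastic load: 2 servers, peaking at 10 for 20 hours

if __name__ == "__main__":
    owned = on_prem_monthly(peak_servers=10, cost_per_server=300.0)
    print(owned)                       # 3000.0
    print(cloud_monthly(steady, 1.0))  # 7200.0 -> owning wins for the steady load
    print(cloud_monthly(spiky, 1.0))   # 1600.0 -> cloud wins for the spiky load
```

With these invented rates the same peak of 10 servers comes out cheaper on-prem when the load is flat and cheaper in the cloud when it is mostly idle, which is the whole argument in miniature.
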

As others have said before, there was a period where companies were just going to the cloud for the sole reason that it was the popular thing to do. For some it actually made financial sense. For some, it didn't. The OP's article seems to be the latter.

[–] [email protected] 2 points 1 year ago

Exactly. Use cloud for off-site backup and things that need flexibility.

You don't need any of that to run a basic website. You can almost use an old laptop or PC for most static applications.

[–] [email protected] 2 points 1 year ago (2 children)

So how then did people using this *miraculous and incredibly safe* (/s) cloud lose their data in the OVH datacenter fire?

[–] [email protected] 3 points 1 year ago (1 children)

They used the cheap option without geographic mirrors.

[–] [email protected] 1 points 1 year ago (2 children)

So you say that if you don't make an additional investment in backup infrastructure your data is at risk... Sounds pretty similar to self-hosting, doesn't it?

[–] [email protected] 2 points 1 year ago (1 children)

More like the "cloud" provider should have multiple locations and redundancy in place.

[–] [email protected] 9 points 1 year ago (1 children)

oh god, its dhh

[–] [email protected] 8 points 1 year ago

I've been saying this for a long time.

There are use cases for the cloud. I put e-mail in the cloud- ain't nobody got time to deal with providing reliable SMTP or Exchange while keeping spam out. If you have a web app that needs to scale quickly, cloud's the way. If you're a startup with limited capital and you don't want to blow it on a bunch of servers when you're not sure if you'll survive more than a year or so, cloud's the way.

But Cloud ISN'T the end-all answer for everything.

If you have a predictable workload, especially one that relies on more expensive cloud services, de-clouding can save you a bundle. Buying hardware can be cheaper than renting it, if only because (think about it) the cloud provider has to buy the same hardware and rent it to you AND make a profit. If you're going to be around a while, and you expect to use a piece of hardware for its full service life, that makes a lot of sense.
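A back-of-the-envelope way to check the buy vs. rent argument is a break-even calculation. This is only a sketch: the function name and the example prices are hypothetical, and it ignores things like financing, resale value and staff time.

```python
import math


def break_even_months(purchase_price: float,
                      monthly_running_cost: float,
                      monthly_rental: float) -> float:
    """Months until buying beats renting; math.inf if renting is never more expensive."""
    monthly_saving = monthly_rental - monthly_running_cost
    if monthly_saving <= 0:
        return math.inf
    return purchase_price / monthly_saving


if __name__ == "__main__":
    # Hypothetical server: $12,000 to buy, $200/month to power and host,
    # versus $700/month to rent an equivalent cloud instance.
    print(break_even_months(12_000, 200, 700))  # 24.0
```

If the hardware's expected service life is comfortably longer than the break-even point, buying makes sense; if you might not be around that long, renting does.
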

[–] [email protected] 8 points 1 year ago (2 children)

Yeah ok well when you get ransomware'd you're going to wish you had Cloud backups.

Ask me how I know

[–] [email protected] 35 points 1 year ago (2 children)

There are also many organizations that wish they had some local backups after their cloud service providers lost all their data. Lesson to learn: Backup properly with offline storage. Tape in a safe, maybe even off-site, etc.

[–] [email protected] 9 points 1 year ago

So, what you're saying is that, regardless of where you run your workloads, you should still follow the 3-2-1 rule?

3 - copies of the data. 2 - different media. 1 - offsite.

It's funny how cloud doesn't really change the basics of good systems administration.
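The rule is simple enough to check mechanically. A minimal sketch, assuming a made-up `BackupCopy` record (the class and field names are mine, not from any backup tool):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical record for illustration)."""
    medium: str     # e.g. "disk", "tape", "object-storage"
    offsite: bool   # stored away from the primary site?


def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    """3 copies of the data, on 2 different media, with at least 1 offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    plan = [
        BackupCopy("disk", offsite=False),           # production data
        BackupCopy("disk", offsite=False),           # local backup server
        BackupCopy("object-storage", offsite=True),  # cloud bucket
    ]
    print(satisfies_3_2_1(plan))  # True
```

Nothing in the check cares whether the primary copy lives in a cloud or in your own rack.
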

[–] [email protected] 6 points 1 year ago

Almost like a responsible modern day approach is multifaceted

[–] [email protected] 4 points 1 year ago (1 children)

How do you know?

[–] [email protected] 2 points 1 year ago

An earwig told me

[–] [email protected] 2 points 1 year ago

As long as you realize that the "cloud" is someone else's computer, it is a very viable way of hosting your service. However, as your service grows, all those micro services that your cloud provider charges you for will grow as well. Eventually you'll get to the point where "data transfer" costs begin to make up >50% of your total cloud spend. At that point (or ideally before) you should have a plan to stop expanding your cloud footprint, because that cost grows geometrically with the size of your cloud data and the number of cloud functions you are using on your data.

Remember Data has Weight. If you don't understand what that means, you aren't ready to make a cost comparison between cloud-hosting and data center hosting.
