this post was submitted on 07 Jul 2024
Technology
there are too many points of failure for me to ever be comfortable using the cloud as a primary storage option.
i've held this opinion ever since "the cloud" started being touted as the future. and yet more corporations (including mine) are reliant on it. i mean sure, i can log in on my home computer and have some access to stuff as though i were physically at the office, but that convenience ain't worth the headache if the main storage site crashes.
Having done everything from building my own servers 30 years ago to managing hundreds of servers in data centers to now managing hundreds of instances and other services in AWS, I’ll gladly stick with AWS. The hardware management alone makes it well worth the overhead.
25 or so years ago I had to troubleshoot a hardware issue in a SCSI-based server with 6 hard drives in it. A drive appeared to be failing so I replaced it, and immediately another drive failed, then another, and so on. After almost a full day of troubleshooting we realized the power supply was actually the culprit: it could no longer provide sufficient power to the full set of hard drives.
20 years ago, while managing 700+ servers in a datacenter, we had to manage a recall of about 400 of them thanks to the capacitor plague, which caused a handful of our servers to literally burst into flames.
Hardware failures like the above and dozens of others were mitigated in most cases thanks to redundancies in the software we wrote. But dealing with hardware failures and the resulting software recovery was a real PITA.
With AWS I may occasionally have a Linux instance lock up due to a hardware failure, but it’s usually fairly easy to reboot the instance and have it migrate to new hardware. It’s also trivial to move a server to an instance with more (or fewer) CPUs, more (or less) RAM, etc. with only a couple of minutes of downtime.
The more advanced services AWS offers like object storage, queues, databases, etc. are even more resilient. We occasionally get notified that a replica for one of these services failed, or was determined to be on failing hardware, and was automatically replaced with a new replica.
I’d much rather work this way than the way I did 20+ years ago.
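To make the automatic replica replacement described above concrete, here is a minimal sketch of a reconcile step that drops unhealthy replicas and adds fresh ones until the desired count is met. It is only an illustration of the general idea, not how AWS implements it; `Replica`, `reconcile`, and the ID counter are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Replica:
    # Hypothetical stand-in for a managed replica; not an AWS type.
    replica_id: int
    healthy: bool = True


def reconcile(replicas, desired, new_ids):
    """Keep healthy replicas and add new ones until `desired` are running."""
    survivors = [r for r in replicas if r.healthy]
    while len(survivors) < desired:
        # A real service would provision on different hardware here.
        survivors.append(Replica(next(new_ids)))
    return survivors


ids = count(1)
fleet = [Replica(next(ids)) for _ in range(3)]  # replicas 1, 2, 3
fleet[1].healthy = False                        # replica 2 is on failing hardware
fleet = reconcile(fleet, desired=3, new_ids=ids)
print([r.replica_id for r in fleet])            # [1, 3, 4]
```

The point is that the operator only states the desired count; noticing the failure and replacing the replica happens without anyone swapping a drive.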
Why not outsource just the hardware then? Dedicated servers with Kubernetes slapped on them. Hardware failure is mitigated for the most part, and the full effort goes into making the cluster as resilient as possible, for 1/5 of the cost of AWS. If machines burn, it's not your problem anymore (you can have them spread over multiple sites, DCs, rooms, racks).
We did that (with Rackspace) for years before migrating to AWS. AWS is still far better from a service & flexibility perspective.
My employer's website has certain times of the year where we see a huge increase in web traffic. When we had a hosted solution it took weeks of preparation to provision additional web servers to handle that load. We had to submit formal requests for additional servers, document how to wire them into our network, list the required firewall rules, etc. Then we had to wait an arbitrary number of days for them to do the work. And then we had to repeat that whole process when we no longer needed the additional capacity.
With AWS we just define an auto scaling group, and additional web servers are spun up automatically when demand is high and freed up again when no longer needed. Even if we didn’t use auto scaling we could easily automate this sort of thing via terraform or other tools and spin up additional instances in minutes instead of days.
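For anyone wondering what the scaling decision boils down to, here is a rough sketch of a proportional rule: scale the server count by the ratio of observed load to target load, then clamp to the group's limits. This is a simplification for illustration, not AWS's actual algorithm; `desired_capacity` and its parameters are hypothetical names, and the numbers are made up.

```python
import math


def desired_capacity(current, metric, target, min_size, max_size):
    """Return how many servers to run so the per-server metric nears `target`.

    current  -- servers running now
    metric   -- observed average per server (e.g. CPU percent)
    target   -- value we want that average to sit at
    """
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    wanted = math.ceil(current * metric / target)
    # Never go below the floor or above the ceiling of the group.
    return max(min_size, min(max_size, wanted))


# Traffic spike: 4 servers at 90% CPU, aiming for 60%.
print(desired_capacity(4, 90, 60, min_size=2, max_size=10))   # 6
# Quiet period: 6 servers at 20% CPU; the floor of 2 applies.
print(desired_capacity(6, 20, 60, min_size=2, max_size=10))   # 2
# Extreme spike: capped at the group's maximum.
print(desired_capacity(4, 300, 60, min_size=2, max_size=10))  # 10
```

The same function handles scaling out and scaling in, which is the part that used to take us a formal request and a multi-day wait in each direction.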