this post was submitted on 18 May 2024
580 points (98.8% liked)

[–] [email protected] 64 points 5 months ago (5 children)

In most cases, this is because an individual page was deleted or removed on an otherwise functional website.

How is this news? I bet a lot of pages were also added in the same time frame, very likely orders of magnitude more.

[–] [email protected] 49 points 5 months ago (2 children)

I’ve heard the early Internet age referred to as the future dark ages. When all the work, information and content is digitized, it’s prone to being lost to history forever.

[–] [email protected] 15 points 5 months ago* (last edited 5 months ago)

Early Internet, yes. But then there's the middle Internet (or the high Internet, if you like, as in the High Middle Ages), which was in large part scraped by archive.org, and people in both eras generally still knew about offline backups. Then there's the late Internet, which moved to siloed services while most of the people using it were, and are, oblivious to preserving data elsewhere. That's the worst one.

[–] [email protected] 5 points 5 months ago

My partner works in historical archiving for science and medicine. Museum work, basically. He's told me so much of the archives are donated collections of notes, letters, journals, and so on from important doctors, researchers, scientists, etc. Donated by the subject themselves in their later years or by their families.

He's told me there is a growing issue with those people starting to donate entirely digital collections. Even worse are all the documents that are stored not on a physical hard drive but on web services and clouds. By the time these people are willing to start donating their things, so much of it has been deleted forever without them realizing it. Or worse, they die, and their families no longer have access.

Working in IT, I told him about Microsoft's growing push to eliminate Outlook and PST files, make it all web based email, and he wasn't surprised, but he was still bummed to hear it. Apparently a not insignificant amount of those donations are locally stored emails.

[–] [email protected] 33 points 5 months ago (2 children)

Because those pages had information that wasn't on the new pages?

Just from my own experience, WotC migrated the Magic: The Gathering site to a new one, and while some articles were brought over, a whole lot of stories, strategies and event coverage were lost or are only available thanks to Archive.org.

[–] [email protected] 24 points 5 months ago

I ran across software once that wouldn't compile properly and the only documentation available was an archive.org hosted backup of an Intel help page that no longer exists. There is no alternative, Intel just removed it entirely.

[–] [email protected] 1 points 5 months ago

Yes. The whole post is a trick with statistics. Web pages have a limited lifespan. You can do the same trick with human life spans.

"50% of humans who were alive 60 years ago are now dead." You would tweak the numbers to be factual, but something like that makes sense to me.

If you only keep the samples you started out with, of course it's going to decline over time. The data is guaranteed to not grow since nothing is ever added.
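To make that concrete, here's a rough sketch in Python. Every number in it is made up for illustration (the 5% yearly loss rate, the 200 new pages per year, the starting 1000 pages), and `cohort_vs_total` is just a name I picked. It tracks the original set of pages separately from the total number of live pages:

```python
def cohort_vs_total(initial_pages, annual_loss_rate, annual_new_pages, years):
    """Track a fixed starting cohort next to the total live page count.

    All inputs are hypothetical; this is a toy model, not real web data.
    """
    cohort = float(initial_pages)  # pages that existed at year 0
    total = float(initial_pages)   # all live pages, old and new
    history = []
    for year in range(1, years + 1):
        # The original cohort can only shrink: nothing is ever added to it.
        cohort *= 1 - annual_loss_rate
        # The total loses pages at the same rate but also gains new ones.
        total = total * (1 - annual_loss_rate) + annual_new_pages
        history.append((year, cohort / initial_pages, total))
    return history


for year, surviving_fraction, total in cohort_vs_total(1000, 0.05, 200, 10):
    print(f"year {year:2d}: {surviving_fraction:.0%} of the original cohort left, "
          f"{total:.0f} pages live in total")
```

With these assumed numbers the original cohort shrinks every year while the total keeps growing, which is the point: a "share of old pages that vanished" figure says nothing by itself about whether the web as a whole got bigger or smaller.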

[–] [email protected] 17 points 5 months ago (1 children)

I bet a lot of pages were also added in the same time frame, very likely orders of magnitude more.

No. What you'd make a page for in the 00s, you'd create a FB group or something for in the 10s. Hostage to corps, and probably removed too, for whatever reason.

[–] [email protected] 5 points 5 months ago

And not indexed, due to decisions made by crawler bots.

[–] [email protected] 7 points 5 months ago

Sure, a lot of ad-infested pages replaced human-produced content.

[–] [email protected] -3 points 5 months ago

And those added pages were probably just as worthless as the ones they replaced.