The title is imprecise: it's Archiveteam.org, not Archive.org. The Internet Archive is providing free hosting, but the archival work was done by ArchiveTeam members.
Related. Others?
Enlisting in the Fight Against Link Rot - https://news.ycombinator.com/item?id=44877021 - Aug 2025 (107 comments)
Google shifts goo.gl policy: Inactive links deactivated, active links preserved - https://news.ycombinator.com/item?id=44759918 - Aug 2025 (190 comments)
Google's shortened goo.gl links will stop working next month - https://news.ycombinator.com/item?id=44683481 - July 2025 (222 comments)
Google URL Shortener links will no longer be available - https://news.ycombinator.com/item?id=40998549 - July 2024 (49 comments)
Ask HN: Google is sunsetting goo.gl on 3/30. What will be your URL shortener? - https://news.ycombinator.com/item?id=19385433 - March 2019 (14 comments)
Tell HN: Goo.gl (Google link Shortener) is shutting down - https://news.ycombinator.com/item?id=16902752 - April 2018 (45 comments)
Google is shutting down its goo.gl URL shortening service - https://news.ycombinator.com/item?id=16722817 - March 2018 (56 comments)
Transitioning Google URL Shortener to Firebase Dynamic Links - https://news.ycombinator.com/item?id=16719272 - March 2018 (53 comments)
Can we build a blockchain/P2P-based web crawler that can create snapshots of the entire web with high integrity (peer verification)? Already-crawled pages would be exchanged between peers via bulk transfer. This would mean there is an "official" source of all web data, and LLM people could use snapshots of it. It would hopefully reduce the number of ill-behaved crawlers, so over time we'd see fewer draconian anti-bot measures on websites, in turn making the web easier to crawl. Does something like this exist? It would be so awesome. It would also let people run a search engine at home.
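A minimal sketch of the peer-verification idea as I read it (my own interpretation, not an existing system): independent crawlers fetch the same URL, each reports a content hash, and a snapshot is accepted only when enough peers agree.

```python
import hashlib

def content_digest(body: bytes) -> str:
    # Hash the raw response body. A real system would need to canonicalize
    # dynamic content first (timestamps, ads, CSRF tokens), or byte-identical
    # agreement between peers would almost never happen.
    return hashlib.sha256(body).hexdigest()

def verify(url: str, reports: dict[str, str], quorum: int = 2) -> bool:
    # reports maps peer_id -> digest that peer computed for `url`.
    # Accept the snapshot only if at least `quorum` peers agree on one digest.
    counts: dict[str, int] = {}
    for digest in reports.values():
        counts[digest] = counts.get(digest, 0) + 1
    return max(counts.values(), default=0) >= quorum

reports = {
    "peer-a": content_digest(b"<html>hello</html>"),
    "peer-b": content_digest(b"<html>hello</html>"),
    "peer-c": content_digest(b"<html>tampered</html>"),
}
print(verify("https://example.com/", reports))  # True: two peers agree
```

The hard part in practice is the canonicalization step, not the quorum logic: most modern pages are not byte-stable across fetches.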
Recent update from Google: https://blog.google/technology/developers/googl-link-shorten...
I don't understand the page: it shows a list of data sets (I think?) up to 91 TiB in size.
The list of short links and their target URLs can't be 91 TiB, can it? Does anyone know how this works?
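One plausible explanation (an assumption on my part, not something the page states): ArchiveTeam projects typically store full WARC records of each HTTP response, headers and all, rather than bare short-URL/target pairs, so a single archived redirect can easily run to a kibibyte or more. A rough back-of-the-envelope:

```python
# How many redirect records would fit in 91 TiB, assuming
# (hypothetically) an average of ~1 KiB per archived WARC record?
TIB = 1024 ** 4          # bytes in a tebibyte
archive_size = 91 * TIB  # reported data set size

bytes_per_record = 1024  # assumed average record size -- a guess
records = archive_size // bytes_per_record

print(f"{records:,} records")  # roughly 100 billion at 1 KiB each
```

Under that assumption the size is consistent with tens of billions of archived links, which is the right order of magnitude for a short-link namespace.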
Is there anyone archiving all of reddit? Or twitter? I mean even if their terms have changed to not allow it.
Does "all" mean all the URLs publicly known, or did they exhaustively iterate the entire URL namespace?
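For scale, if the short codes are case-sensitive alphanumeric (base 62, which is what goo.gl codes look like, though the exact namespace is an assumption here), the keyspace grows fast with code length, so exhaustive iteration is only borderline feasible:

```python
import string

# Assumed alphabet: a-z, A-Z, 0-9 (62 symbols) -- goo.gl codes appear
# to use this, but the true namespace is an assumption on my part.
ALPHABET = string.ascii_letters + string.digits

for length in range(4, 8):
    count = len(ALPHABET) ** length
    print(f"length {length}: {count:,} possible codes")
# length 6 alone is ~56.8 billion codes; length 7 is ~3.5 trillion.
```

Brute-forcing everything up to length 6 is conceivable for a distributed project; beyond that, you'd almost certainly need the codes to be publicly known first.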
Glad I contributed to this in some small way.
Happy to have contributed a hundred thousand links by running their Docker container!
I wonder how many of them lead to private YouTube videos, Google documents, etc.
Google said they would keep hosting any recently-clicked link; does this mean that all the links are now recently-clicked?
Why? Did they ask anyone if it was okay? Anything sensitive at those links? Anything at those links people didn't want or need anymore? Maybe people thought those links were dead? Did Google provide a way to cancel those links first?
It's like when the GPT links were archived and publicly available that contained sensitive information.
Hell yeah!!! Fantastic work, everyone!
GameFAQs remains unarchived.
Ok how do I access them, or is that not the point?
Excellent! ArchiveTeam have always been impressive this way. Some years ago, I was working at a video platform that had just announced it would be shutting down fairly soon. I forget how, but one way or another I got connected with someone at ArchiveTeam who expressed their interest in archiving it all before it was too late. Believing this to be a good idea, I gave them a couple of tips about where some of our device-sniffing server endpoints were likely to give them a little trouble, and temporarily "donated" a couple EC2 instances to them to put towards their archiving tasks.
Since the servers were mine, I could see what was happening, and I was very impressed. Within I want to say two minutes, the instances had been fully provisioned and were actively archiving videos as fast as was possible, fully saturating the connection, with each instance knowing to only grab videos the other instances had not already gotten. Basically they have always struck me as not only having a solid mission, but also being ultra-efficient in how they carry it out.