Messing with Scraper Bots
What about using zip bombs?
Neat! Most of the offensive scrapers I met try and exploit WordPress sites (hence the focus on PHP). They don't want to see php files, but their outputs.
What you have here is quite close to a honeypot, sadly I don't see an easy way to counter-abuse such bots. If the attack is not following their script, they move on.
Interesting! It's nice to see people are experimenting with these, and I wonder if this kind of junk data generators will become its own product. Or maybe at least a feature/integration in existing software. I could see it going there.
Hm.. why not using dumbed down small, self-hosted LLM networks to feet the big scrapers with bullshit?
I'd sacrifice two CPU cores for this just to make their life awful.
[dead]
If you control your own Apache server and just want to shortcut to "go away" instead of feeding scrapers, the RewriteEngine is your friend, for example:
Notes: there's no PHP on my servers, so if someone asks for it, they are one of the "bad boys" IMHO. Your mileage may differ.