Just a thought - if you design a system to block AI crawlers, then rather than booting them off, serve them crypto-mining JavaScript instead. It would be very funny.
Hmm, how would you convince the crawler to run your code on its home system, rather than just scraping data?
Isn’t that what Anubis was doing? Making it run code so it wasn’t worthwhile, but people adjusted AI crawlers to run code?
“Proof of work”. The AI crawlers don’t run JavaScript (yet, I don’t think), so it’s basically a firewall to them.
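For anyone wondering what “proof of work” means in practice, here is a minimal hashcash-style sketch in Python. It is not Anubis's actual code: the function names, the challenge string, and the "leading zero hex digits" difficulty rule are all assumptions for illustration. The point is that the expensive loop (`solve`) runs on the visitor's machine, while the server only pays for one cheap hash in `verify`.

```python
import hashlib
from itertools import count


def solve(challenge: str, difficulty: int) -> int:
    """Client side (hypothetical): brute-force a nonce whose SHA-256 digest
    starts with `difficulty` zero hex digits. This is the costly part."""
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side (hypothetical): a single hash confirms the work was done."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    # "example-challenge" is a made-up value; a real server would issue a
    # fresh random challenge per visitor.
    nonce = solve("example-challenge", difficulty=3)
    print(nonce, verify("example-challenge", nonce, difficulty=3))
```

A client that doesn't execute the challenge script never produces a valid nonce, which is why this acts like a wall for crawlers that only fetch HTML. Each extra zero digit multiplies the expected client work by 16.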
I’m fairly sure Anubis was made because some crawlers did run JavaScript
Some can, from what I understood
And not only JS but other code too like SQL
I remember the somewhat recent case where someone vibecoded something and the AI wiped the database
That’s a local AI agent not an online crawler
There’s a functional difference between forcing a crawler to interact with code on your server that wastes its time, and getting it to download your code and run it on its own server - the issue being where the actual CPU/GPU/APU cycles happen. If they happen on your server then it’s not benefiting you at all, it’s costing you the same amount as just running the cryptominer directly would.
Any halfway intelligent administrator would never allow an automated routine to download and run arbitrary code on their own system, it would be a massive security risk.
My understanding of Anubis is that it just leads the crawler into a never-ending cycle of URLs that just lead to more URLs while containing no information of any value. The code that does this is still installed and running on your server, and is just serving bogus links to the crawler.
That’s not how Anubis works. You’re likely thinking of Nepenthes
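To make the difference concrete, the "never-ending cycle of URLs" idea (a tarpit, which is the Nepenthes-style approach) can be roughly sketched like this in Python. This is an illustration only, not Nepenthes's actual code: `tarpit_page`, the `/maze/` prefix, and the hashing scheme are made up. Every page is generated on the fly from the requested path, so the link graph is effectively endless and nothing needs to be stored.

```python
import hashlib


def tarpit_page(path: str, links: int = 5) -> str:
    """Build an HTML fragment of links that lead only to more generated pages.

    The link targets are derived from the requested path, so the same URL
    always yields the same page, and every page points deeper into the maze.
    """
    items = []
    for i in range(links):
        # Derive a stable pseudo-random token from the path and link index.
        token = hashlib.sha256(f"{path}#{i}".encode()).hexdigest()[:12]
        items.append(f'<li><a href="/maze/{token}">{token}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    print(tarpit_page("/maze/start"))
```

Note that this code runs entirely on the site's own server, which is the point made above: a tarpit wastes the crawler's time and bandwidth, but it doesn't get any computation done on the crawler's hardware.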
“would never allow an automated routine to download arbitrary code”: JavaScript and WASM are the leading tech for doing exactly this. Make those essential for loading content, and bypassing them would take bespoke solutions depending on the framework and implementation.
Maybe design kind of a captcha task for them?
https://neal.fun/not-a-robot/
Apparently I have no idea what a vegetable is
I think Neal has no idea
Yeah, I quit the stupid game when I correctly selected all the vegetables and it told me I was wrong
If you selected tomatoes, that is a fruit.
I am aware
If you install a captcha as part of your web server, that code is running on your server.
The crawler interacting with the captcha on your server will not result in cryptominer code running on its server.
Something on the crawler’s server would need to accept a download of the cryptominer code and then run that code.
True, but it’s more about solving the captcha, as in finding its solution. However, there is no solution, only a never-ending calculation task (the mining, which the crawler will need to do). Of course this is highly hypothetical, as I do not know anything about cryptomining (and I also don’t want to know more about it).
Without getting into the technical details, the main cost offset of running a cryptominer is the electricity used. If the crawler performs cryptominer calculations on your server it will be of no benefit to you, because you will still have to pay the electricity bill, and really it’s not the crawler doing the calculations, it’s your own server hardware.
If it’s keeping the crawlers at bay at the same time, though, couldn’t the differential brought in by the mining represent a cost savings? This question is breaking my brain, maybe I’m not thinking about it properly.