So here's another couple of ideas. RHDN can use one of those bot-screening authentication systems. You know, the ones that show you a picture and make you type in the right info about it before you can access the download?
I have been thinking about a better long-term solution of our own. We don't really need to block all bots. It's far easier to stop these malware-scanning bots than it is to block, say, spambots (or even worse, paid spam workers). Since neither Google nor the other scanning services would ever bother to implement custom code for our site, they could potentially be defeated by a page that shows a generated password and asks you to copy and paste it into a box right next to it. I doubt Google would ever add several lines of code specifically to automate this on RHDN, just to scan our files...
So, I think we can roll our own simple solution with minimal resource usage and minimal inconvenience. This would need to be done in conjunction with accounts and sessions: logged-in people would be exempt from the test, and people who are not logged in would have to pass it only once per browser session.
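To make the idea concrete, here is a rough sketch in Python. It is for illustration only and says nothing about what the site actually runs. The `session` dict stands in for whatever per-browser session store would really be used, and `new_code`, `may_download`, and `CODE_LENGTH` are made-up names.

```python
import secrets
from typing import Optional

CODE_LENGTH = 6  # hypothetical length of the code shown on the page


def new_code() -> str:
    """Generate a short random code to display next to the input box."""
    return secrets.token_hex(CODE_LENGTH // 2)  # 3 bytes -> 6 hex characters


def may_download(session: dict, logged_in: bool,
                 submitted: Optional[str] = None) -> bool:
    """Decide whether the current visitor may fetch the file.

    `session` is a stand-in for the real session store (assumption).
    Logged-in users and sessions that already passed are let through.
    Otherwise the pasted code must match the one issued to this session.
    """
    if logged_in or session.get("passed"):
        return True

    expected = session.get("code")
    if expected and submitted is not None:
        # Compare as bytes so arbitrary user input cannot raise an error.
        if secrets.compare_digest(submitted.strip().encode(), expected.encode()):
            session["passed"] = True   # only asked once per browser session
            session.pop("code", None)
            return True

    # No code yet, or a wrong answer: issue a fresh code for the page to show.
    session["code"] = new_code()
    return False
```

When `may_download` returns False, the download page would display `session["code"]` next to a text box and call the function again with whatever the visitor pasted in.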
Anyway, I will continue to explore this avenue when time allows.
There is nothing we can do about Google. Here at RHDN, however, there is a solution that is just as effective as the one currently in place, and simpler: http://www.captcha.net/
Oh, the irony! The reCAPTCHA project is a Google project, you know. Although they currently claim to respect whether or not you choose to allow googlebots past it, I'm not sure that applies to their safe browsing bots. Regardless, they certainly have the ability to bypass it if and when they want.