flydev Posted January 27, 2018 (edited)

Presentation

Originally developed by Jeff Starr, Blackhole is a security plugin that traps bad bots, crawlers and spiders in a virtual black hole. Once the bots (or any virtual user!) visit the black hole page, they are blocked and denied access to your entire site. This helps keep nonsense spammers, scrapers, scanners, and other malicious hacking tools away from your site, so you can save precious server resources and bandwidth for your good visitors.

How It Works

You add a rule to your robots.txt that instructs bots to stay away. Good bots will obey the rule, but bad bots will ignore it and follow the link... right into the black hole trap. Once trapped, bad bots are blocked and denied access to your entire site.

The main benefits of Blackhole include:

Quote
- Stops leeches, scanners, and spammers
- Saves server resources for humans and good bots
- Improves traffic quality and overall site security
Bots have one chance to obey your site’s robots.txt rules. Failure to comply results in immediate banishment.

Features

- Disable Blackhole for logged-in users
- Optionally redirect all logged-in users
- Send alert email message
- Customize email message
- Choose a custom warning message for bad bots
- Show WHOIS lookup information
- Choose a custom blocked message for bad bots
- Choose a custom HTTP status code for blocked bots
- Choose which bots are whitelisted or not

Instructions

1. Install the module.
2. Create a new page and assign the template "blackhole" to it.
3. Create a new template file "blackhole.php" and call the module: $modules->get('Blackhole')->blackhole();
4. Add the rule to your robots.txt.
5. Call the module from your home.php template: $modules->get('Blackhole')->blackhole();

Bye bye bad bots!

Downloads

https://github.com/flydev-fr/Blackhole
http://modules.processwire.com/modules/blackhole/

Enjoy

Edited March 20, 2018 by flydev: module directory link
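Step 4 of the instructions does not spell out the robots.txt rule. A minimal sketch, assuming the page created in step 2 lives at /blackhole/ (an assumption; adjust the path to wherever your own page actually is):

```
User-agent: *
Disallow: /blackhole/
```

Well-behaved crawlers skip the disallowed path, so only clients that ignore robots.txt should ever reach the trap page.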
szabesz Posted January 27, 2018
"Sounds" useful. Thanks for sharing and caring!

Juergen Posted January 27, 2018
20 hours ago, flydev said:
    https://github.com/flydev-fr/Blackhole
This link under the download section leads to this ProcessWire page and not to GitHub, because the href value is empty.

Juergen Posted January 28, 2018
Hi @flydev
I always get these warnings with Tracy:
Best regards
dragan Posted January 28, 2018
Nice module, thanks for sharing. I wonder, though, how effective it really is after reading the last two sections, "caveat emptor" and "blackhole whitelist": https://perishablepress.com/blackhole-bad-bots/#blackhole-whitelist

Quote
Whitelisting these user agents ensures that anything claiming to be a major search engine is allowed open access. The downside is that user-agent strings are easily spoofed, so a bad bot could crawl along and say, “Hey look, I’m teh Googlebot!” and the whitelist would grant access. It is possible to verify the true identity of each bot, but doing so consumes significant resources and could overload the server. Avoiding that scenario, the Blackhole errs on the side of caution: it’s better to allow a few spoofs than to block any of the major search engines.
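For reference, the "verify the true identity" step mentioned in the quote is usually done with a forward-confirmed reverse DNS lookup. The sketch below is not part of the Blackhole module; the function name and the list of trusted hostname suffixes are assumptions made for illustration, and it only handles IPv4. It also shows where the cost comes from: two DNS lookups per check.

```php
<?php
// Sketch only: forward-confirmed reverse DNS check for a client claiming to be Googlebot.
// isVerifiedGooglebot() and the suffix list are illustrative assumptions.
function isVerifiedGooglebot(string $ip, ?callable $reverse = null, ?callable $forward = null): bool
{
    // The resolvers are injectable so the logic can be exercised without network access.
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbynamel';

    // Step 1: reverse lookup. gethostbyaddr() returns the unmodified IP on failure
    // and false on malformed input.
    $host = $reverse($ip);
    if (!is_string($host) || $host === $ip) {
        return false;
    }

    // Step 2: the hostname must end in a trusted suffix.
    $host = strtolower(rtrim($host, '.'));
    $trusted = false;
    foreach (['.googlebot.com', '.google.com'] as $suffix) {
        if (substr($host, -strlen($suffix)) === $suffix) {
            $trusted = true;
            break;
        }
    }
    if (!$trusted) {
        return false;
    }

    // Step 3: forward lookup must map the hostname back to the same IP.
    $ips = $forward($host);
    return is_array($ips) && in_array($ip, $ips, true);
}
```

Caching the result per IP would reduce the overhead, but that is a design choice outside the scope of this sketch.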
horst Posted January 29, 2018
To get a "quote" on how useful it may be for a specific site, log all (search bot) user agents for a while.

flydev Posted January 30, 2018 (Author)
@dragan As @horst said, you can check your logs for bots. There is a small tip in the module admin for that:
@Juergen, what does <?php echo ini_get('allow_url_fopen'); ?> say? Sorry, I don't understand what the warning says in German.
Juergen Posted January 30, 2018
@flydev It means "The waiting time for the connection has expired".

flydev Posted January 30, 2018 (Author)
OK, thanks. Then it is probably a firewall issue. Which type of web hosting are you trying the module on?

Juergen Posted January 30, 2018
It's a shared host.
9 hours ago, flydev said:
    what say <?php echo ini_get('allow_url_fopen'); ?> ?
It says true. For the moment I have disabled this module because the loading time of the page increases significantly.

flydev Posted January 30, 2018 (Author)
1 hour ago, Juergen said:
    For the moment I have disabled this module because the loading time of the page increases significantly.
You can disable the WHOIS lookup in the module's config.
Juergen Posted March 16, 2018
I have installed it again, but this time I have included the module only in blackhole.php (not on the home page or any other page), just to see if it works. It works now, but the loading time of the page is approx. 21 seconds!!!! I have added a hidden link on my site to blackhole.php, and if I click on it my IP is stored in the DAT file. That works well.
In the mail that I got afterwards there was a hint about a port problem: Whois Lookup: Timed-out connecting to $server (port 43). I am on a shared host, so it seems that this port is not open. The strange thing is that I have disabled the WHOIS lookup in the module's settings.
Best regards
Jürgen

flydev Posted March 16, 2018 (Author)
Thank you @Juergen. About port 43: it's common for this port to be blocked by default and, depending on the hosting provider, it can be configured through the panel provided.
59 minutes ago, Juergen said:
    The strange thing is that I have disabled the Who is Lookup in my settings of the module
I will look at it this afternoon, as I am deploying this module on a production site. Stay tuned, and thanks again mate.
flydev Posted March 20, 2018 (Author)
Module updated to version 1.0.2. The WHOIS information request is now triggered according to the module's option. Thanks for the bug report @Juergen

Juergen Posted March 20, 2018 (edited)
Works like a charm now! It would be great if the hard-coded URL of the "contact the administrator" page could be selected from the PW pages. Thanks for the update!!!
Edit: It would be better if you added multilanguage support to the custom message textareas.
Edited March 20, 2018 by Juergen

flydev Posted March 20, 2018 (Author)
2 hours ago, Juergen said:
    It would be better if you add multilanguage support to the custom message textareas
I will try to do it; I have never worked with modules and multilanguage.

Juergen Posted March 20, 2018
3 hours ago, flydev said:
    I will try to do it, I never played with modules and multilanguage
It's not so important, because only bad bots will see it and probably no humans (I hope so). By the way, 2 bots from China were caught in the trap. It works!!!
flydev Posted March 21, 2018 (Author)
Good and funny!
13 hours ago, Juergen said:
    because only bad bots will see it and probably no humans
For example, the site I deployed the module on is a custom dashboard with sensitive information, so I had to guard against hand-crafted requests that could retrieve data belonging to other users. When this behavior is detected, the user is logged out, the role login-disabled is assigned, and the user is then redirected into the blackhole to be banned.

    public function SecureParks() {
        // Only act when a "park" value was posted.
        if ($this->input->post->park) {
            // The sanitized value is split on dashes; the third segment is used as the park ID.
            $ids = explode('-', $this->sanitizer->pageName($this->input->post->park));
            // getParkRoles() and searchForParkId() are helper methods of the same class (not shown here).
            $userroles = $this->getParkRoles();
            // "?? null" avoids an undefined-index notice when the value has fewer than three segments.
            $userhaveright = $this->searchForParkId($ids[2] ?? null, $userroles);
            if ($userhaveright === null) {
                // No matching right: disable the login, log the user out and send them to the trap.
                $this->user->addRole('login-disabled');
                $this->user->save();
                $this->session->logout();
                $this->session->redirect($this->pages->get('/blackhole/')->url); // :)
            }
        }
    }
Juergen Posted March 31, 2018
Just a thought: I think it would be nice to also store the banned IPs in a log file, so you have them in one place with the other logs. E.g.: $log->save('blackhole', 'Banned IP')
You could also add, e.g., a checkbox in the module settings to enable or disable this feature. What do you think? Might this be useful for others too?

flydev Posted March 31, 2018 (Author)
Hi @Juergen
I completely agree. Even better, there will be a Process module to manage/view the blackhole data.
flydev Posted March 31, 2018 (Author)
I was also thinking of adding a new feature that would monitor 302/404 HTTP codes and redirect the "guest" into the blackhole. For example, all these attempts:
    /phpMyAdmin/scripts/_setup.php
    /w00tw00t.at.ISC.SANS.DFind:)
    /blog/wp-login.php
    /wp-login.php
    etc.
will be banned. I still don't know whether I should code the whole feature myself or hook into Jumplinks from @Mike Rockett.

Juergen Posted March 31, 2018
3 minutes ago, flydev said:
    /blog/wp-login.php /wp-login.php
I also have a lot of these requests in my 404 logger. I think if there is a module that can handle it, use it. Check first whether the module is installed. If not, output a message that this feature is only available if Jumplinks is installed. I don't have Jumplinks installed and I don't know how well it works, but before starting to code from scratch I would try to use an existing solution first.
horst Posted March 31, 2018 (edited)
I use Redirect gone ... in .htaccess for all that stuff:
    Redirect gone /wp-login.php
(First I log 404s for a period, then I add those candidates to the .htaccess, before ProcessWire's entries!!)
I think it is better not to invoke PW for this stuff (less overhead on the server!) and to use Apache custom error page(s) instead. 47ms is fast!
PS: 410 is better than 404, as I also use this for search engine requests that try to reach URLs that have not existed for 10 years or so. Normally the SEs should flush their cache on 410 returns.
Edited March 31, 2018 by horst
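As an illustration of the approach described above, such a block might look like the following sketch. The paths are the ones mentioned earlier in this thread; the error page location is a hypothetical example. Note that Apache only treats # as a comment at the start of a line.

```
# Place these lines above the ProcessWire section of .htaccess
Redirect gone /wp-login.php
Redirect gone /blog/wp-login.php
Redirect gone /phpMyAdmin/scripts/_setup.php

# Hypothetical static page served for 410 responses
ErrorDocument 410 /errors/410.html
```

Because mod_alias answers these requests directly, ProcessWire is never bootstrapped for them.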
Mike Rockett Posted March 31, 2018
In all honesty, I think that Jumplinks is better suited to site migrations. Black holes should either be covered by a specifically built module, or by htaccess/vhost config...

flydev Posted April 1, 2018 (Author)
OK guys, I get what you mean, so what about a module with this flow?
- monitor and log HTTP error codes for a period
- if the number of hits for an entry/request is greater than N, then:
    - back up the .htaccess file (versioning it)
    - add new entries to the .htaccess file
Does that make sense, or should I let users manage their .htaccess file manually, with a FAQ or something?
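The flow proposed above could be sketched roughly as follows. This is not code from the Blackhole module: both function names, the backup naming scheme and the path filter are assumptions made for illustration, and the list of logged paths is expected to come from whatever 404 logger is in use.

```php
<?php
// Sketch of the proposed flow; all names here are hypothetical.

/**
 * Build "Redirect gone" rules for every logged path requested more than $threshold times.
 */
function buildGoneRules(array $loggedPaths, int $threshold): array
{
    $rules = [];
    foreach (array_count_values($loggedPaths) as $path => $count) {
        // Accept only plain URL paths so that whitespace or odd characters never reach .htaccess.
        if ($count > $threshold && preg_match('#^/[A-Za-z0-9_./-]*\z#', (string) $path)) {
            $rules[] = 'Redirect gone ' . $path;
        }
    }
    return $rules;
}

/**
 * Back up the .htaccess file, then prepend the new rules so they sit before the existing entries.
 * Returns the backup path, or null if nothing was added or a file operation failed.
 */
function prependRules(string $htaccess, array $rules): ?string
{
    if (!is_file($htaccess)) {
        return null;
    }
    $current = file_get_contents($htaccess);
    if ($current === false) {
        return null;
    }

    // Skip rules that are already present in the file.
    $existing = array_map('trim', explode("\n", $current));
    $new = array_values(array_diff($rules, $existing));
    if ($new === []) {
        return null;
    }

    // Versioned backup: one timestamped copy per change.
    $backup = $htaccess . '.' . date('Ymd-His') . '.bak';
    if (!copy($htaccess, $backup)) {
        return null;
    }

    $written = file_put_contents($htaccess, implode("\n", $new) . "\n" . $current, LOCK_EX);
    return $written === false ? null : $backup;
}
```

Usage would be along the lines of prependRules('/path/to/.htaccess', buildGoneRules($paths, 10)). Rejecting unusual paths rather than escaping them keeps the sketch simple; entries such as /w00tw00t.at.ISC.SANS.DFind:) would need to be added by hand.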