Presentation
Originally developed by Jeff Starr, Blackhole is a security plugin that traps bad bots, crawlers, and spiders in a virtual black hole.
Once the bots (or any virtual user!) visit the black hole page, they are blocked and denied access to your entire site.
This helps to keep spammers, scrapers, scanners, and other malicious hacking tools away from your site, so you can save precious server resources and bandwidth for your good visitors.

 

How It Works
You add a rule to your robots.txt that instructs bots to stay away. Good bots will obey the rule, but bad bots will ignore it and follow the link... right into the black hole trap. Once trapped, bad bots are blocked and denied access to your entire site.
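
For reference, the rule could look like the following. This is only a sketch and assumes the trap page lives at /blackhole/ (the path used later in this thread), so adjust it to the URL of your own blackhole page:

```text
User-agent: *
Disallow: /blackhole/
```

Well-behaved crawlers skip that URL. Anything that requests it anyway has ignored the rule and ends up in the trap.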


The main benefits of Blackhole include:

  • Stops leeches, scanners, and spammers
  • Saves server resources for humans and good bots
  • Improves traffic quality and overall site security

Bots have one chance to obey your site’s robots.txt rules. Failure to comply results in immediate banishment.

 

Features

  • Disable Blackhole for logged-in users
  • Optionally redirect all logged-in users
  • Send alert email message
  • Customize email message
  • Choose a custom warning message for bad bots
  • Show WHOIS lookup information
  • Choose a custom blocked message for bad bots
  • Choose a custom HTTP Status Code for blocked bots
  • Choose which bots are whitelisted or not

 
Instructions

  1. Install the module
  2. Create a new page and assign it the template "blackhole"
  3. Create a new template file "blackhole.php" and call the module from it: $modules->get('Blackhole')->blackhole();
  4. Add the rule to your robots.txt
  5. Call the module from your home.php template as well: $modules->get('Blackhole')->blackhole();

 Bye bye bad bots!



 Enjoy :neckbeard:

Edited by flydev
module directory link

"Sounds" useful :) Thanks for sharing and caring!


Nice module, thanks for sharing.

I wonder, though, how effective it really is after reading the last two sections, "caveat emptor" and "blackhole whitelist":

https://perishablepress.com/blackhole-bad-bots/#blackhole-whitelist

Quote

Whitelisting these user agents ensures that anything claiming to be a major search engine is allowed open access. The downside is that user-agent strings are easily spoofed, so a bad bot could crawl along and say, “Hey look, I’m teh Googlebot!” and the whitelist would grant access. It is possible to verify the true identity of each bot, but doing so consumes significant resources and could overload the server. Avoiding that scenario, the Blackhole errs on the side of caution: it’s better to allow a few spoofs than to block any of the major search engines.
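
To illustrate the verification the quote mentions, here is a minimal sketch of a forward-confirmed reverse DNS check. It is not part of the Blackhole module. The function name isVerifiedCrawler() and the injectable resolver parameters are made up for this example, and the hostname suffixes you pass in (for example googlebot.com and google.com for Googlebot) should be taken from the search engine's own documentation. Each check costs two DNS lookups, which is the overhead the quote warns about, and gethostbyname() only handles IPv4.

```php
<?php
/**
 * Forward-confirmed reverse DNS check (illustrative sketch, not the module's code).
 * $reverse and $forward default to PHP's DNS functions; they are injectable so
 * the logic can be exercised without network access.
 */
function isVerifiedCrawler(string $ip, array $allowedSuffixes, ?callable $reverse = null, ?callable $forward = null): bool
{
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbyname';

    // Step 1: reverse lookup. gethostbyaddr() returns the IP unchanged on
    // failure and false on malformed input.
    $host = $reverse($ip);
    if ($host === false || $host === $ip) {
        return false;
    }
    $host = strtolower(rtrim($host, '.'));

    // Step 2: the hostname must end with one of the allowed domain suffixes.
    $matches = false;
    foreach ($allowedSuffixes as $suffix) {
        $suffix = '.' . ltrim(strtolower($suffix), '.');
        if (substr($host, -strlen($suffix)) === $suffix) {
            $matches = true;
            break;
        }
    }
    if (!$matches) {
        return false;
    }

    // Step 3: the forward lookup of that hostname must map back to the same IP.
    return $forward($host) === $ip;
}
```

A user agent that merely claims to be Googlebot fails step 1 or step 2, because its IP does not resolve to a hostname under the allowed domains. Caching the result per IP would keep the DNS cost down.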

 


To get a sense of how useful it may be for a specific site, log all (search bot) user agents for a while.
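
One way to evaluate such a log, as a rough sketch: count how often each user agent appears. The helper name tallyUserAgents() is made up, and the code assumes the Apache "combined" log format, where the user agent is the last double-quoted field of each line.

```php
<?php
/**
 * Counts user agents in access log lines (sketch, assumes Apache combined format).
 * Returns an array of user agent => hit count, most frequent first.
 */
function tallyUserAgents(array $logLines): array
{
    $counts = [];
    foreach ($logLines as $line) {
        // In the combined format the user agent is the last quoted field.
        if (preg_match('/"([^"]*)"\s*$/', $line, $m)) {
            $ua = $m[1] === '' ? '(empty)' : $m[1];
            $counts[$ua] = ($counts[$ua] ?? 0) + 1;
        }
    }
    arsort($counts); // sort by count, descending, keeping the user agent keys
    return $counts;
}
```

For a real log you could feed it something like file('/path/to/access.log', FILE_IGNORE_NEW_LINES), where the path depends on your host.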


@dragan  As @horst said, you can check your logs for bots. There is a small tip in the module admin for that:

 

[Image: help.thumb.png]

 

@Juergen, what does <?php echo ini_get('allow_url_fopen'); ?> say?

Sorry, I don't understand what the German warning says :lol:


OK thanks, then it is probably a firewall issue. What type of web hosting are you trying the module on?


It's a shared host.

9 hours ago, flydev said:

what does <?php echo ini_get('allow_url_fopen'); ?> say?

It says true.

For the moment I have disabled this module because the loading time of the page increases significantly.

1 hour ago, Juergen said:

For the moment I have disabled this module because the loading time of the page increases significantly.

You can disable the WHOIS lookup in the module's config.

 

[Image: whois.thumb.png]


I have installed it again, but this time I included the module only in blackhole.php (not in the home or any other template), just to see if it works. It works now, but the page takes approx. 21 seconds to load!!!!

I have added a hidden link on my site to blackhole.php, and if I click on it my IP is stored in the DAT file - works well. In the mail that I got afterwards there was a hint about a port problem:

Whois Lookup:

Timed-out connecting to $server (port 43).

I am on a shared host, so it seems that this port is not open. The strange thing is that I have disabled the WHOIS lookup in the module settings.

[Image: Screenshot(8).png]

Best regards Jürgen


Thank you @Juergen.

About port 43: it's common for this port to be blocked by default and, depending on the hosting provider, it can be opened through the provided control panel.
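
To check from PHP whether outgoing connections on port 43 are possible at all, a small test like the following may help. It is only a sketch: canConnect() is a made-up helper, and whois.iana.org is just an example WHOIS server, not necessarily the one the module queries.

```php
<?php
/**
 * Returns true if a TCP connection to $host:$port can be opened
 * within $timeout seconds (illustrative helper, not part of the module).
 */
function canConnect(string $host, int $port, float $timeout = 3.0): bool
{
    $errno = 0;
    $errstr = '';
    // @ suppresses the warning fsockopen() raises when the connection fails.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

// Example call (the server name is an assumption, see above):
// var_dump(canConnect('whois.iana.org', 43));
```

If this returns false on the shared host but true on a local machine, the host is most likely filtering outgoing traffic on that port.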

59 minutes ago, Juergen said:

The strange thing is that I have disabled the WHOIS lookup in the module settings

I will look at it this afternoon, as I am deploying this module on a production site. Stay tuned, thanks again mate.


Module updated to version 1.0.2.

  • The WHOIS information request is now triggered only according to the module's option

 

Thanks for the bug report @Juergen :)

 


Works like a charm now! It would be great if the hard-coded URL of the "contact the administrator" page could be selected from the PW pages.

Thanks for the update!!!

Edit: It would be better if you added multi-language support to the custom message textareas :)

 

 

Edited by Juergen

2 hours ago, Juergen said:

It would be better if you add multilanguage support to the custom message textareas :)

I will try to do it; I have never worked with modules and multi-language support ;)

3 hours ago, flydev said:

I will try to do it, I never played with modules and multilanguage

It's not that important, because only bad bots will see it and probably no humans (I hope so). By the way, 2 bots from China were caught in the trap - it works!!! :)


Good and funny !

 

13 hours ago, Juergen said:

because only bad bots will see it and probably no humans

For example, the site I deployed the module on is a custom dashboard with sensitive information, so I had to guard against hand-crafted requests that could retrieve other users' data. When this behavior is detected, the role login-disabled is assigned, the user is logged out, and the user is then redirected into the blackhole to be banned.

 

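public function SecureParks() {
    if ($this->input->post->park) {
        // Sanitize the posted value, then split it; the park ID is the third dash-separated segment
        $ids = explode('-', $this->sanitizer->pageName($this->input->post->park));
        $parkId = isset($ids[2]) ? $ids[2] : null;
        // getParkRoles() and searchForParkId() are this site's own helpers (not shown here)
        $userroles = $this->getParkRoles();
        $userhaveright = $parkId === null ? null : $this->searchForParkId($parkId, $userroles);
        if ($userhaveright === null) {
            // Forged request: disable the login, log the user out and send them into the trap
            $this->user->addRole('login-disabled');
            $this->user->save();
            $this->session->logout();
            $this->session->redirect($this->pages->get('/blackhole/')->url); // :)
        }
    }
}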

 


Just a thought:

I think it would be nice to also store the banned IPs in a log file, so you have them in one place with the other logs.

For example:

$log->save('blackhole', 'Banned IP');

You could also add, for example, a checkbox in the module settings to enable or disable this feature.

What do you think? Might this be useful for others too?


Hi @Juergen

I completely agree.  Even better, there will be a Process module to manage/view the blackhole data.


I was also thinking of adding a feature that monitors 302/404 HTTP status codes and redirects the "guest" into the blackhole.

For example, all of these attempts:

  • /phpMyAdmin/scripts/_setup.php
  • /w00tw00t.at.ISC.SANS.DFind:)
  • /blog/wp-login.php
  • /wp-login.php
  • etc.

will be banned.

I still don't know whether I should code the whole feature myself or hook into Jumplinks by @Mike Rockett.

3 minutes ago, flydev said:
  • /blog/wp-login.php
  • /wp-login.php

I also have a lot of these requests in my 404 log :(.

I think if there is a module that can handle it, use it. Check if the module is installed first. If not, output a message that this feature is only available if Jumplinks is installed.

I don't have Jumplinks installed and I don't know how well it works, but before coding it from scratch I would try to use an existing solution first.


I use Redirect gone ... in .htaccess

Redirect gone /wp-login.php

for all that stuff. (First I log 404s for a period, then I add those candidates to the .htaccess, before ProcessWire's entries!!)

I think it is better not to invoke PW for this stuff (less overhead on the server!) and to use Apache custom error page(s) instead.

[Image: 410_wp-login_php.thumb.jpg]

47ms is fast! :)

 

PS: 410 is better than 404, as I also use this for search engine requests that try to reach URLs that have not existed for 10 years or so. Normally, search engines should flush their cache when they receive a 410.

Edited by horst

In all honesty, I think that Jumplinks is better suited to site migrations. Black holes should either be covered by a specifically-built module, or by htaccess/vhost config...


OK guys, I get what you mean, so what about a module with this flow?

  1. monitor and log HTTP error codes for a period
  2. if the number of requests for an entry exceeds N, then
  3. back up the .htaccess file (versioning it)
  4. add the new entries to the .htaccess file
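
The core of that flow could be sketched roughly as follows. This is not existing module code: buildGoneRules() and prependRules() are made-up names, the hit counts are assumed to come from whatever 404 log is used, and the rules are placed at the top of the file so they come before ProcessWire's own entries, as suggested earlier in the thread.

```php
<?php
/**
 * Turns logged error hits into "Redirect gone" rules (illustrative sketch).
 * $hitsByPath maps a requested path to its number of logged hits.
 */
function buildGoneRules(array $hitsByPath, int $threshold): array
{
    $rules = [];
    foreach ($hitsByPath as $path => $hits) {
        $path = (string) $path;
        // Only accept plain absolute paths; skip anything with whitespace or quotes.
        if ($hits > $threshold && preg_match('#^/[^\s"]*$#D', $path)) {
            $rules[] = 'Redirect gone ' . $path;
        }
    }
    return $rules;
}

/**
 * Returns the .htaccess content with the new rules placed at the top.
 */
function prependRules(string $htaccess, array $rules): string
{
    if ($rules === []) {
        return $htaccess;
    }
    return implode("\n", $rules) . "\n\n" . $htaccess;
}

// Possible usage; the path and backup naming are assumptions:
// $file = '/path/to/site/.htaccess';
// copy($file, $file . '.' . date('Ymd-His') . '.bak');   // step 3: versioned backup
// $rules = buildGoneRules($hitsFromLog, 10);             // steps 1 and 2
// file_put_contents($file, prependRules(file_get_contents($file), $rules)); // step 4
```

A real module would also need to skip paths that already have a rule and handle write permissions on the .htaccess file.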

 

Does it make sense, or should I let users manage their .htaccess file manually with a FAQ or something?
