Patch & module to stop spambots in Drupal 7

Submitted by Nicola Rainiero on 2013-03-18 (last updated on 2013-11-07)

Recently I have been testing a patch for the CAPTCHA module and a custom module that temporarily bans the spambots that usually attack my Drupal 7 website. The results look promising. Here is my provisional solution.

In the previous article, Strategies adopted to stop spam in D7 (1st), I described the modules that I use to stop spam. Unfortunately they don't solve the problem completely. In fact, when I check the Top pages in the past 1 day report (http://my_site/admin/reports/pages), I usually find these pages:

Top pages in the past 1 day (before the patch)

How many commenters! So I often have to work out which IP is responsible and then ban it manually, because every time my site reloads the page after an incorrect captcha, a little piece of my monthly bandwidth goes away. Above all, I can't stand this nasty behaviour.

Patch for the captcha module

I found this fantastic thread in CAPTCHA » Issues. The author, Nightwalker3000, didn't consider it a decisive solution, because "It requires that the SPAMMER always using the same csid , but it seems like that there tools refresh the page after each try, so there get a new csid and then this patch doesn't work", but my tests show that at the moment it is suitable for me.

I simply pasted the patch (after removing some characters and notes) directly into the captcha.module file at line 629, after this:

          ),
WATCHDOG_NOTICE);
}

Here is the code (the original patch is here):

//
// START new additions
//
// Retrieve the number of attempts and the IP address for this session.
$attempts_and_ip = db_query(
  'SELECT attempts, ip_address FROM {captcha_sessions} WHERE csid = :csid',
  array(':csid' => $csid)
)->fetchAssoc();
// TODO - Make this configurable.
$max_attempts = 5;
// Ban the IP if it has entered the wrong CAPTCHA $max_attempts times.
// fetchAssoc() returns FALSE when the session row is missing, so check first.
if ($attempts_and_ip && $attempts_and_ip['attempts'] >= $max_attempts) {
  db_insert('blocked_ips')
    ->fields(array('ip' => $attempts_and_ip['ip_address'])) // e.g. 188.165.240.76
    ->execute();
  // Log to watchdog. The message is passed untranslated, because watchdog()
  // translates it when the log entry is displayed.
  watchdog('CAPTCHA',
    'IP address %ip_address has been blocked by CAPTCHA for exceeding the maximum of %maxattempts wrong CAPTCHA inputs.',
    array(
      '%ip_address' => $attempts_and_ip['ip_address'],
      '%maxattempts' => $max_attempts,
    ));
  form_set_error('captcha_response', t('Your IP has been blocked. Please don\'t SPAM!'));
}
//
// END new additions
//

As you can see, after 5 wrong attempts the bastard IP is banned and the only text it can see is "Sorry, XXX.XXX.XXX.XXX has been banned.".
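The decision that the patch makes can also be isolated in a small function, which makes the threshold easier to test and, later, to make configurable. This is only a minimal sketch in plain PHP: the name example_captcha_should_ban() is hypothetical and is not part of the CAPTCHA module, and the array stands in for a row of the captcha_sessions table.

```php
<?php
/**
 * Hypothetical helper (not part of the CAPTCHA module).
 *
 * Decides whether a CAPTCHA session has used up its wrong attempts.
 *
 * @param array|false $session
 *   A row with 'attempts' and 'ip_address' keys, or FALSE if no row was found.
 * @param int $max_attempts
 *   The number of wrong attempts that triggers a ban.
 *
 * @return bool
 *   TRUE if the IP of this session should be banned.
 */
function example_captcha_should_ban($session, $max_attempts = 5) {
  // Without a session row or an IP address there is nothing to ban.
  if (!is_array($session) || empty($session['ip_address']) || !isset($session['attempts'])) {
    return FALSE;
  }
  return (int) $session['attempts'] >= $max_attempts;
}

// Example rows, using addresses reserved for documentation.
$fifth_failure = array('attempts' => 5, 'ip_address' => '192.0.2.1');
$second_failure = array('attempts' => 2, 'ip_address' => '192.0.2.1');
$ban_fifth = example_captcha_should_ban($fifth_failure);
$ban_second = example_captcha_should_ban($second_failure);
```

With the default threshold of 5, the first row is banned and the second one is not.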

However, the ban is permanent and I can't know whether it hit a spambot or a human who simply couldn't read my captcha image (which I have now made easier). For this reason I have created a custom module that clears the blocked_ips table on a cron run.

Clear iptable module

I adapted this example, and my final floodclear.module is the following:

<?php
// $Id$

/**
 * @file
 * Periodically clears Drupal's blocked_ips table.
 */

/**
 * Implements hook_cron().
 */
function floodclear_cron() {
  // Default to an hourly interval. Of course, cron has to be running at least
  // hourly for this to work.
  $interval = variable_get('floodclear_interval', 60 * 60);
  // We usually don't want to act every time cron runs (which could be every
  // minute) so keep a time for the next run in a variable.
  if (time() >= variable_get('floodclear_next_execution', 0)) {
    // Clear all blocked IPs. db_delete() handles table prefixes and works on
    // every database backend, unlike a raw query with MySQL backticks.
    db_delete('blocked_ips')->execute();
    watchdog('floodclear', 'floodclear ran');
    if (!empty($GLOBALS['floodclear_show_status_message'])) {
      drupal_set_message(t('floodclear executed at %time', array('%time' => date_iso8601(time()))));
    }
    variable_set('floodclear_next_execution', time() + $interval);
  }
}
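The timing check inside floodclear_cron() can be tried outside Drupal too. The following is a minimal sketch in plain PHP, assuming a hypothetical helper example_floodclear_next_run() that takes the values which the module reads with time() and variable_get(); it is not part of the module.

```php
<?php
/**
 * Hypothetical helper that mirrors the timing check in floodclear_cron().
 *
 * @return int|null
 *   The timestamp of the next execution if the job is due now, or NULL if
 *   this cron run is too early and nothing should happen.
 */
function example_floodclear_next_run($now, $next_execution, $interval = 3600) {
  if ($now < $next_execution) {
    return NULL;
  }
  return $now + $interval;
}

// Simulate two hours of cron runs, one every 15 minutes, with the default
// hourly interval. $next plays the role of 'floodclear_next_execution'.
$start = 1000000;
$next = 0;
$runs = 0;
for ($now = $start; $now < $start + 2 * 3600; $now += 900) {
  $scheduled = example_floodclear_next_run($now, $next);
  if ($scheduled !== NULL) {
    // Here the real module would clear the blocked_ips table.
    $next = $scheduled;
    $runs++;
  }
}
```

Although cron fires eight times in this simulation, the cleanup runs only twice, once per hour.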

UPDATE (2013-11-07): If you want to keep some troublesome IP addresses blocked after the "blocked_ips" table has been emptied (for example "XXX.XXX.XXX.XXX" and "YYY.YYY.YYY.YYY"), you can edit floodclear.module like this:

<?php
// $Id$

/**
 * @file
 * Experimenting with drupal's flooding mechanism.
 */
function floodclear_cron() {
  // Default to an hourly interval. Of course, cron has to be running at least
  // hourly for this to work.
  $interval = variable_get('floodclear_interval', 60*60);
  // We usually don't want to act every time cron runs (which could be every
  // minute) so keep a time for the next run in a variable.

  if (time() >= variable_get('floodclear_next_execution', 0)) {
    // Clear all blocked IPs. db_delete() handles table prefixes and works on
    // every database backend, unlike a raw query with MySQL backticks.
    db_delete('blocked_ips')->execute();
    // Start: re-add the IPs that must stay blocked.
    $values = array(
      array(
        'ip' => 'XXX.XXX.XXX.XXX',
      ),
      array(
        'ip' => 'YYY.YYY.YYY.YYY',
      ),
    );
    $query = db_insert('blocked_ips')->fields(array('ip'));
    foreach ($values as $record) {
      $query->values($record);
    }
    $query->execute();
    // End: re-add the IPs that must stay blocked.
    watchdog('floodclear', 'floodclear ran');
    if (!empty($GLOBALS['floodclear_show_status_message'])) {
      drupal_set_message(t('floodclear executed at %time', array('%time' => date_iso8601(time()))));
    }
    variable_set('floodclear_next_execution', time() + $interval);
  }
}
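A possible alternative to emptying the table and inserting the permanent IPs again is to delete only the rows that are not on the keep list. The sketch below shows the selection in plain PHP, with arrays standing in for the blocked_ips table; the function name is hypothetical. Note that, unlike the module above, this approach does not add a permanent IP that was not already blocked.

```php
<?php
/**
 * Hypothetical helper: picks the blocked IPs that a cleanup should delete.
 *
 * @param array $blocked
 *   The IPs currently in the blocked list.
 * @param array $keep
 *   The IPs that must stay blocked.
 *
 * @return array
 *   The IPs to delete, re-indexed from 0.
 */
function example_floodclear_ips_to_delete(array $blocked, array $keep) {
  return array_values(array_diff($blocked, $keep));
}

// Addresses reserved for documentation stand in for real ones.
$blocked = array('192.0.2.1', '192.0.2.2', '198.51.100.7');
$keep = array('192.0.2.2');
$to_delete = example_floodclear_ips_to_delete($blocked, $keep);

// Assumption, not tested here: inside Drupal 7 the same selection could be
// left to the database, for example with
// db_delete('blocked_ips')->condition('ip', $keep, 'NOT IN')->execute();
```

Here $to_delete contains 192.0.2.1 and 198.51.100.7, while 192.0.2.2 stays blocked.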

The complete module is in the zip file: floodclear.zip

I set the cron interval on the http://my_site/admin/config/system/cron page, and now it runs every 12 hours. There is no doubt that this is a provisional solution, because the war against spam is a losing battle; however, my Top pages in the past 1 day and Top visitors in the past 1 day now look like this:

Top pages in the past 1 day (after the patch)

Top visitors in the past 1 day (after the patch)

The first two visitors are Googlebots.

In the future, if I get too much spam, I will try writing another, more sophisticated module, similar to CAPTCHA After but working in a different way. In short, after X wrong captcha attempts the comment form disappears and the spambot can no longer submit its spam. For now I only have the name: after_captcha (you know me, I have a lot of creativity and imagination!).





Nicola Rainiero

A civil geotechnical engineer with the ambition of making my own work easier with free software, in the spirit of shared, collective knowledge. I also deal with green energy, in particular shallow geothermal energy. I have always been involved in web design and 3D modelling.
