The Germans are coming...

Jul 21 2004

My hits just went through the roof (relatively speaking), and it’s all down to a couple of hours of feverish activity by discovery.informatik.RWTH-Aachen.DE on IP address 137.226.59.32.

Whuh?

Looking at the speed it was moving through my links, it could only be a bot, so I did a little digging. www.rwth-aachen.de seems to be a Dutch university as far as I can work out, so I guess that www.informatik.rwth-aachen.de is the Technical Department and this discovery thing is some Dutch student’s degree project. (They do have a little British flag to click on, but all it seems to do is translate the navigation into English, not the page content.) The weird thing is that it hit the site directly, with no referer.

A reverse DNS lookup revealed the origin of the IP address to be the Aachen University of Technology, so German rather than Dutch (I wonder why they have a Dutch flag on the site).
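
For anyone who wants to repeat that kind of lookup from a script, here is a minimal sketch in Python. It is not the tool I used, just an illustration. The function names `ptr_name` and `reverse_lookup` are made up for this example, and the `resolver` parameter is only there so the function can be tried without a network connection.

```python
import ipaddress
import socket
from typing import Callable, Optional


def ptr_name(ip: str) -> str:
    """Return the DNS name that a reverse (PTR) lookup queries for this IP."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip: str, resolver: Callable = socket.gethostbyaddr) -> Optional[str]:
    """Return the hostname registered for ip, or None if the lookup fails.

    resolver defaults to the standard library's socket.gethostbyaddr, which
    returns a (hostname, aliases, addresses) tuple; it is injectable so the
    function can be exercised offline with a stand-in.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except (socket.herror, socket.gaierror):
        # No PTR record, or the name service could not answer.
        return None
    return hostname
```

Calling `ptr_name("137.226.59.32")` gives the name the DNS query is made against, and `reverse_lookup("137.226.59.32")` performs the actual lookup. What it returns today depends on whatever the university’s DNS currently says, so I won’t guess at the output.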

I don’t know whether to bother blocking it in .htaccess – I suppose it’s not really doing any harm, just making my site stats seem a little one-sided (56% of visitors are search crawlers as of right now).

Of course, it could be simply using the university’s name as a cover and be harvesting email addresses (of which there are exactly zero on this site), and I suppose it is sucking bandwidth.

To ban the robots by IP address, I’ll simply add the following lines to my .htaccess file:

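Options +FollowSymLinks
RewriteEngine On
RewriteBase /
# Match any client address starting with 137.226.59. (dots escaped so they match literally)
RewriteCond %{REMOTE_ADDR} ^137\.226\.59\.
# Refuse the request with 403 Forbidden
RewriteRule .* - [F,L]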

Leaving the fourth element of the IP address open blocks every address in that range (137.226.59.0 to 137.226.59.255), in case the bot can use any other related IPs.
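
A prefix match like this is easy to sanity-check before it goes live. The sketch below is a rough Python illustration, not part of the site: it assumes a pattern with the dots escaped (`^137\.226\.59\.`), and it assumes Python’s `re` module behaves like Apache’s regex engine for something this simple. The names `is_blocked` and `in_range` are invented for the example.

```python
import ipaddress
import re

# Assumed pattern: the 137.226.59. prefix with every dot escaped, so each
# dot matches a literal "." rather than any character.
BLOCKED_PREFIX = re.compile(r"^137\.226\.59\.")

# The same range written as a network: 137.226.59.0 to 137.226.59.255.
BLOCKED_NETWORK = ipaddress.ip_network("137.226.59.0/24")


def is_blocked(remote_addr: str) -> bool:
    """True if the address string starts with the blocked prefix."""
    return BLOCKED_PREFIX.match(remote_addr) is not None


def in_range(remote_addr: str) -> bool:
    """True if the address falls inside the blocked /24 network."""
    return ipaddress.ip_address(remote_addr) in BLOCKED_NETWORK


# The two checks should agree on any valid IPv4 address.
for addr in ("137.226.59.32", "137.226.59.255", "137.226.60.1", "10.137.226.59"):
    assert is_blocked(addr) == in_range(addr)
```

The first two addresses in the loop are caught and the last two are let through, which is what the rule is meant to do.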

Filed under: The Site.



Comments

Mike P.
2858 days ago
Hey, try hitting our site, say 6-7 pages quickly.

If yer interested I could pass you a little throttle (e-mail me) that has helped us reduce damage from bandwidth suckers ;-]
#1
Matthew Pennell
2858 days ago
Easy does it.
You seem to be hitting too many pages too fast. Slow down and have a read through one of them, and we’ll let you back in.


Excellent!
#2