Wednesday, February 9, 2011
magpie-crawler/1.1 (Brandwatch)
I was idly watching some Apache access logs scroll by (well actually I was busy doing something, but like to keep an eye on things to spot any interesting or worrying trends early) and noticed a bunch of entries like this:
94.228.34.238 - - [06/Feb/2011:06:21:23 +0100] "GET /blog/?o=10 HTTP/1.1" 301 290 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
94.228.34.238 - - [006/Feb/2011:06:21:33 +0100] "GET /blog/?o=40 HTTP/1.1" 301 290 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
Never noticed that UA before, and it never seemed to follow up on the (perfectly valid) 301 redirects. A scan through 3 months of logs shows it's been doing that all the time - what a dumb bot.
Checking the URL provided, this appears to be the home page of a UK-based company providing "Social Media Monitoring Tools" - and who don't have the courtesy to provide any more information about their bot / crawler. Which is evidently not popular in some quarters.
So, as "Brandwatch" provides neither myself not the sites I run with any conceivable benefit, it's on the blocklist they go.
(I wonder if they monitor their own brand?)
Posted at 5:31 AM |Comments (0)