New kind of referer spam?
While looking at my ShortStat stats today, I found a large number of requests originating from a set of adult entertainment sites, at least judging by the names. I doubt that my site is linked that extensively on those sites, especially as the page they are referring to is ‘_’. Yes, an underscore.
This smells like yet another feeble attempt to generate referer spam. How desperate are those people in adult entertainment? Or does this kind of stuff really boost your sales by raising your PageRank in Google?
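A pattern like this could probably be picked out of the stats mechanically. Here is a minimal sketch in Python, assuming each hit is available as a (requested page, referer URL) pair. The function name, the `min_hits` threshold and the input format are made up for illustration; none of this is part of ShortStat.

```python
from collections import Counter
from urllib.parse import urlparse

def suspicious_referers(hits, bogus_pages=("_",), min_hits=3):
    """Return referer hosts that repeatedly 'link' to pages nobody would link to.

    hits: iterable of (requested_page, referer_url) pairs (assumed format).
    bogus_pages: page names a real link would never point at, such as '_'.
    min_hits: hypothetical threshold before a host is reported.
    """
    counts = Counter()
    for page, referer in hits:
        # Treat '_' and '/_' as the same page name.
        if page.strip("/") in bogus_pages:
            host = urlparse(referer).hostname
            if host:
                counts[host] += 1
    return sorted(host for host, n in counts.items() if n >= min_hits)
```

Counting per host rather than per full URL means a spammer who varies the path on the same domain still shows up as one entry.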
A friend of mine, Professor Hannu Kari, has predicted that the Internet, as we know it, will come to an end in two years. Every time I see these exploits or get viruses, hoaxes and spam in my inbox, I tend to agree with him more and more. Sad.
1. Andrew Donaldson — Thursday, Nov 4 2004
It’s annoying to say the least, as you’ve always got to be ‘extra careful’ how thoroughly you check out your webstats.
Never mind, it’ll be something worse next week ;)
2. Stuart — Friday, Nov 5 2004
I think a few people have been getting these just lately, Janne. I had them myself about a month ago and have now banned the IPs from the site.
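For anyone wondering what such a ban amounts to, here is a rough Python sketch of the check, assuming the ban list is a plain list of single addresses and CIDR ranges. The function name and the list format are assumptions for illustration, not how any particular blog engine or web server does it.

```python
import ipaddress

def is_banned(client_ip, banned):
    """Return True if client_ip matches any entry in the ban list.

    banned: list of strings, each a single IP or a CIDR range (assumed format).
    """
    addr = ipaddress.ip_address(client_ip)
    # A bare address is parsed as a one-address network, so both forms work.
    return any(addr in ipaddress.ip_network(entry) for entry in banned)
```

Banning by address only helps as long as the spammer keeps coming from the same hosts, so a list like this tends to need regular updating.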
3. Janne — Friday, Nov 5 2004
I found it deeply amusing to read the names of the websites, as someone has been quite imaginative while figuring out the names…
As I don’t show any referral information on any publicly accessible page (a page that search engines would pick up), I don’t really care – unless they start to hog bandwidth. Current attempts just inflate the page count, a number that several affiliate agreements ask for…
Speaking of bandwidth hogging, I do hope that this site is not hit by offline browsers. Have you had any experience with them?
4. Stuart — Friday, Nov 5 2004
You would have to tell me what an “offline browser” is first. I get quite a few “bots” registering that I don’t recognise as “crawler bots”. They wouldn’t have anything to do with those peculiar “language codes” showing up in ShortStat that I asked you about a couple of weeks ago?
5. Janne — Friday, Nov 5 2004
An offline browser is a program that sucks the whole content of your website onto a local disk, so that the user can read the pages after disconnecting from the network.
It’s a kind of spider/robot, but offline browsers behave worse than most search engine crawlers, and they are easily fooled by pages that can be reached through different URIs.
6. Stuart — Friday, Nov 5 2004
Well, as far as I can see I haven’t had one of those – yet. (Wish more WP users would enable Textile!) Would they be able to do that with my site, as I’m on TXP and everything is in the database?
7. Janne — Saturday, Nov 6 2004
Stuart, in short – yes, they can do it. As your site has several URLs to fetch the same page, for example, by the article ID and the date, you’d suffer from offline browsing. The database doesn’t help here, because it’s just in the background, and the offline browsers suck pages through your web interface without understanding anything about the semantic structures.
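To make the point about several URLs concrete, here is a small Python sketch. The URL patterns and the `resolve` mapping are invented for the example and are not Textpattern’s actual URL scheme; the only assumption about the offline browser is that it tells pages apart by their URL string.

```python
def fetch_cost(urls, resolve):
    """Compare what a URL-only crawler downloads with the articles behind the URLs.

    urls: iterable of URL strings the crawler discovers.
    resolve: function mapping a URL to the article it serves (hypothetical).
    Returns (number of downloads, number of distinct articles).
    """
    fetched = set(urls)  # the crawler de-duplicates by URL string only
    articles = {resolve(u) for u in fetched}
    return len(fetched), len(articles)

# Hypothetical routes: article 42 is reachable by ID and by date.
routes = {
    "/article/42": 42,
    "/2004/11/06/referer-spam": 42,
    "/article/43": 43,
}
downloads, distinct = fetch_cost(routes, routes.get)
```

Here the crawler makes three downloads for two articles, and the gap grows with every extra way of reaching the same page.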
8. Stuart — Saturday, Nov 6 2004
I thought you might say that. They are more clever than mere page readers then. In that case I’ve been lucky so far.
9. Stuart — Saturday, Nov 6 2004
Just caught another of those porn site referers!!
10. Janne — Monday, Nov 8 2004
I got a load of spam again this weekend. Cigarettes and online poker. Two of them slipped through my spam words. I had to update the spam word listing three times…
This is getting on my nerves.
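For reference, a spam word list boils down to something like the following Python sketch. It is an illustration only, with a made-up function name, and not the filter this site actually runs.

```python
import re

def is_spam(comment, spam_words):
    """Return True if any listed spam word appears in the comment as a whole word."""
    text = comment.lower()
    return any(
        re.search(r"\b" + re.escape(word.lower()) + r"\b", text)
        for word in spam_words
    )
```

A filter of this kind only catches exact words, so a variant spelling such as “p0ker” slips through until it is added to the list, which is why the list has to be updated again and again.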
11. Undercrank — Tuesday, Jan 25 2005
Despamming Shortstat
I’ve been using Shaun Inman’s Shortstat package for a short while now as my main source of web statistics. However, it’s fairly susceptible to the, er, ‘innovation’ known as referer spam – so here’s some code that uses Jay Allen’s MT-Blacklis…