Blog comment spam taken to the next level?
Only days after I started this barely visited blog, the comment spammers showed up. I activated Akismet and it solved my problem for a while. Over time the spamming got worse, and although almost no spam actually showed up on the blog, my fear of false positives still forced me to go through and remove the spam manually. Not my idea of fun.
I then activated a very simple hurdle: I required JavaScript for posting comments. Well, not really. What Typo actually did was expect the HTTP header X-Requested-With: XMLHttpRequest. That worked flawlessly for about three years. Just six weeks ago I noticed the first spambots including the header needed to get through. Almost all of them posted stupid drug ads that Akismet easily identified as spam. The situation was still under control.
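To make the hurdle concrete, here is a minimal shell sketch of that kind of check. It is not Typo's actual code: `has_xhr_header` is a hypothetical helper, and feeding it raw request headers on stdin is an assumption made purely for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Typo): succeeds when the raw HTTP request
# headers on stdin include "X-Requested-With: XMLHttpRequest".
has_xhr_header() {
    # -i: header names are case-insensitive (this also relaxes the value's case).
    # [[:space:]]* tolerates optional padding and a trailing carriage return.
    grep -qiE '^x-requested-with:[[:space:]]*xmlhttprequest[[:space:]]*$'
}

# Example: accept or reject a request based on its headers.
if printf 'POST /comments HTTP/1.1\r\nX-Requested-With: XMLHttpRequest\r\n\r\n' | has_xhr_header
then
    echo "comment accepted"
else
    echo "comment rejected"
fi
```

A check like this only looks at a header that the client chooses to send, so any bot that adds the header gets through, which fits with what eventually happened.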
Yesterday something completely new happened. I got a comment on an old post about my admiration of Ira Glass and This American Life. The comment seemed believable enough and it passed the spam filter. Still, the link at the end undeniably identified it as spam. Then there was another, and another, and up until now a total of six spam comments in the new and more advanced format. I'm still not sure if these are scripted but it seems an awful lot of work if they are actually manually typed.
And also these on a post about infinite ranges in C#:
I'm kind of bothered by the ones that both refer to the content of the post and mention that, unlike most of my posts, it's not about programming.
Are these human or machine made? A clever combination? Also, if you happen to be a spambot and actually answer this then I guess I will have to congratulate you for passing the Turing test.
PS. They spammed from 110.0.0.0 - 110.255.255.255 so if you happen to have problems with the same spammers and aren't worried about blocking the Philippines then you know what to do. DS.
Ira Glass on Storytelling
When a new episode of This American Life is made available, that is the first thing I listen to. There are a lot of great podcasts out there but if I had to choose only one, I think I would go for This American Life (sorry RadioLab, I love you). Recently I learned that the host and producer of the show, Ira Glass, can be found on YouTube talking about storytelling. He covers how he thinks you should tell a story for a radio/TV show. I recognize how the show executes that narrative, but knowing the recipe doesn't take anything away from the fact that it's excellent and that a lot of hard work and talent is put into it.
Counting the number of Google Readers
I run this blog on a 9-year-old laptop hidden in a cabinet in the living room. It's not a powerful machine but it has been up to the job since I turned it into a web server 7 years ago. This could be one of the last HP Omnibook 4150b still in use. At the very least it has to be in a very exclusive club of laptops that have been switched on for the past 7.5 years. Recently I've seen an increase in traffic, especially from Feedfetcher-Google. It so happens that Feedfetcher also reports the number of subscribers.
[19/Oct/2009:22:01:19 +0200] "GET /xml/rss20/feed.xml HTTP/1.1" 304 0 "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 4 subscribers; feed-id=7686756599804593322)"
The above is only one of five different feed-ids, because I have both Atom and RSS feeds and for a short while this blog was at another address. The fifth feed is actually myself subscribing to the comments.
I'm not using FeedBurner so I can't get my statistics from there but I still wanted to be able to see the number of Google Readers of my blog (as far as I can see I only have one other type of subscriber).
Usually I script anything more advanced than a grep in Ruby but this time I made an exception and stayed in Bash.
tail -1000 /www/logs/access.log |
grep Feedfetcher |
cut -d ";" -f 4 |
sort -u |
while IFS= read -r line
do
  tac /www/logs/access.log | grep -m 1 $line
done |
sed 's/^.*html; \([0-9]*\) subscribers.*/\1/' |
awk '{tot=tot+$1} END {print tot}'
Most certainly this can be optimized in a number of ways. Don't be shy, just tell me!
So what's going on there? Well, first I get the last 1000 rows from my access log, and right now my traffic is so low that this is way more than I really need. Then I get all unique feed-ids from the rows containing Feedfetcher. I pipe those to a loop that gets the very last access for each one of them. Then I parse out the number of subscribers with a regexp in sed and sum them with awk.
It turns out that I have a whopping 15 subscribers and I am one of them.
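As for optimizing: the pipeline above rereads the whole log once per feed-id. One possible alternative, sketched here under a few assumptions, is to let awk walk the log a single time, remember the latest subscriber count seen for each feed-id, and sum those at the end. The function name `count_subscribers` is made up for this example, and accepting a singular "subscriber" is a guess about what Feedfetcher might send.

```shell
#!/usr/bin/env bash
# Sum the most recent subscriber count reported for each Feedfetcher feed-id.
# Usage: count_subscribers [logfile]   (reads stdin when no file is given)
count_subscribers() {
    awk '
        /Feedfetcher/ {
            # Pull out "<n> subscribers" (singular form accepted as a guess).
            if (match($0, /[0-9]+ subscribers?/)) {
                subs = substr($0, RSTART, RLENGTH)
                sub(/ .*/, "", subs)          # keep only the number
                if (match($0, /feed-id=[0-9]+/)) {
                    id = substr($0, RSTART, RLENGTH)
                    last[id] = subs           # later lines overwrite earlier ones
                }
            }
        }
        END {
            total = 0
            for (id in last) total += last[id]
            print total
        }
    ' "$@"
}

# Example, using the log path from the post:
# count_subscribers /www/logs/access.log
```

Unlike the original, this counts every feed-id that appears anywhere in the log, not only those seen in the last 1000 rows, so a feed that Google has stopped fetching would still be included.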
Death from lack of content
I'm sorry to admit that this blog has died from lack of content and I have absolutely no guarantees to give you that it will ever come alive again.
At least I'm still alive and last night I had some fun with C, Code::Blocks and SDL.
Lightning crashes
Three weeks ago lightning struck nearby. Today my ISP finally tried to change the switch in the central office, even though I had reported back to them that my VDSL modem worked just fine at a friend’s house only a couple of days after my connection died.
