Blog comment spam taken to the next level?

Posted by Jonas Elfström Wed, 19 May 2010 21:01:00 GMT

Only days after starting this barely visited blog the comment spammers showed up. I activated Akismet and it solved my problem for a while. Over time the spamming got worse and although almost no spam actually showed up on the blog, my fear of false positives still forced me to remove the spam manually. Not my idea of fun.

I then added a very simple hurdle: I required JavaScript to post comments. Well, not really. What Typo actually did was expect the HTTP header X-Requested-With: XMLHttpRequest. That worked flawlessly for about three years. Just six weeks ago I noticed the first spambots including the required header to get through. Almost all of them posted stupid drug ads that Akismet easily identified as spam. The situation was still under control.
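As a rough illustration of the kind of check described above, here is a minimal sketch. It is not Typo's actual code (Typo is a Ruby on Rails application); the function name isXhrRequest and the plain headers object are assumptions made for this example.

```javascript
// Hypothetical helper: returns true when the request headers claim to come
// from XMLHttpRequest. `headers` is assumed to be a plain object that maps
// header names to values.
function isXhrRequest(headers) {
  for (var name in headers) {
    // HTTP header names are case-insensitive, so compare in lower case.
    if (headers.hasOwnProperty(name) &&
        name.toLowerCase() === "x-requested-with") {
      return String(headers[name]).toLowerCase() === "xmlhttprequest";
    }
  }
  return false;
}
```

Any client can set this header, so a check like this only stops bots that don't bother to send it, which fits what happened here.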

Yesterday something completely new happened. I got a comment on an old post about my admiration of Ira Glass and This American Life. The comment seemed believable enough and it passed the spam filter. Still, the link at the end undeniably identified it as spam. Then there was another, and another, and up until now a total of six spam comments in the new and more advanced format. I'm still not sure if these are scripted but it seems an awful lot of work if they are actually manually typed.

And also these on a post about infinite ranges in C#:

I'm kind of bothered by the ones that both refer to the content of the post and at the same time mention that it's not about programming, like most of my posts are.

Are these human or machine made? A clever combination? Also, if you happen to be a spambot and actually answer this then I guess I will have to congratulate you for passing the Turing test.

PS. They spammed from 110.0.0.0 - 110.255.255.255 so if you happen to have problems with the same spammers and aren't worried about blocking the Philippines then you know what to do. DS.

JavaScript hash table keys

Posted by Jonas Elfström Fri, 05 Mar 2010 16:42:00 GMT

In JavaScript you can add properties to objects dynamically. You can access those properties both by object.foo and object['foo']. The latter is commonly used to treat JavaScript objects as hash tables (associative arrays).
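A small sketch of both access styles, with an object used as a hash table. The variable names and values are made up for the example.

```javascript
var ages = {};         // an empty object used as a hash table
ages.alice = 31;       // dot notation
ages["bob"] = 27;      // bracket notation sets the same kind of property
var key = "carol";
ages[key] = 45;        // bracket notation allows computed keys
ages[42] = 0;          // non-string keys are converted to strings
```

Note that property names are stored as strings, so ages[42] and ages["42"] refer to the same property.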

While implementing a simplistic unique random number generator I happened to use keys(obj). Unfortunately keys(obj) is part of ECMAScript 5, where it is specified as Object.keys. See chapter 15.2.3.14 in ECMA-262. The web browsers of today mostly implement ECMAScript 3.

Here's an implementation of keys(obj) for ECMAScript 3 browsers (tested in Google Chrome, IE8 and Firefox 3.5). If the browser already has a keys function then nothing will be done.

if (typeof keys == "undefined") 
{ 
  var keys = function(ob) 
  {
    // Declare props and k with var so they don't leak as implicit globals.
    var props = [];
    for (var k in ob) if (ob.hasOwnProperty(k)) props.push(k);
    return props;
  };
}


The simplistic unique random number generator looks like this

function uniqueRndNumbers(min, max, quantity) {
  var ht={}, i=quantity;
  while ( i>0 || keys(ht).length<quantity) 
    ht[Math.floor(Math.random()*(max-min+1))+min]=i--;
  return keys(ht);
}


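This function has not undergone any serious testing. Note that it returns the numbers as strings, since they are object keys, and that it never terminates if quantity is larger than max-min+1. Also, if quantity is more than a small fraction of the range, collisions become frequent and another algorithm like the Fisher–Yates shuffle might be a better choice.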
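For comparison, here is a minimal sketch of the shuffle-based alternative. It uses a partial Fisher–Yates shuffle, and the function name uniqueRndNumbersShuffle is made up for this example. It assumes quantity is at most max-min+1 and, like the function above, has not been seriously tested.

```javascript
// Sketch: pick `quantity` unique integers from min..max (inclusive) with a
// partial Fisher-Yates shuffle. Assumes quantity <= max - min + 1.
function uniqueRndNumbersShuffle(min, max, quantity) {
  var pool = [], i, j, tmp;
  // Build the full list of candidates.
  for (i = min; i <= max; i++) pool.push(i);
  // Only the first `quantity` positions need to be shuffled.
  for (i = 0; i < quantity; i++) {
    // Choose a random index from the not yet fixed part, i..pool.length-1.
    j = i + Math.floor(Math.random() * (pool.length - i));
    tmp = pool[i]; pool[i] = pool[j]; pool[j] = tmp;
  }
  return pool.slice(0, quantity);
}
```

This version returns numbers rather than strings and always finishes after quantity swaps, at the cost of building an array that covers the whole range.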

Ira Glass on Storytelling

Posted by Jonas Elfström Wed, 03 Feb 2010 22:48:00 GMT

When a new episode of This American Life is made available, that is the first thing I listen to. There are a lot of great podcasts out there but if I had to choose only one, I think I would go for This American Life (sorry RadioLab, I love you). Recently I learned that the host and producer of the show, Ira Glass, can be found on YouTube talking about storytelling. He covers how he thinks you should tell a story for a radio/TV show. I recognize how the show executes that narrative, but that doesn't take anything away from the fact that it's excellent and that a lot of hard work and talent is put into it.

Ira Glass on Storytelling #1

Ira Glass on Storytelling #2

Ira Glass on Storytelling #3

Ira Glass on Storytelling #4

C# implicit string conversion

Posted by Jonas Elfström Wed, 03 Feb 2010 16:32:00 GMT

I know how it works and I think I can see why but I'm still not very fond of how eager C# is to perform implicit string conversion.

Contrived example:

string s = -42 + '+' + "+" + -0.1 / -0.1 + "=" + (7 ^ 5) + 
      " is " + true + " and not " + AddressFamily.Unknown;

s will be set to "1+1=2 is True and not Unknown"

The answer is in white text above, select the text to see it.

A more realistic problem is something like this:


string str = 1 + 2 + "!=" + 1 + 2;

str will be set to "3!=12".

Edit 2010-02-08
This wouldn't be much of a problem if all objects in .NET always returned a decent string representation of their current state/value with ToString() but that's not the case. Instead "The default implementation returns the fully qualified name of the type of the Object.".
I don't like the inconsistency. It's way too late now, but I think it would have been much better if only objects that really produce human-readable output of their data implemented ToString(). If you want the name of the type of the Object there should be another way.

String concatenation in Ruby

Posted by Jonas Elfström Mon, 01 Feb 2010 23:04:00 GMT

There's no StringBuilder class in Ruby because the String class has the << operator for appending. The problem is that not every Ruby programmer seems to be aware of it. Recently I've seen += being used to append to strings where << would have been a much better choice.

The problem with using += is that it creates a new String instance every time, while << appends to the existing one. If you use += in a loop you can get really horrible performance.

If you are dealing with an array you don't even have to use << because Array#join is even faster and shows intent in a nice way.

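require 'benchmark'

# 262144 random letters 'A'..'Y' (65 + rand(25)), cut into 8-character chunks
array_of_rnd_strings=(0...262144).map{65.+(rand(25)).chr}
                                 .join.scan(/.{1,8}/m)

Benchmark.bm do |benchmark|
  # Array#join
  benchmark.report do
    str=array_of_rnd_strings.join
  end
  # String#<< appends to the same String instance
  benchmark.report do
    str2=""
    array_of_rnd_strings.each do |s|
      str2<<s
    end
  end
  # += creates a new String instance on every iteration
  benchmark.report do
    str3=""
    array_of_rnd_strings.each do |s|
      str3+=s
    end
  end
end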
array_of_rnd_strings is an array of 32768 random strings, each 8 characters long. The three result rows below are, in order, Array#join, << and +=.

      user     system      total        real
  0.030000   0.000000   0.030000 (  0.027184)
  0.160000   0.010000   0.170000 (  0.190277)
106.020000   0.300000 106.320000 (113.457793)


The performance of += was even worse than I imagined!
