Finding primes in parallel 18

Posted by Jonas Elfström Thu, 14 Jan 2010 21:55:00 GMT

Justin Etheredge has been blogging about his challenge to find prime numbers with LINQ. He later used AsParallel() (coming in .NET 4) to speed things up and then followed that up with a post about using The Sieve Of Eratosthenes.

As you can see in the comments of those posts I tried to speed the Sieve of Eratosthenes up by using Parallel.For in the inner loop. I also tried AsParallel() in the LINQ expression but it made no difference in either case. At most it got 5% faster. I'm not sure but it could be that because SoE is very memory intense we could have a scaling issue and maybe also memory bandwidth exhaustion. This is mere speculation.

I then searched for other algorithms and found The Sieve of Atkin. It uses less memory than SoE so I thought I'd give it a try.

I set the limit to 20,000,000 and then benchmarked it. It timed in on 2.48s so actually worse than the 2.2s that SoE took. Not good! Then I added Parallel.For in the loop that did most of the work and lo and behold, it scaled! I have two cores in my machine (T7200@2.0GHz) and the average runtime went down to 1.26s. That's almost linear and surprisingly good! If you happen have a quad core (or more) and feel like trying it out then please contact me. It would be interesting to see if it scales further.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
static List<int> FindPrimesBySieveOfAtkins(int max)
{
   //  var isPrime = new BitArray((int)max+1, false); 
   //  Can't use BitArray because of threading issues.

    var isPrime = new bool[max + 1];
    var sqrt = (int)Math.Sqrt(max);

    Parallel.For(1, sqrt, x =>
    {
        var xx = x * x;
        for (int y = 1; y <= sqrt; y++)
        {
            var yy = y * y;
            var n = 4 * xx + yy;
            if (n <= max && (n % 12 == 1 || n % 12 == 5))
                isPrime[n] ^= true;

            n = 3 * xx + yy;
            if (n <= max && n % 12 == 7)
                isPrime[n] ^= true;

            n = 3 * xx - yy;
            if (x > y && n <= max && n % 12 == 11)
                isPrime[n] ^= true;
        }
    });

    var primes = new List<int>() { 2, 3 };
    for (int n = 5; n <= sqrt; n++)
    {
        if (isPrime[n])
        {
            primes.Add(n);
            int nn = n * n;
            for (int k = nn; k <= max; k += nn)
                isPrime[k] = false;
        }
    }

    for (int n = sqrt + 1; n <= max; n++)
        if (isPrime[n])
            primes.Add(n);

    return primes;
}

This code needs C# 4.0 to compile.

Edit 2010-12-14

Dommer found out that the BitArray implementation had some serious threading issues. I had my worries about the non thread safe characteristics of BitArray but I thought that the isPrime[n] ^= true; was an atomic operation and that it didn't matter in what order bit bits was flipped would make it possible to use anyway. Not so. Changed it to a boolean array and that seems to rock the boat but of course at a much higher memory cost.

Edit 2010-01-20

Indications are that this does in fact not scale very good on a quad core. It's even worse, it seems it scales good on my old T7200 but not on a dual core E6320. I don't know why but of course the shared state of the isPrime BitArray is a huge problem and maybe it could be that differences in CPU architecture (FSB speed, caches and so on) in the E6320 is an explanation. Average execution time on the E6320 was 1290ms in a single thread and 1064ms in two.

If you want to try this in an older version of C# than 4.0 then check out this post.

A reader asked how I timed the executions. Here's how.

1
2
3
4
5
6
7
8
9
10
11
12
13
var steps = new List<long>();
var watch = new Stopwatch();

for (int i = 0; i < 10; i++) 
{
    watch.Reset();
    watch.Start();
    var primes = FindPrimesBySieveOfAtkins(20000000);
    watch.Stop();
    Console.WriteLine(watch.ElapsedMilliseconds.ToString());
    steps.Add(watch.ElapsedMilliseconds);
}
Console.WriteLine("Average: " + steps.Average().ToString());



Edit 2010-10-24

Tom's code from the comment below

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
using System; 
using System.Collections.Generic; 
using System.Linq; 
using System.Numerics; 
using System.Text; 
using System.Threading.Tasks;

namespace Calculate_Primes 
{ 
    class Program 
    { 
        private const int _NUMBER_OF_DIGITS = 100;

        static void Main(string[] args)
        {
            BigInteger floor = BigInteger.Parse("1" + string.Empty.PadLeft(_NUMBER_OF_DIGITS - 1, '0'));
            BigInteger ceiling = BigInteger.Parse(string.Empty.PadLeft(_NUMBER_OF_DIGITS, '9'));

            Console.WindowWidth = 150;

            //var primes = Enumerable.Range(floor, ceiling).Where(n => Enumerable.Range(1, n).Where(m => (n / m) * m == n).Count() == 2);

            Console.Clear();
            _calculatePrimes(floor, ceiling, "C:\\100 digit primes.txt");

            Console.Clear();
            _calculatePrimes(floor, ceiling, "C:\\300 digit primes.txt");
        }

        static IEnumerable<BigInteger> Range(BigInteger fromInclusive, BigInteger toInclusive)
        {
            for (BigInteger i = fromInclusive; i <= toInclusive; i++)
                yield return i;
        }

        static void ParallelFor(BigInteger fromInclusive, BigInteger toInclusive, Action<BigInteger> body)
        {
            Parallel.ForEach(Range(fromInclusive, toInclusive), body);
        } 

        static void _calculatePrimes(BigInteger floor, BigInteger ceiling, string resultsFileFilepath)
        {
            using (System.IO.FileStream fs = new System.IO.FileStream(resultsFileFilepath, System.IO.FileMode.Create)) { }

            using (System.IO.StreamWriter sw = new System.IO.StreamWriter(resultsFileFilepath))
            {
                ParallelFor(floor, ceiling, i =>
                    {
                        if (_isPrime(i))
                        {
                            lock (sw)
                            {
                                sw.Write(i.ToString() + System.Environment.NewLine);
                                sw.Flush();
                            }
                        }
                    });
            }
        }

        static bool _isPrime(BigInteger number)
        {
            bool returnValue = true;

            Console.WriteLine("Checking {0} for primality.", number.ToString());

            if ((number < 2) || (number > 2 && number.IsEven) || (number > 2 && number.IsPowerOfTwo))
                returnValue = false;
            else
                for (BigInteger i = 2; i * i <= number; i++)
                {
                    if (number % i == 0)
                        returnValue = false;
                }

            if(returnValue)
                Console.WriteLine("         {0} IS prime.", number.ToString());
            else
                Console.WriteLine("         {0} IS NOT prime.", number.ToString());

            return returnValue;
        }
    }
}

Comparing instance variables in Ruby

Posted by Jonas Elfström Mon, 02 Nov 2009 21:50:00 GMT

Say you have two objects of the same class and you want to know what differs between them. Well actually you just want to know the instance variables in object b that differs from the ones in object a.

To begin with, we need a class. I like cheese.

1
2
3
4
5
6
class Cheese
  attr_accessor :name, :weight, :expire_date
  def initialize(name, weight, expire_date)
    @name, @weight, @expire_date = name, weight, expire_date
  end
end


Then we need some cheese objects.

1
2
stilton=Cheese.new('Stilton', 250, Date.parse("2009-11-02"))
gorgonzola=Cheese.new('Gorgonzola', 250, Date.parse("2009-11-17"))


With only name, weight and an expiration date it would be easy to compare those but imagine that these two objects has 42 properties. It does not stop there, you are being asked to compare 24 different classes in this way. Are you cringing yet?

Object#instance_variables to the rescue! Well, that and a small hack by me. Below I add a new method called instance_variables_compare to Object. The long method name is because I wanted to follow the naming already in place. Usually I prefer to do these kind of things as a module and then include them where appropriate but in this case I find that a monkey patch will do.

1
2
3
4
5
6
7
class Object
  def instance_variables_compare(o)
    Hash[*self.instance_variables.map {|v|
      self.instance_variable_get(v)!=o.instance_variable_get(v) ? 
      [v,o.instance_variable_get(v)] : []}.flatten]
  end
end


It returns the instance variables that differs as a hash because it's handy and because I like it that way.

1
2
3
4
5
6
7
8
9
10
>> stilton.instance_variables_compare(gorgonzola)
=> {"@name"=>"Gorgonzola", "@expire_date"=>#<Date: 4910305/2,0,2299161>}
>> gorgonzola.instance_variables_compare(stilton)
=> {"@name"=>"Stilton", "@expire_date"=>#<Date: 4910275/2,0,2299161>}
>> stilton.expire_date=gorgonzola.expire_date
=> #<Date: 4910305/2,0,2299161>
>> stilton.instance_variables_compare(gorgonzola)
=> {"@name"=>"Gorgonzola"}
>> stilton.instance_variables_compare(stilton)
=> {}


If you ever think of using this code you should be aware of two things.

  1. This code is very untested and comes with no guarantees.
  2. Since instance variables spring into life the first time they are assigned to you either have to work with objects that always initialize everything or you have to change instance_variables_compare to handle this.

Infinite ranges in C# 2

Posted by Jonas Elfström Tue, 20 Oct 2009 18:41:00 GMT

I recently learned that C# is compliant with the IEEE 754-1985 for floating point arithmetics. That wasn't a big surprise but that division by zero is defined as Infinity in it was! It actually kind of bothers me that I didn't know this.

In mathematics division by zero is undefined for real numbers but I guess Infinity is a more pragmatic result. Or as a friend put it "IEEE stands for Institute of Electrical and Electronics Engineers not Institute of Mathematics"

1
2
3
4
double n = 1.0;
n = n / 0;
if (n > 636413622384679305)
  System.Console.WriteLine("Yes it certainly is!");


This C# code does not throw an exception, it simply leaves n defined as Infinity and writes a line to the console.

Ruby is also IEEE 754-1985 compliant. It even lets you define infinite ranges.

1
2
3
4
5
6
Infinity=1.0/0
=>Infinity
(1..Infinity).include?(162259276829213363391578010288127)
=> true
(7..Infinity).step(7).take(3).inject(&:+) # 7+14+21
=> 42


I can't say I see very much use of this but it brings a kind of completeness to the handling of infinities. Unfortunately it seems we don't get that in C# out of the box because Enumerable.Range takes <int>,<int> as parameters and there's no Infinity definition for int. That's unless someone wrote a generic Range class. Turns out none other than Jon Skeet did in his MiscUtil. Download MiscUtil and then by using MiscUtil.Collections; you can:

1
2
3
4
5
6
double n = 1.0;
var infinity = n / 0;
var r = new Range<double>(0, infinity);
if (r.Contains(4711))
  System.Console.WriteLine("Yes it certainly does!");
var sum = r.Step(7.0).Take(3).Sum();


And guess what, it works like a charm! 4711 is part of positive infinity and sum is 42.0 and all is good.

Edit

There's also a couple of predefined constants. Thanks to Eric for pointing that out.

1
2
var r = new Range<double>(7,  System.Double.PositiveInfinity);
var sum = r.Step(7.0).Take(3).Sum();


Counting the number of Google Readers 2

Posted by Jonas Elfström Mon, 19 Oct 2009 19:55:00 GMT

I run this blog on a 9 year old laptop hidden in a cabinet in the living room. It's not a powerful machine but it has been up to the job since I turned it into a web server 7 years ago. This could maybe be one of the last HP Omnibook 4150b still in use, at least it has to be in a very exclusive club of laptops being switched on for the past 7.5 years. Recently I've seen an increase in traffic and especially from Feedfetcher-Google. It so happens that Feedfetcher also shows the number of subscribers.

[19/Oct/2009:22:01:19 +0200] "GET /xml/rss20/feed.xml HTTP/1.1" 304 0 "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 4 subscribers; feed-id=7686756599804593322)"

The above is only one out of five different feed-ids because I have both atom and rss and for a short while this blog was at another address. The fifth feed is actually myself subscribing to the comments.

I'm not using FeedBurner so I can't get my statistics from there but I still wanted to be able to see the number of Google Readers of my blog (as far as I can see I only have one other type of subscriber).

Usually I script anything more advanced than a grep in Ruby but this time I made an exception and stayed in Bash.

1
2
3
4
5
6
7
8
9
tail -1000 /www/logs/access.log |
grep Feedfetcher |
cut -d ";" -f 4 | sort -u |
while IFS= read -r line
do
   tac /www/logs/access.log | grep -m 1 $line
done |
sed 's/^.*html; \([0-9]*\) subscribers.*/\1/' |
awk '{tot=tot+$1} END {print tot}'

Most certainly this can be optimized in a number of ways. Don't be shy, just tell me!

So what's going on there? Well, first I get the last 1000 rows from my access log and right now my traffic is so low that that is way more than I really would have to. Then I get all unique feeed-ids from the rows containing Feedfetcher. I pipe those to a loop that gets the very last access for each one of them. Then I parse out the number of subscribers with a regexp in sed and count them with awk .

It turns out that I have a whopping number of 14 15 subscribers and I am one of them.

The Thrush combinator in C# 1

Posted by Jonas Elfström Tue, 06 Oct 2009 18:35:00 GMT

Last year I read Reg "Raganwald'" Braithwaite's excellent post The Thrush and he explains it as

The thrush is written Txy = yx. It reverses evaluation.

Back then I didn't even consider trying to implement it in C#. That was before I digged deeper into lambda expressions and extension methods in C# 3.0 and way before last night when I read Debasish Ghosh's post on how to implement the Thrush in Scala. After reading that my first thought was if it was possible to do the same in C#. Here's my attempt.

At first I struggled with the static typing and headed for an easy way out using Object in the extension method of Object:

1
2
3
public static object Into(this Object obj, 
                        Func<object, object> f)
{  return f.Invoke(obj); }


My goal was to translate the Ruby example


(1..100).select(&:odd?).inject(&:+).into { |x| x * x }


in Raganwald's post to C#.

Which reads "Take the numbers from 1 to 100, keep the odd ones, take the sum of those, and then answer the square of that number."

But with the Object based extension method I had to do some ugly casts.


var r = Enumerable.Range(1, 100).Where(x => Odd(x)).Sum().Into(x => (int)x * (int)x);


With som added typing I could do:


var result = Enumerable.Range(1, 100).Where(x => Odd(x)).Sum().Into(x => x * x);


That merely moved the cast to the extension method and also made it work for integers only.

1
2
public static int Into(this Object obj, Func<int, int> f)
{ return f.Invoke((int)obj); }


Then I remembered generics and method type inference which finally led to a decent Thrush combinator in C#.

1
2
public static T Into<T>(this T obj, Func<T, T> f)
{ return f(obj); }


The casts are gone and it's also, as far as I can see, as flexible as the one in Ruby.

Contrived example follows:

1
2
var test = "ball";
var ball = test.Into(s => "Are we having a " + s + " yet?");


The odd part

The Odd(x) method call in the calculation above is a plain static method.

1
2
private static bool Odd(int n)
{ return (n % 2 != 0); }


If you want an even more terse syntax you could try an ext. method on IEnumerable like this:

1
2
public static IEnumerable<int> Odd(this IEnumerable<int> en)
{ return en.Where(n => n % 2 != 0); }


Gives:


var result = Enumerable.Range(1, 100).Odd().Sum().Into(x => x * x);


In C# I don't think it's possible to pull off the Symbol#to_proc stuff that Ruby does. That's the &: in the select(&:odd?) and the inject(&:+) in the Ruby example. Raganwald has a great post on that too.

Edit

Check out Jon Skeet's nice answer on StackOverflow to my question on how to make this even more Ruby-like. I have to try out that Operator class later though.

Edit 2009-10-07

One thing I found a bit surprising is that by implementing the Into ext. method in this way it not only works for all objects based on System.Object but it also works for value types.

1
2
3
4
int n=4711;
int oddOrZero = n.Into(x => x % 2 !=0 ? x : 0); // 4711
n = 4712;
oddOrZero = n.Into(x => x % 2 != 0 ? x : 0); // 0


Edit 2009-10-12

My confusion did stem from my lack of understanding of extension methods. Ex. methods are in fact not extending System.Object or any other type, they are "nothing more than a pleasant syntax for calling a static method" in case no instance method with the same name can be found.

Older posts: 1 2 3 4 5 6 ... 12