Inspired by randomness 6

Posted by Jonas Elfström Thu, 27 Oct 2011 21:19:00 GMT

Inspired by Randomly not so random I decided to play around with the subject.

Both Java and C# uses a linear congruential generator as their pseudorandom number generators. I found this at MSDN

The current implementation of the Random class is based on a modified version of Donald E. Knuth's subtractive random number generator algorithm."

but I couldn't find any information on what values Microsoft uses for the m, a, and c parameters in their LGC implementation.

The German Federal Office for Information Security (BSI) has established four criteria for quality of deterministic random number generators. They are summarized here: K1 — A sequence of random numbers with a low probability of containing identical consecutive elements.

From the pseudorandom number generator article on Wikipedia. Let's see what we can do about that.

1
2
3
4
5
var random = new Random(116793166);
for (int i = 0; i < 9; i++)
{
    Console.Write(random.Next(10) + " ");
}


Outputs:

1 1 1 1 1 1 1 1 1

Did I just break the K1 criteria above? You know I didn't but you might be interested in how I found the seed number. It's easy but also quite computational intensive, that's why I used Parallel.For.

1
2
3
4
5
6
7
8
9
10
11
12
Parallel.For(0, int.MaxValue, i =>
{
    var rnd = new Random(i);
    bool allSame = true;
    for (int j = 0; j < 9; j++)
    {
        allSame = allSame && rnd.Next(10) == 1;
        if (!allSame) break;
    }
    if (allSame)
        Console.WriteLine(i); 
});


It took a couple of minutes for the number 116793166 to show up in my console.

To keep on playing I needed a random string generator. Mine looks like this.

1
2
3
4
5
6
7
8
9
private const string _chars = "abcdefghijklmnopqrstuvwxyz";

private static string RandomString(int size, Random rng)
{
    var buffer = new char[size];
    for (int i = 0; i < size; i++)
        buffer[i] = _chars[rng.Next(_chars.Length)];
    return new string(buffer);
}


If the random generators were the same it looks like it should give the same result as the Java one but RandomString(5, new Random(-229985452)) returns tfsld. Either my RandomString method works different than the Java one or it's the case that Java and .NET has slightly different settings of their LGCs.

Mathematically there's an infinite amount of inputs that results in a returned hello, for instance 1382472294, but here we are limited by the size of our integers.

I found the seed 1382472294 with the following little method and loop

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
private static bool CompareToRandomString(string str, Random rng) 
{ 
    for (int i = 0; i < str.Length; i++)
    {
        if (_chars[rng.Next(_chars.Length)] != str[i])
            return false;
    }
    return true;
}

...

const string lookingFor = "hello";

Parallel.For(0, int.MaxValue, i =>
{
    var rnd = new Random(i);
    if (CompareToRandomString(lookingFor, rnd))
        Console.WriteLine(i);
});


I leave it up to you to implement a break out of that loop.

Ruby does not use a LGC but a Mersenne twister instead. It's still only pseudorandom so there's no problem finding patters you like and being able to repeat them.

1
2
srand(2570940381)
9.times { print "%d " % rand(10) }

Outputs:
1 2 3 4 5 6 7 8 9

1
2
3
charset=('a'..'z').to_a
srand(124931)
puts (0...4).map{ charset.to_a[rand(charset.size)] }.join

Outputs:
ruby

A strict directed graph

Posted by Jonas Elfström Sat, 23 Oct 2010 22:39:00 GMT

Recently N. asked a question on a list that made my geek-senses tingle. He had a test with 12 levels of questions. The testee should answer 10 questions. If a question was answered correctly the level would go up and if it was not correct it would go down. If a level 12 was answered correctly, another level 12 would be asked and vice versa. The first question would be a level 6. N. wanted to know how many questions of each level he needed.

Pretty soon he got a correct answer and several wrong. One of the incorrect answers was, and I don't like to admit this, provided by me. I took the recursive route (which I very seldom do) and then tried to present the result in ASCII. I failed. The result from the algorithm in itself was correct but the presentation was simplified so much it failed to deliver the correct answer. Here it is:

             06
            05 07
          04 06 08
         03 05 07 09
       02 04 06 08 10
      01 03 05 07 09 11
    01 02 04 06 08 10 12
   01 02 03 05 07 09 11 12
 01 02 03 04 06 08 10 11 12
01 02 03 04 05 07 09 10 11 12

The problem is that this result can be read as if the testee could get two level 2 questions in a row but the rules forbids that.

Then I remembered reading about Graphviz and especially Ruby-Graphviz. That's what I used to generate the graph to the right. The graph itself is nothing but a visualization of all the possible question sequences. To find the answer that N. was looking for you can follow a level from top to bottom and count the number of times it can occur. I have to admit that it's far from the most convenient way but it works.

The result
5 of level 1, 5, 6 & 7.
4 of level 3, 4, 8, 9 & 12.
3 of level 2, 10 & 11.

I had to jump a few hurdles to get there. First my old Ubuntu-install, for unknown reasons, had some ancient version of libfreetype in /usr/local/lib/ that I had to remove. In the end an rm /usr/local/lib/libfreetype* would have been good. I also did a sudo ldconfig but I'm not sure that was necessary. Let's just hope you're on a modern system where a sudo apt-get install graphviz followed by sudo gem install ruby-graphviz will be enough.

The recursive code I wrote produces every single path the testee can take. A graph of that would be quite big since it's 1023 nodes. I then found out about strict digraph in the Graphviz DOT language. It sounded like a perfect fit and it was! One problem though, Ruby-Graphviz didn't seem to know about it. The power of open source software let me patch ruby-graphviz-0.9.18/lib/graphviz/constants.rb so that GRAPHTYPE included it

1
2
3
4
## Const: graphs type
 GRAPHTYPE = [
   "digraph", "graph", "strict digraph"
 ]


Another awesome power of open source is Grégoire Lejeune because only hours after me contacting him he added it to the GitHub repository.

Here's the code that produced the Rooted Binary Directed Acyclic Graph (or is it Rooted Directed Binary Acyclic Graph?).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
require 'rubygems'
require 'graphviz'

@min_level=1
@max_level=12
@max_depth=10
start_level=6

@g = GraphViz.new(:G, :type => "strict digraph" )

def add_node(level, depth, parent)
  if depth<@max_depth
    current=[level, depth].join(",")

    sub=level<=>@min_level
    add=@max_level<=>level
    add_node(level-sub, depth+1, current)
    add_node(level+add, depth+1, current)

    @g.add_node(current).label=level.to_s
    @g.add_edge(parent, current) unless parent=="00"
  end
end

add_node(start_level, 0, "00")

@g.output( :png => "graph.png" )


With the current parameters this problem might not be the most exciting but I believe that by just modifying them slightly we would, again, reach a wall of complexity.

Finding primes in parallel 18

Posted by Jonas Elfström Thu, 14 Jan 2010 21:55:00 GMT

Justin Etheredge has been blogging about his challenge to find prime numbers with LINQ. He later used AsParallel() (coming in .NET 4) to speed things up and then followed that up with a post about using The Sieve Of Eratosthenes.

As you can see in the comments of those posts I tried to speed the Sieve of Eratosthenes up by using Parallel.For in the inner loop. I also tried AsParallel() in the LINQ expression but it made no difference in either case. At most it got 5% faster. I'm not sure but it could be that because SoE is very memory intense we could have a scaling issue and maybe also memory bandwidth exhaustion. This is mere speculation.

I then searched for other algorithms and found The Sieve of Atkin. It uses less memory than SoE so I thought I'd give it a try.

I set the limit to 20,000,000 and then benchmarked it. It timed in on 2.48s so actually worse than the 2.2s that SoE took. Not good! Then I added Parallel.For in the loop that did most of the work and lo and behold, it scaled! I have two cores in my machine (T7200@2.0GHz) and the average runtime went down to 1.26s. That's almost linear and surprisingly good! If you happen have a quad core (or more) and feel like trying it out then please contact me. It would be interesting to see if it scales further.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
static List<int> FindPrimesBySieveOfAtkins(int max)
{
   //  var isPrime = new BitArray((int)max+1, false); 
   //  Can't use BitArray because of threading issues.

    var isPrime = new bool[max + 1];
    var sqrt = (int)Math.Sqrt(max);

    Parallel.For(1, sqrt, x =>
    {
        var xx = x * x;
        for (int y = 1; y <= sqrt; y++)
        {
            var yy = y * y;
            var n = 4 * xx + yy;
            if (n <= max && (n % 12 == 1 || n % 12 == 5))
                isPrime[n] ^= true;

            n = 3 * xx + yy;
            if (n <= max && n % 12 == 7)
                isPrime[n] ^= true;

            n = 3 * xx - yy;
            if (x > y && n <= max && n % 12 == 11)
                isPrime[n] ^= true;
        }
    });

    var primes = new List<int>() { 2, 3 };
    for (int n = 5; n <= sqrt; n++)
    {
        if (isPrime[n])
        {
            primes.Add(n);
            int nn = n * n;
            for (int k = nn; k <= max; k += nn)
                isPrime[k] = false;
        }
    }

    for (int n = sqrt + 1; n <= max; n++)
        if (isPrime[n])
            primes.Add(n);

    return primes;
}

This code needs C# 4.0 to compile.

Edit 2010-12-14

Dommer found out that the BitArray implementation had some serious threading issues. I had my worries about the non thread safe characteristics of BitArray but I thought that the isPrime[n] ^= true; was an atomic operation and that it didn't matter in what order bit bits was flipped would make it possible to use anyway. Not so. Changed it to a boolean array and that seems to rock the boat but of course at a much higher memory cost.

Edit 2010-01-20

Indications are that this does in fact not scale very good on a quad core. It's even worse, it seems it scales good on my old T7200 but not on a dual core E6320. I don't know why but of course the shared state of the isPrime BitArray is a huge problem and maybe it could be that differences in CPU architecture (FSB speed, caches and so on) in the E6320 is an explanation. Average execution time on the E6320 was 1290ms in a single thread and 1064ms in two.

If you want to try this in an older version of C# than 4.0 then check out this post.

A reader asked how I timed the executions. Here's how.

1
2
3
4
5
6
7
8
9
10
11
12
13
var steps = new List<long>();
var watch = new Stopwatch();

for (int i = 0; i < 10; i++) 
{
    watch.Reset();
    watch.Start();
    var primes = FindPrimesBySieveOfAtkins(20000000);
    watch.Stop();
    Console.WriteLine(watch.ElapsedMilliseconds.ToString());
    steps.Add(watch.ElapsedMilliseconds);
}
Console.WriteLine("Average: " + steps.Average().ToString());



Edit 2010-10-24

Tom's code from the comment below

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
using System; 
using System.Collections.Generic; 
using System.Linq; 
using System.Numerics; 
using System.Text; 
using System.Threading.Tasks;

namespace Calculate_Primes 
{ 
    class Program 
    { 
        private const int _NUMBER_OF_DIGITS = 100;

        static void Main(string[] args)
        {
            BigInteger floor = BigInteger.Parse("1" + string.Empty.PadLeft(_NUMBER_OF_DIGITS - 1, '0'));
            BigInteger ceiling = BigInteger.Parse(string.Empty.PadLeft(_NUMBER_OF_DIGITS, '9'));

            Console.WindowWidth = 150;

            //var primes = Enumerable.Range(floor, ceiling).Where(n => Enumerable.Range(1, n).Where(m => (n / m) * m == n).Count() == 2);

            Console.Clear();
            _calculatePrimes(floor, ceiling, "C:\\100 digit primes.txt");

            Console.Clear();
            _calculatePrimes(floor, ceiling, "C:\\300 digit primes.txt");
        }

        static IEnumerable<BigInteger> Range(BigInteger fromInclusive, BigInteger toInclusive)
        {
            for (BigInteger i = fromInclusive; i <= toInclusive; i++)
                yield return i;
        }

        static void ParallelFor(BigInteger fromInclusive, BigInteger toInclusive, Action<BigInteger> body)
        {
            Parallel.ForEach(Range(fromInclusive, toInclusive), body);
        } 

        static void _calculatePrimes(BigInteger floor, BigInteger ceiling, string resultsFileFilepath)
        {
            using (System.IO.FileStream fs = new System.IO.FileStream(resultsFileFilepath, System.IO.FileMode.Create)) { }

            using (System.IO.StreamWriter sw = new System.IO.StreamWriter(resultsFileFilepath))
            {
                ParallelFor(floor, ceiling, i =>
                    {
                        if (_isPrime(i))
                        {
                            lock (sw)
                            {
                                sw.Write(i.ToString() + System.Environment.NewLine);
                                sw.Flush();
                            }
                        }
                    });
            }
        }

        static bool _isPrime(BigInteger number)
        {
            bool returnValue = true;

            Console.WriteLine("Checking {0} for primality.", number.ToString());

            if ((number < 2) || (number > 2 && number.IsEven) || (number > 2 && number.IsPowerOfTwo))
                returnValue = false;
            else
                for (BigInteger i = 2; i * i <= number; i++)
                {
                    if (number % i == 0)
                        returnValue = false;
                }

            if(returnValue)
                Console.WriteLine("         {0} IS prime.", number.ToString());
            else
                Console.WriteLine("         {0} IS NOT prime.", number.ToString());

            return returnValue;
        }
    }
}

Infinite ranges in C# 2

Posted by Jonas Elfström Tue, 20 Oct 2009 18:41:00 GMT

I recently learned that C# is compliant with the IEEE 754-1985 for floating point arithmetics. That wasn't a big surprise but that division by zero is defined as Infinity in it was! It actually kind of bothers me that I didn't know this.

In mathematics division by zero is undefined for real numbers but I guess Infinity is a more pragmatic result. Or as a friend put it "IEEE stands for Institute of Electrical and Electronics Engineers not Institute of Mathematics"

1
2
3
4
double n = 1.0;
n = n / 0;
if (n > 636413622384679305)
  System.Console.WriteLine("Yes it certainly is!");


This C# code does not throw an exception, it simply leaves n defined as Infinity and writes a line to the console.

Ruby is also IEEE 754-1985 compliant. It even lets you define infinite ranges.

1
2
3
4
5
6
Infinity=1.0/0
=>Infinity
(1..Infinity).include?(162259276829213363391578010288127)
=> true
(7..Infinity).step(7).take(3).inject(&:+) # 7+14+21
=> 42


I can't say I see very much use of this but it brings a kind of completeness to the handling of infinities. Unfortunately it seems we don't get that in C# out of the box because Enumerable.Range takes <int>,<int> as parameters and there's no Infinity definition for int. That's unless someone wrote a generic Range class. Turns out none other than Jon Skeet did in his MiscUtil. Download MiscUtil and then by using MiscUtil.Collections; you can:

1
2
3
4
5
6
double n = 1.0;
var infinity = n / 0;
var r = new Range<double>(0, infinity);
if (r.Contains(4711))
  System.Console.WriteLine("Yes it certainly does!");
var sum = r.Step(7.0).Take(3).Sum();


And guess what, it works like a charm! 4711 is part of positive infinity and sum is 42.0 and all is good.

Edit

There's also a couple of predefined constants. Thanks to Eric for pointing that out.

1
2
var r = new Range<double>(7,  System.Double.PositiveInfinity);
var sum = r.Step(7.0).Take(3).Sum();


Sierpinski's shoes 4

Posted by Jonas Elfström Wed, 14 Nov 2007 22:50:00 GMT

There were no cross-platform windowing toolkits for Ruby so _why made one and he calls it Shoes. Not even close to 1.0, it's already yummy in a chunky kind of way and since it came from _why I simply had to try it out. Something simple.

Shoes.app :width => 1024, :height => 768 do  
  corners = [ {:x => 256, :y => 10}, {:x => 12, :y => 378}, {:x => 506, :y => 378} ]
  xpos,ypos,c = 256,10,0
  srand
  2111.times do
      c=rand(3)
      xpos += (corners[c][:x]-xpos)>>1
      ypos += (corners[c][:y]-ypos)>>1
      star xpos, ypos, 5, 10
  end
end

The result.

Older posts: 1 2