Tuesday, June 23, 2009

The Power of Foreach

In D, arrays can be traversed using the foreach construct:

int [] a = [1, 5, 10, 42, 13];
foreach (i;a) {
writefln("%d", i);
}

The array of integers in this example is traversed and each element printed, in the natural order in which elements appear in the sequence. To visit the elements in reverse order, simply replace foreach with foreach_reverse. It is as intuitive as it gets.
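For example, reusing the array "a" from above, the reverse traversal is just a matter of swapping the keyword:

foreach_reverse (i; a) {
    writefln("%d", i); // prints 13, 42, 10, 5, 1
}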

Moreover, linear searches can be implemented with foreach: simply break out of the loop when the searched-for value is found:

foreach (i;a) {
writefln("%d", i);
if (i == 42) {
break;
}
}

Consider now a tree data structure where a tree node is defined as:

class TreeNode {
TreeNode left;
TreeNode right;
int value;
}

What is the meaning of a statement such as foreach (node; tree) { … }? The simple answer is that with the above definition of TreeNode, the code does not compile.

But if it were to compile, what should it do? Visit the nodes in order, or in a breadth-first fashion? No answer is the right one, unless we get to know more about the problem at hand. If we’re using a binary search tree to sort some data, then foreach would most likely visit the nodes in-order; if we’re evaluating a Polish-notation expression tree, we might want to consider post-order traversal.


Foreach Over Structs and Classes



Tree data structures and tree traversal occur often in computer science problems (and nauseatingly often in job interviews). Balanced binary trees are routinely used to implement associative containers, as C++ programmers familiar with the standard set and map collections certainly know.

One difference between sequence containers (such as lists, arrays, and queues) on the one hand, and containers implemented with trees on the other, is that there are more ways to iterate over the elements of a tree than there are ways to enumerate the elements of a sequence. A list (for example) can be traversed from beginning to end and, in the case of doubly-linked lists, in reverse, from the end to the beginning; that's it. But a tree can be traversed in-order, pre-order, post-order, or breadth-first.

No built-in traversal algorithm will fit all possible application requirements. D's approach is to provide the opApply operator as "standard plumbing" into which users can plug their own algorithms for iterating over the elements of a class or struct. The operator is supposed to implement the iteration logic, and delegate the actual processing of the objects to a ... delegate:


class TreeNode {
public:
TreeNode left;
TreeNode right;
int value;
int opApply(int delegate(ref TreeNode) processNode) {
// ... tree traversal
// ...
return 0;
}
}

When the programmer writes a foreach loop, the compiler synthesizes a delegate function from the body of the loop, and passes it to opApply. In the example below, the body of the delegate will have exactly one line, containing the writefln statement:

TreeNode tree = constructTree();
foreach(node; tree) {
writefln("%d", node.value);
}

For an in-order traversal, the implementation of opApply may look something like this:

int opApply(int delegate(ref TreeNode) processNode) {
return (left && left.opApply(processNode))
|| processNode(this)
|| (right && right.opApply(processNode));
}

The delegate that the compiler synthesizes out of the foreach body returns an integer (which is zero by default). A break statement in the foreach loop translates to the delegate function returning a non-zero value. As you can see in the code above, a correct implementation of opApply should make sure that the iteration is "cancelled" when the delegate returns a non-zero value.

The traversal function’s argument must match the delegate argument type in the signature of opApply. In the example above the processNode function could modify the tree node that is passed in. If the TreeNode class writer wanted to outlaw such use, the opApply operator should have been declared to take a delegate that takes a const TreeNode parameter:

class TreeNode {
// …
int opApply(int delegate(ref const TreeNode) processNode) {
return processNode(this);
}
}

The new signature demands that the client code changes the parameter type from TreeNode to const TreeNode. Any attempt to modify the node object from within the user-supplied traversal function will fail to compile.
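For illustration, the updated client loop might look like this (constructTree() is the same helper used in the earlier snippet); the commented-out assignment shows the kind of modification that no longer compiles:

TreeNode tree = constructTree();
foreach (ref const TreeNode node; tree) {
    writefln("%d", node.value);
    // node.value = 0; // error: cannot modify const node
}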

Another possible design is to encode all traversal algorithms as TreeNode methods. The following shows an example for the in-order algorithm (other traversal algorithms are left as an exercise for the reader):

class TreeNode {
// …
int traverseInOrder(int delegate(ref int) dg) {
if (left) {
int r = left.traverseInOrder(dg);
if (r) {
return r;
}
}
int r = dg(value);
if (r) {
return r;
}
if (right) {
r = right.traverseInOrder(dg);
if (r) {
return r;
}
}
return 0;
}
}

foreach(val; &tree.traverseInOrder) {
writefln("%d", val);
}


D Generators



A generator is a function or functor that returns a sequence, but instead of building an array or vector containing all the values and returning them all at once, a generator yields the values one at a time. Languages such as C# and Python have a yield keyword for this purpose. In D a generator can be implemented with foreach and a custom opApply operator. Assume one wants to print the prime numbers up to N, like this:

foreach (i; PrimeNumbers()) {
if (i > N) {
break;
}
writeln(i);
}

To make this work, the PrimeNumbers struct could be implemented like this:

struct PrimeNumbers {
int n = 1;
int primes[];

int opApply(int delegate(ref int) dg) {
loop:
while (true) {
++n;
foreach (p; primes) {
if (n % p == 0) {
continue loop;
}
}
primes ~= n;
if (dg(n)) {
break;
}
}
return 1;
}
}

Thursday, June 18, 2009

Pragma Assembly

Publishing the source code for D.NET on CodePlex in its current (rough) form turned out to be a great idea, as I received very good feedback. Tim Matthews of New Zealand has submitted several bug reports and patches and convinced me to change my stance on the "assembly class hack".

The hack was a very limited solution to the problem of qualifying imported declarations by the name of the assembly where they live. I described the problem and the attempt to hack around it at the end of a post back in December; a diligent reader commented that I should have used the pragma mechanism instead. I resisted the suggestion at the time, mainly because I am trying to avoid changing the compiler front-end if I can help it. (Front-end changes ultimately have to be merged back into Walter's source tree, and he is a very, very busy guy.)

A D implementation on .NET is not going to be of much use without the ability to generate code that imports and interfaces with existing assemblies. Guided by this idea, Tim Matthews did a lot of trail-blazing and prototyping and showed that because in .NET namespaces can span across assemblies, there has to be a way of specifying an arbitrary number of assemblies in one import file. My "assembly class" hack allowed for one and only one assembly name to be specified.

So I had to bite the bullet and do the right thing: assembly names are now specified with a pragma, like this:

pragma (assembly, "mscorlib")
{
// imported declarations here...
}

Any number of such blocks can appear within a D import file. And in the future, the code will be extended to allow a version and a public key to be specified after the assembly name.
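For illustration, an import file that pulls declarations from more than one assembly might then look something like this (the second assembly name is just an example):

pragma (assembly, "mscorlib")
{
    // declarations imported from mscorlib ...
}

pragma (assembly, "System.Windows.Forms")
{
    // declarations imported from the Windows Forms assembly ...
}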

Another thing that has to be fixed is to make the compiler adhere to the convention of using a slash between enclosing classes and nested types, as described in Serge Lidin's Expert .NET 2.0 IL Assembler, Chapter 7, page 139:
"Since the nested classes are identified by their full name and their encloser (which is in turn identified by its scope and full name), the nested classes are referenced in ILAsm as a concatenation of the encloser reference, nesting symbol / (forward slash), and full name of the nested class."


Tim also submitted a front-end patch that allows directories and import files of the same name to coexist, so that the code below compiles without errors:

import System;
import System.Windows.Forms;


An alternative solution that I proposed was to create import files named "this.d", so that the above would have read:

import System.this;
import System.Windows.Forms;


After some consideration, I bought Tim's point that this is not what .NET programmers would expect, and went on to apply his patch.

Sunday, May 10, 2009

D .NET on Codeplex

I will be taking a few days off next week, and I decided to upload the code for the D compiler's .NET back-end to CodePlex before I close shop. I hope that it will provide context for the last few months' worth of blog posts.

Most core language features are usable, but there's no Phobos port and if you need to import functionality from external DLLs, you'll have to hand-write some import files (following the model in druntime/import/System.di); I hope to get around to writing a tool that automates the process one of these days -- and it will most likely be in D.

There is a Visual Studio 8 solution file in the source tree (it works with Visual Studio 8 Express), and there is also a Makefile for the adventurous. The project compiles on Linux but does not quite work with Mono -- I strongly suspect that the segfault is on Mono's side.

Check it out at http://dnet.codeplex.com/!

Tuesday, April 28, 2009

Argument against _argptr

Variadic functions work slightly differently in my D.NET implementation than under the native D compiler.

For functions with variable numbers of arguments, the native compiler synthesizes two parameters: _arguments and _argptr; _arguments is an array of TypeInfo objects, and _argptr is a pointer to the beginning of the variable arguments on the stack. The user is supposed to query the type information in _arguments, and do the proper pointer arithmetic to navigate the arguments. You can see some examples at http://www.digitalmars.com/d/2.0/function.html:

void printargs(int x, ...)
{
writefln("%d arguments", _arguments.length);
for (int i = 0; i < _arguments.length; i++)
{ _arguments[i].print();

if (_arguments[i] == typeid(int))
{
int j = *cast(int *)_argptr;
_argptr += int.sizeof;
writefln("\t%d", j);
}
else if (_arguments[i] == typeid(long))
{
long j = *cast(long *)_argptr;
_argptr += long.sizeof;
writefln("\t%d", j);
}
// ...


The pointer arithmetic is not verifiable in managed code. A separate array of type descriptors is not necessary in .NET, because the type metadata can be passed in with the arguments.

In D.NET, the variable arguments are passed as an array of objects. For example, for a D function with the prototype
void fun(...)
the compiler outputs:

.method public void '_D23funFYv' (object[] _arguments)

I handled variadic support slightly differently from the native compiler: I dropped _argptr and provided a new helper function, _argtype, that can be used as demonstrated in this example:

void fun(...)
{
foreach(arg; _arguments)
{
if (_argtype(arg) == typeid(int))
{
int i = arg;
Console.WriteLine("int={0}".sys, i);
}
else if (_argtype(arg) == typeid(string))
{
string s = arg;
Console.WriteLine(s.sys);
}
}
}

If the type of the arguments is known, there is no need to check for the typeid:

void fun(...)
{
foreach(arg; _arguments)
{
int i = arg;
Console.WriteLine(i);
}
}

If an incorrect type is passed in, it is still okay, because the error is detected at runtime.

fun("one", "two", "three"); // int i = arg will throw


The downside of this approach is that it is not compatible with the native code. This does not affect template variadic functions, which should be perfectly portable.
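For comparison, here is a minimal sketch of a template variadic function; because the argument types are known at compile time, no runtime type inspection (and no _arguments array of objects) is involved:

void fun(T...)(T args)
{
    foreach (arg; args)
    {
        // each arg has its own static type; the loop is expanded at compile time
    }
}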

Slice and D-ice

It is amazing how much insight one can get into a language by simply writing a compiler for it... Today I am going to spend half a lunch break copying and pasting into this post a stack of notes related to array slices (collected over the past few months of working on a .net back-end for the D compiler).

D sports a data type called array slice that is intended to boost the performance of array-intensive computations. A slice is a lightweight value type, conceptually equivalent to a range of array elements, or a "view" into an array. It can be thought of as consisting of a reference to an array, and a pair of begin-end indexes into the array:

struct ArraySlice(T) {
T[] a; // reference to array
int begin; // index where slice begins
int end; // one past index where slice ends
}

The actual representation of a slice is currently internal to the compiler, and completely opaque to the programmer. The template struct above is not what the layout of a slice actually looks like, it is intended for illustrative purposes only.

Consider this code:

int a[] = [1, 3, 5, 7, 9, 11, 13, 17];
int[] s = a[2..5];

The second declaration introduces "s" as a slice of array "a", starting at index two and ending at (but not including) index five. Using the template pseudo-definition of ArraySlice, the code is conceptually equivalent to:

int a[] = [1, 3, 5, 7, 9, 11, 13, 17];
ArraySlice!(int) s = { a, 2, 5 };

To understand how array slices may help performance, consider an application that reads and parses XML files. The input can be loaded as an array of characters (a huge string). The application builds a DOM representation of the input, and each node in the DOM contains a token string. This approach is wasteful, because the token strings hold copies of character sequences that are already present in the input; copying tokens around has linear complexity (it is directly proportional to the number of characters in the token) and the same is true for the spatial complexity (how much memory is being used). But XML tokens could be modeled as slices of character arrays ("views" into the original XML input string), and complexity in both time and space would drop down to a constant value.
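As a toy illustration (not actual parser code), a token can be represented as a view into the input, with no copying at all:

string input = "<name>example</name>"; // stands in for the whole XML document
string tag  = input[1 .. 5];           // "name": a view into input, nothing is copied
string text = input[6 .. 13];          // "example": another view, constant time and space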

This design can be implemented in languages other than D, but memory management issues may add unnecessary complexity. In C++, for example, we would have to make sure that the lifetime of the token slices does not exceed the scope of the original input string. D is a garbage-collected language, and holding a reference to an array slice indirectly keeps the original array "alive", because the slice contains a reference to it.

Now that the design rationale behind array slices is understood, let's take another look at the syntax. You have probably noticed that in the statement:
int[] s = a[2..5];

The declaration part introduces "s" as an array of integers; it is not until you see the right-hand side of the assignment that the true, lightweight, array-slice nature of "s" is revealed.
D has no special syntax for disambiguating between "true" arrays and array slices in declarations; they can be used interchangeably. As a matter of fact, a function signature with array parameters will happily accept a slice as argument. In the following code, both "a" and "s" are legal arguments for the count function:

int count(int[] arr) {
return arr.length; // return the number of elements in arr
}
writefln("%d", count(a)); // ok
writefln("%d", count(s)); // also ok


Resizing Slices


Both arrays and slices support the built-in length property. As you expect, an expression such as a.length tells you how many elements are present in the array; the property applies to array slices as well, and in that case it gives the number of elements within the slice range. For example, the output of the following code is "3":

int a[] = [1, 2, 3, 5, 7, 9, 11, 13, 17];
int[] s = a[2..5];
writefln("%d", s.length); // prints 3

So far so good, but I forgot to mention that the length property is read-write: not only can you query the size of an array, you can also change it, like this:

a.length = 100; // extend the array to 100 elements

Assignment to the length property resizes the array. This raises the question: what happens when an array slice is resized? The answer, of course, is "it depends".
With "a" and "s" defined as above, let's say we resize the "s" slice from 3 elements to 7:

s.length = 7;

This extends the view of "s" into "a" up to the ninth element of "a" ("s" starts at 2). It is as if we had written:

s = a[2..9];

The slice is still within the bounds of the "a" array. Resizing it is a constant-time operation that changes one field inside the internal representation of "s". If instead of the built-in slice we had used the template ArraySlice struct introduced above, resizing the slice would have amounted to:

int a[] = [1, 3, 5, 7, 9, 11, 13, 17];
ArraySlice!(int) s = { a, 2, 5 };
s.end = 9; // resize the slice

Because "s" is simply a "view" of the array, modifying an element in the array is immediately reflected into the slice, for example:

writefln("%d %d", a[2], s[0]); // prints "5 5"
a[2] = 23;
writefln("%d %d", a[2], s[0]); // prints "23 23"

What happens if we re-size "s" past the end of "a"?

s.length = 100; // what does this do?

The answer is that the behavior is up to the compiler. The current native compiler from Digital Mars changes the type of "s" from a lightweight view into an array to a full-fledged array, and re-sizes it to fit 100 elements.

int a[] = [1, 3, 5, 7, 9, 11, 13, 17];
int[] s = a[2..5];
writefln("%d %d", a[2], s[0]); // prints "5 5"
s.length = 100;
a[2] = 23;
writefln("%d %d", a[2], s[0]); // prints "23 5"

In other words, resizing a slice past the end of the original array breaks up the relationship between the two, and from that point they go their separate merry ways.
This behavior also underscores the schizophrenic nature of array slices, which are views rather than full copies of arrays (until, that is, they change their mind).

Concatenating Slices


We saw how array slices may be re-sized via the "length" property. Slices may be re-sized implicitly via concatenation, as in the following example:

int a[] = [1, 2, 3, 5, 7, 9, 11, 13, 17];
int s[] = a[0..5];
s ~= [23, 29];

The tilde is the concatenation operator for arrays, strings, and slices. In this example, the slice is resized to accommodate the elements 23 and 29. Note that even in this situation the resizing is different from what would have happened had we written:
s.length += 2;

Extending the length by two elements simply bumps up the upper limit of the slice, because there is still room in the original array "a". As we saw in the previous section, if the new length exceeded the bounds of the original array, the slice would be "divorced" from the original array and promoted from a lightweight view to a full, first-class array; extending the length by just two elements does not exceed the bounds of "a".

However, in the case of appending (as in s ~= [23, 29]), in addition to resizing we are also setting the values of two additional elements. The slice needs to be divorced from the array so that a[5] and a[6] are not overwritten with the values 23 and 29. The compiler turns "s" into a full array of length + 2 == 7 elements, copies the first five elements of "a", then appends the values 23 and 29.

The problem, as with resizing past the bounds of the original array, is that after the array and the slice part ways, it is no longer possible to modify value types in the original array via the slice (which has now been promoted to a standalone array). This is a run-time behavior, hard to predict by statically analyzing (or visually inspecting) the code.
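To make the divorce concrete, here is a small sketch that follows the append behavior described above; after the concatenation, writes to the original array are no longer visible through the slice:

int a[] = [1, 3, 5, 7, 9, 11, 13, 17];
int s[] = a[0..5];
s ~= [23, 29];    // the append copies: "s" is now its own array of 7 elements
a[0] = 100;
writefln("%d %d", a[0], s[0]); // prints "100 1": the slice no longer sees changes to "a"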

Rolling Your Own Array Slices



It is impossible to determine statically whether a D function parameter is an array or a slice by examining the function's code alone. It is up to the function's caller to pass in an array or a slice.

void f (int[] a) {
// ...
}
int a[] = [1, 2, 3, 5, 7, 9, 11, 13, 17];
f(a); // call f with an array argument
f(a[2..5]); // call f with a slice argument;

In some cases you may want to better communicate that your function is intended to work with slices rather than arrays. You may also want to have better control over the slice's properties. Say for example you want to make sure that a slice is never re-sized.

You can accomplish these things by rolling your own ArraySlice. The template struct that was introduced earlier is a good starting point. The signature of "f" can be changed to:

void f (ArraySlice!(int) a) { // ...

That's a good start, but the struct is not compatible with a built-in array slice. The following code does not compile:

struct ArraySlice(T) {
T[] a; // reference to array
int begin; // index where slice begins
int end; // one past index where slice ends
}

int a[] = [1, 2, 3, 5, 7, 9, 11, 13, 17];
ArraySlice!(int) s = { a, 2, 5 };
foreach(i; s) { // error, does not compile
writefln("%d", i);
}

You could of course rewrite the foreach loop to use the begin..end range:

foreach(i; s.begin..s.end) {
writefln("%d", s.a[i]);
}

In addition to being more verbose, such code is not very well encapsulated, since it explicitly accesses the struct's public members. If we later decide to factor out ArraySlice into its own module, and make the "a", "begin", and "end" members private, the code above will not compile anymore.

All that's preventing the compact version of the foreach loop from compiling is that the opApply operator is missing. So let's add one:

struct ArraySlice(T) {
// ...
int opApply(int delegate(ref int) dg) {
foreach (i;begin..end) {
dg(a[i]);
}
return 0;
}
}

Great! This gets us past the compilation error. The foreach loop now compiles and prints out all the elements in the slice. There is a small and subtle bug in this code though. Suppose that instead of printing all elements in the slice, you're doing a linear search, for example:

foreach(i; s) {
if (i == 5) { // found it!
break;
}
}

Astonishingly enough, this code will not break out of the foreach loop; instead, it will continue through all the elements in the slice.

The D documentation at http://www.digitalmars.com/d/2.0/statement.html prescribes the behavior of the opApply operator:
"The body of the apply function iterates over the elements it aggregates, passing them each to the dg function. If the dg returns 0, then apply goes on to the next element. If the dg returns a nonzero value, apply must cease iterating and return that value".
The D compiler synthesizes the delegate function from the body of the foreach loop, and the code above is transformed internally to this equivalent form:

int dg(ref int i) {
if (i == 5) {
return 1;
}
return 0;
}
s.opApply(&dg);

The bug in the opApply operator is that, when dg returns a non-zero value, the iteration should stop and that value should be returned:

int opApply(int delegate(ref int) dg) {
foreach (i; begin..end) {
if (int r = dg(a[i])) return r;
}
return 0;
}

Now foreach works correctly. To support foreach_reverse, just add an opApplyReverse member function to the ArraySlice template struct:

int opApplyReverse(int delegate(ref int) dg) {
foreach_reverse (i; begin..end) {
if (int r = dg(a[i])) return r;
}
return 0;
}

What about manipulating the length of the slice? Neither of the lines below compiles:

writefln("%d", s.length);
s.length = 100;

To support the length property, we have to add these methods:

struct ArraySlice(T) {
// ...
// return the length of the slice
int length() { return end - begin; }

// resize the slice, preventing it from growing past
// the original array's length
void length(int newLength) {
end = begin + newLength;
if (end > a.length) { end = a.length; }
}
}

To prevent resizing the slice, all there is to do is to leave the second overload undefined, effectively turning length into a read-only property.

Clients of the struct can still set its individual members to inconsistent values. To disallow incorrect usage, the "a", "begin", and "end" members can be made private (and the struct will have to move to its own module, because in D private access is not enforced if the class or struct lives in the same module as the client code).

To make the ArraySlice struct even more source-compatible with built-in slices, you can give it an opIndex operator:

struct ArraySlice(T) {
// ...
T opIndex(size_t i) { return a[begin + i]; }
}

The opIndex operator allows this to work:
writefln("%d", s[3]);

Assignment to array elements is not allowed:

s[3] = 42; // error, does not compile

If you want the array elements to be modified via the slice like that, just define opIndexAssign (note that the assigned value comes first in its parameter list, followed by the index):

struct ArraySlice(T) {
// ...
void opIndexAssign(T val, size_t i) { a[begin + i] = val; }
}

When you put it all together, the ArraySlice struct will look something like this:

struct ArraySlice(T) {
private:
T[] a; // reference to array
int begin; // index where slice begins
int end; // one past index where slice ends
public:
T opIndex(size_t i) { return a[begin + i]; }
void opIndexAssign(T val, size_t i) { a[begin + i] = val; }
int length() { return end - begin; }
// comment this function out to prevent resizing
void length(int newLength) {
end = begin + newLength;
if (end > a.length) { end = a.length; }
}

// support foreach
int opApply(int delegate(ref int) dg) {
foreach (i; begin..end) {
if (int r = dg(a[i])) return r;
}
return 0;
}
}

In conclusion, by rolling your own array slice implementation the intent of your code becomes clearer, your level of control over it increases, and you can still retain the brevity of the built-in slices.

Friday, April 24, 2009

Static ctors in D.NET (Part 2)

D allows multiple static constructors per class (all sharing the same signature). For example, the following code is legal:

version(D_NET)
{
import System;
alias Console.WriteLine println;
}
else
{
import std.stdio;
alias writefln println;
}
class A
{
static int i = 42;
static this()
{
println("static A.this 1");
}
static this()
{
println("static A.this 2");
}
}
void main()
{
println(A.i);
}

The program prints:

static A.this 1
static A.this 2
42

Because IL does not allow duplicate methods with the same signature, instead of mapping static constructors directly to .cctor methods, my compiler generates one .cctor per class (where needed) that makes function calls to the static this() constructors. The .cctor is not called if the class is never referenced -- this behavior is different from the native Digital Mars D compiler. If we comment out the one line in main, it will still print the constructor messages in native mode, but not under the .net compiler.

D classes may also have one or more static destructors, as in this example:

class A
{
static int i = 42;
static this()
{
println("static A.this 1");
}
static this()
{
println("static A.this 2");
}
static ~this()
{
println("static A.~this 1");
}
static ~this()
{
println("static A.~this 2");
}
}


Unlike with the class constructors, there is no special IL method to map static destructors to. My compiler supports them with AppDomain.ProcessExit event handlers, registered in reverse order of their lexical occurrences. IL allows non-member .cctor methods, and the compiler takes advantage of this feature to synthesize code that registers the static destructors as ProcessExit handlers.

It is interesting to observe that the global .cctor does reference the class when it constructs the event handler delegates:

.method static private void .cctor()
{
// register static dtor as ProcessExit event handler
call class [mscorlib]System.AppDomain [mscorlib]System.AppDomain::get_CurrentDomain()
ldnull
ldftn void 'example.A'::'_staticDtor4'(object, class [mscorlib]System.EventArgs)
newobj instance void [mscorlib]System.EventHandler::.ctor(object, native int)
callvirt instance void [mscorlib]System.AppDomain::add_ProcessExit(class [mscorlib]System.EventHandler)
// register static dtor as ProcessExit event handler
call class [mscorlib]System.AppDomain [mscorlib]System.AppDomain::get_CurrentDomain()
ldnull
ldftn void 'example.A'::'_staticDtor3'(object, class [mscorlib]System.EventArgs)
newobj instance void [mscorlib]System.EventHandler::.ctor(object, native int)
callvirt instance void [mscorlib]System.AppDomain::add_ProcessExit(class [mscorlib]System.EventHandler)
ret
}

This means that the .cctor of the class will be called, even if no user code ever references it.

In addition to class static constructors and destructors, D also features module static constructors and destructors. These are expressed as non-member functions with the signature static this() and static ~this(), respectively.
For example:

//file b.d
import a;
version(D_NET)
{
import System;
alias Console.WriteLine println;
}
else
{
import std.stdio;
alias writefln println;
}

static this()
{
println("module B");
map["foo"] = "bar";
}
static this()
{
println("boo");
}
static ~this()
{
println("~boo");
}

//file a.d
version(D_NET)
{
import System;
alias Console.WriteLine println;
}
else
{
import std.stdio;
alias writefln println;
}

string map[string];

static this()
{
println("module A");
}
static ~this()
{
println("~module A");
}

void main()
{
foreach (k, v; map)
{
version(D_NET)
{
Console.WriteLine("{0} -> {1}".sys, k, v.sys);
}
else
{
writefln("%s -> %s", k, v);
}
}
}

It is noteworthy that, regardless of the order in which the two files above are compiled, the resulting program prints the same output:

module A
module B
boo
foo -> bar
~boo
~module A

The explanation lies in the D language rules: if a module B imports a module A, the imported module (A) must be statically initialized first (before B).

As in the case of static constructors and destructors for classes, the compiler uses the global, free-standing .cctor method to stick in calls to the module ctors and to register the ProcessExit handlers that call the module's static dtors.


Thanks to BCSd for prompting this post with his comment and code sample.

Tuesday, April 14, 2009

Static Constructors in D.NET

The D programming language features static constructors that are similar to class constructors in C#: they are called automatically to initialize the static fields (shared data) of a class.

At the IL level, static constructors are implemented by the special .cctor methods. The experimental compiler for D that I am working on in my virtual garage groups together user-written static constructor code with static field initializers into .cctors (and I believe that the C# compiler does the same).

class Example {
static int solution = 42; // assignment is moved inside .cctor
static double pi;

static this() { // explicit static ctor ==> .cctor
pi = 3.14159;
}
}
The code above produces the same IL as:

class Example {
static int solution = 42;
static double pi = 3.14159;
}

Also, the compiler synthesizes one class per module to group all "free-standing" global variables (if any).

For example, the IL code that is generated out of this D source

static int x = 42;

void main() {
writefln(x);
}

is equivalent to the code generated for:

class ModuleData {
static int x = 42;
}
void main() {
writefln(ModuleData.x);
}

only that in the first case the ModuleData class is implicit and not accessible from D code. This strategy allows for the initializers of global variables to be moved inside the .cctor of the hidden module data class.

IL guarantees that the class constructors are invoked right before any static fields are first referenced. If the compiler flags the class with the beforefieldinit attribute, then the class constructors are called even earlier, i.e. right when the class is referenced, even if no members of the class are ever accessed (the C# compiler sets the beforefieldinit attribute on classes without explicit class constructors).

Serge Lidin explains all the mechanics in his great book Expert .NET 2.0 IL Assembler, and recommends avoiding beforefieldinit, on grounds that invoking a .cctor is a slow business. I am considering using it though, on the synthesized module data class.

In conjunction with a compiler-generated reference to the module data class, the beforefieldinit attribute will guarantee that the global variables are initialized on the main thread, and will avoid strange race conditions and bugs.

Saturday, April 04, 2009

No Teleprompter to Blame

I will never run for President of the United States. Not because I wouldn't like to, but because I can't: I am a Naturalized Citizen, one step below the Natural Born One. As depressing as this can be, there's a good side to it, too: I can always recant.

I have the luxury to lightheartedly declare "Folks, I don't know what I was smoking when I said that D structs cannot be implemented as value types in .net", without being afraid of losing any points in any poll.

Further research proved that my initial argument (that in .NET value types do not participate in garbage collection) was... er, irrelevant. That's because Walter Bright's D compiler front-end is smart enough to insert calls to the structs' destructors wherever necessary! Now that's what I call a bright design. It doesn't matter that the CLR does not garbage-collect new-ed value types, because the front-end generates the code that deletes them.

I was running some compiler back-end tests for the D language post-blit operator when I realized that copying value types in IL is straight-forward; you LOAD the source variable / field / argument and STORE into the destination (boom!) whereas bit-copying managed classes is not as trivial.

I did not give up right away. Hanging on to my structs-as-classes implementation, I wrote a run-time helper blitting routine:

using System;
using System.Runtime.InteropServices;

// assumes src and dest are of the same type
static public void blit(Object dest, Object src)
{
int size = Marshal.SizeOf(src.GetType());
IntPtr p = Marshal.AllocHGlobal(size);
try
{
Marshal.StructureToPtr(src, p, false);
Marshal.PtrToStructure(p, dest);
}
finally
{
Marshal.FreeHGlobal(p);
}
}

It did not take long for the truth to dawn upon me: Wow, bit-blitting non-value types is a major pain in the (oh) bum (ah). Efficient that code ain't (or should I say ISN'T? man do those consonants hurt). Honestly, I am not even sure that code is kosher. Better not to get into a pickle if one can avoid it... so back to the struct-as-value type implementation I am.

Saturday, March 28, 2009

D for Programmers

I wrote a while ago about implementing thread support in the D .net compiler. The idea was to generate code that constructs delegates that are passed to the Start method of the System.Threading.Thread class. I discussed some details of constructing delegates out of nested functions, and I concluded that my next thread-related task was to implement support for the synchronized keyword.

Like the postman who goes for a walk in the park after coming home from work, I relax in the after hours by working on pet projects like C++ debuggers and D compilers. So this Saturday morning I sat down to implement the code generation for synchronized. The D language accepts two statement forms:
synchronized ScopeStatement
synchronized (Expression) ScopeStatement
The former can be easily reduced to the latter by synthesizing a global object and an expression that references it.

Here is a sample of D code that illustrates the use of the synchronized keyword:

import System;

class Semaphore {
bool go;
}
Semaphore sem = new Semaphore;

void main() {
void asyncWork() {
while (true) { // busy wait
synchronized(sem) {
if (sem.go) break;
}
}
}
Threading.Thread t = new Threading.Thread(&asyncWork);
t.Start();
synchronized(sem) {
sem.go = true;
}
t.Join();
}
A synchronized statement can be transformed into the following pseudo-code:
object m = Expression();
try {
lock(m);
ScopeStatement();
}
finally {
unlock(m);
}
Implementing lock / unlock maps naturally to the Enter / Exit methods of the System.Threading.Monitor class. That's really all there is to it, generating the method calls is trivial.
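In other words, the code that the compiler generates for the synchronized block around sem amounts to something like this D-style pseudo-code (Monitor being the System.Threading class mentioned above):

Object m = sem;             // evaluate the Expression once
Threading.Monitor.Enter(m); // lock
try {
    sem.go = true;          // the ScopeStatement
}
finally {
    Threading.Monitor.Exit(m); // unlock, even if the body throws
}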

I was a bit disappointed by how easy the task turned out to be, but on the bright side I had plenty of time left to spend with my kid. I took him to Barnes and Noble to check out the Thomas trains and board books and the computer section, where I found the most helpful book title ever: "C# for Programmers". I guess no lawyer accidentally wandering through the computer book section can claim that he bought the book because the title was misleading. I wish that all book titles be so clear: "Finance for Accountants" or "Elmo Board Book for Toddlers".

Wednesday, March 25, 2009

Another Perl in the Wall

The most successful computer languages out there were born out of concrete problems: Perl in the beginning was nothing more than a Reporting Language; C came out of the need to write OSes in a portable fashion; PHP emerged from somebody's need to express dynamic web content as server-side macros. C# solves the problem of doing component programming without intricate COM knowledge and two PhDs per developer.

Typically, after the problem is solved, and the engineers scratched their itch, wrote the code, and shipped the (working!) products, some rather academic type decides: "Now I am going to redo it the right way" (and that's how UNIX got rewritten as Plan 9, a great OS that nobody uses).

Interestingly enough, the redesigned products rarely end up being as successful as the original. People have been trying for years to replace things such as Windows, Office and the C programming language with "better" rewrites, but the market does not seem to care much. If it was good enough the first time around, it got adopted and used. Who cares how neat (or messy) the internal plumbing is?

The C and C++ languages are great for doing low-level systems programming; they may be harder to use for constructing applications, and I would definitely advise against using C++ for web development. The D programming language fancies itself a better C++, and I think that is true in the application development area. But I do not see D as a systems language. I will never write a task scheduler in a garbage-collected language. When I write system-level stuff, I want the good old WYSIWYG behavior: no garbage collector thread running willy-nilly and no strange array slices (that are like arrays except for when they aren't). And thanks but no thanks, I want no memory fences, no thingamajigs inserted with parental care by the compiler to protect me from shooting my own foot on many-core systems. That is the point of systems programming: I want the freedom to shoot myself in the foot.

I have been trying (unsuccessfully) to argue with some folks on the digitalmars.d newsgroup that the new 2.0 D language should not worry much about providing support for custom allocation schemes. D is designed to help productivity: it relieves programmers from the grueling tasks of managing memory, and it encourages good coding practices, but a systems language it is not. We already have C, C++ and assembly languages to do that low-level tweak when we need it.

Sadly, some of the people involved with the design of D 2.0 are aiming for the moral absolute, rather than focusing on shipping something that works well enough. I think it is a bad decision to allow for mixing the managed and unmanaged memory paradigms; it is even worse that there are no separate pointer types to disambiguate between GC-ed and explicitly-managed objects. C++ went that route in its first incarnation, and it wasn't long before people realized that it was really hard to keep track of which objects live on the garbage-collected heap and which objects are explicitly managed. A new pointer type had to be invented (the one denoted by the caret) to solve the problem.

If folks really want to use D to program OSes and embedded devices and rewrite the code that controls the brakes in their cars, they should at least make a separate D dialect and name it for what it is: Systems D, Embedded D, or something like that. The garbage collection and other non-WYSIWYG features should be stripped out of such a dialect.

The ambition of making D 2.0 an all-encompassing, morally absolute language may cause 2.0 to never ship, never mind gain wide adoption. Perl started out with more modest goals and ended up enjoying a huge mind share.

So the ship-it today if it works hacker in me feels like dedicating a song to the language purists:
We don't need no education
Perl's best language of them all,
We don't need no education
Thanks a bunch to Larry Wall!

Friday, March 20, 2009

Associative Arrays in D.NET

The D Programming Language supports foreach loops over associative arrays.

Associative arrays are data structures that look much like "normal" arrays, but the index types are not integers. Here's an example of an array of doubles indexed by strings, expressed in D:

double [string] a = [ "pi" : 3.14159, "e" : 2.718281828, "phi" : 1.6180339 ];

The D language does not explicitly specify how associative arrays should be implemented. In C++ associative arrays can be implemented as standard STL maps, hash_maps (to be replaced by unordered_maps in C++0x) or with Google's sparse hash maps, to name a few possibilities.

In other languages such as Python, associative arrays are called dictionaries. The family of .NET languages takes advantage of the System.Collections.Generic.Dictionary class (this is also what the D compiler for .NET does: it implements associative arrays as system dictionaries).

D provides an easy way to iterate over an associative array, the foreach keyword. This keyword should be familiar to anyone programming in C#, UNIX shell, or managed C++. Here is an example for how it is being used in D:

foreach (string key, double value; a) {
version(D_NET) {
Console.WriteLine("{0}={1}".sys, key, value);
}
else {
writefln("%s=%f", key, value);
}
}

The types for the key and value arguments can be explicitly specified, but that is not necessary as the compiler can infer them automatically. The foreach line can be re-written more compactly as:

foreach (key, value; a) {

Another legal form for the statement, used to iterate over the values (and ignore the keys) is:

foreach (value; a) {


The current implementation of the compiler front-end synthesizes a nested function out of the loop's body. The .NET back-end constructs a delegate out of this nested function and its closure; then it "wimps out" and calls a run-time helper written in C#:

public class AssocArray
{
public delegate int Callback<V>(ref V value);
public delegate int Callback<K, V>(K key, ref V value);

static public void Foreach<K, V>(Dictionary<K, V> aa,
int unused,
Callback<V> callback)
/// ...
static public void Foreach<K, V>(Dictionary<K, V> aa,
int unused,
Callback<K, V> callback)
/// ...
}

The generic Foreach function has two overloads, to accommodate both forms of the foreach statement.

D rules do not allow an array to be modified from within the loop, but the elements of the array can be modified if the value argument has a ref storage class:

foreach (key, ref value; a) {
value = 0.0;
}
C#'s rules are stricter: one can modify neither the collection (by adding or removing elements) nor the individual elements. To work around this restriction, the run-time helper code does two passes over the dictionary that corresponds to the D associative array:

static public void Foreach<K, V>(Dictionary<K, V> aa,
int unused, Callback<K, V> callback)
{
Dictionary<K, V> changed = new Dictionary<K, V>();
foreach (KeyValuePair<K, V> kvp in aa)
{
V temp = kvp.Value;
int r = callback(kvp.Key, ref temp);
if (!kvp.Value.Equals(temp))
{
changed[kvp.Key] = temp;
}
if (r != 0)
{
break;
}
}
foreach (KeyValuePair<K, V> kvp in changed)
{
aa[kvp.Key] = kvp.Value;
}
}

The Callback delegate is constructed from the address of a closure object and a nested foreach function, both synthesized in the compiler. The generated code looks something like this:

newobj instance void 'vargs.main.Closure_2'::.ctor()
stloc.s 1 // '$closure3'
ldloc.1 // '$closure3'
ldftn instance int32 'vargs.main.Closure_2'::'__foreachbody1' (float64& '__applyArg0')
// construct Foreach delegate
newobj instance void class [dnetlib]runtime.AssocArray/Callback`1::.ctor(object, native int)
.line 14
call void [dnetlib]runtime.AssocArray::Foreach(
class [mscorlib]System.Collections.Generic.Dictionary`2,
int32,
class [dnetlib]runtime.AssocArray/Callback`1)

Edit: After writing this piece I noticed that I forgot to mention one interesting side effect of my implementation: because there is no try / catch around the Callback call in the C# run-time support code, the foreach loop has all-or-nothing transactional semantics.

For example, this program has different outputs when compiled with DMD from when it is compiled with my D / .NET compiler:

version(D_NET)
{
import System;
import dnet;
}
else
{
import std.stdio;
}

void main()
{
int [string] x = ["one" : 1, "two" : 2, "three" : 3];

try
{
foreach (ref v; x)
{
if (v == 3)
throw new Exception("kaboom");
v = 0;
}
}
catch (Exception e)
{
version(D_NET)
{
Console.WriteLine(e.toString().sys);
}
else
{
writefln("%s", e.toString());
}
}
foreach (k, v; x)
{
version(D_NET)
{
Console.WriteLine("{0}, {1}".sys, k, v);
}
else
{
writefln("%s, %d", k, v);
}
}
}

Under D/.NET it prints:

kaboom
one, 1
two, 2
three, 3

while the native compilation gives:

object.Exception: kaboom
two, 0
three, 3
one, 1

It would be very easy to get my compiler to emulate the native behavior, but I kind of like the "transactional" flavor...

Sunday, March 15, 2009

D Conditional Compilation

The D programming language supports conditional compilation using version identifiers and version numbers, a solution that is slightly better than the #ifdef, pre-processor driven, way of C/C++ that most of us are used to.
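As a quick refresher, here is a minimal sketch of a user-defined version identifier; the identifier and file names are made up, and the first block would be enabled with something like dmd -version=Demo example.d:

version(Demo)
{
    pragma(msg, "building the demo configuration");
}
else
{
    pragma(msg, "building the regular configuration");
}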

When using the .NET compiler for D that I am developing, one will be able to import and take advantage of .NET assemblies. For example the System.Console.WriteLine family of functions may come in handy. But such code would not compile when fed to the native Digital Mars D compiler.

Conditional compilation and the version identifier D_NET do the trick, like in this example:

version(D_NET)
{
import System;
import dnet;
}
else
{
import std.stdio;
}

void main()
{
int [string] x;

x["one"] = 1;
x["two"] = 2;

foreach (k, v; x)
{
version(D_NET)
{
Console.WriteLine("{0}, {1}".sys, k, v);
}
else
{
writefln("%s, %d", k, v);
}
}
}

So I hacked the front-end of the D for .NET compiler to predefine D_NET.

Of course, abusing conditional compilation will yield code that is as unreadable and hard to grasp as a C++ source littered with #ifdef ... #else ... (or as the US tax code).

But I am a strong supporter of The Second Amendment of the Internet Constitution: "the right of the People to keep and bear compilers that let them shoot themselves in the foot shall not be infringed".

Wednesday, March 11, 2009

Strumming Strings In the D Chord

In my recent interview with the InfoQ technology magazine I was asked about compatibility issues between D and .NET. I replied with a brief description of how array slices in D raise a conceptual incompatibility: System.Array and System.ArraySegment are distinct, unrelated types. In D, array slices are indistinguishable from arrays, and this creates the problems that I mentioned in the interview.

But there are other incompatibilities between D and .NET that I did not mention because I wanted to keep the interview lean and focused.

Take for example the built-in support for strings.

The keyword string is shorthand in both IL and C# for System.String (essentially a sequence of Unicode characters in the range U+0000 to U+FFFF).

In the D programming language, string is shorthand for invariant char[] and characters are unsigned bytes.

Side notes: D uses UTF8 to support foreign languages, and also sports the types wstring (a sequence of 2 byte-wide characters, compatible with Microsoft's Unicode) and dstring (for UTF-32 strings). Wide and double-wide (UTF-32) string literals are denoted by the "w" and "d" respective suffixes, as in: "Hello"w, "Good Bye"d (UTF-32 dchar works in the native D compiler, but is not currently supported in my .NET compiler).
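A minimal illustration of the three string types:

string  s = "hello";  // UTF-8 code units (unsigned bytes)
wstring w = "hello"w; // UTF-16 code units, two bytes wide
dstring d = "hello"d; // UTF-32 code units (native compiler only, per the note above)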

When a variable of type string is encountered in a D source, the compiler emits a corresponding IL variable with the type unsigned int8[].

In IL there is a special instruction, ldstr, for loading string literals on the evaluation stack. This code
ldstr "Hello"
loads a "Hello" [mscorlib]System.String. If this literal is to be stored into a variable (say "x"), then my compiler will insert conversion code that looks somewhat like this:

call class [mscorlib]System.Text.Encoding
[mscorlib]System.Text.Encoding::get_UTF8()
ldstr "Hello"
callvirt instance uint8[]
[mscorlib]System.Text.Encoding::GetBytes(string)

stloc 'x' // store byte array into variable x

For the cases where a D string (array of bytes) has to be converted to a System.String I provide an explicit string property, called sys, with the following D prototype:

static public String sys(string x);

The D programmer would write something like this:

import System;
// ... snip ...
string x = "Hello";
Console.WriteLine(x.sys);
Console.WriteLine("Hello .NET".sys);


The compiler figures out that in the case of
Console.WriteLine("Hello .NET".sys);
the call to the sys function can be elided, and generates straightforwardly:
ldstr "Hello .NET"
call void [mscorlib]System.Console::'WriteLine' (string)

Matters get more interesting when we consider associative arrays. D offers a great convenience to programmers by supporting associative arrays directly in the language, for example

int [string] dict;

dict["one"] = 1;
dict["two"] = 2;
// ...

introduces an array of integers indexed by strings. By contrast, in other languages such data structures are implemented "externally" in a library; in C++ for example, std::map<std::string, int> is implemented in the STL; the C# equivalent of an associative array is the System.Collections.Generic.Dictionary. My friend and colleague Ionut Gabriel Burete contributed an implementation of associative arrays in the D compiler for .NET using exactly that class.

An associative array / dictionary with string keys is an interesting case, because System.String::Equals does the right thing out of the box, namely it compares the contents of the two strings; System.Array::Equals, however, simply compares object references and does not iterate over and compare elements. This means that a Dictionary<string, int> will behave as you expect, but if the key is of the unsigned int8[] type you may be in for a surprise.

For this reason, I put a hack in the compiler back-end: for the case of associative arrays I break away from representing D strings as byte arrays in .NET, and use System.String instead, which works great (or so I thought up until I ran into the problem of generating the code for a foreach iteration loop).

For D code such as:

import System;
int [string] x;
x["one"] = 1;
x["two"] = 2;
// ...
foreach (k, v; x) {
Console.WriteLine("{0}, {1}", k, v);
}

the compiler synthesizes a function that corresponds to the body of the loop, and binds the key and value ("k" and "v" in the code snippet) to local variables in that function (this happens all inside the front-end).

The back-end must be able to detect the variables bound to foreach arguments and reconcile the data types where necessary. In the example above the type of the "k" variable in the IL will thus be System.String, and not unsigned int8[].

Sunday, February 22, 2009

Reaching Closure...

A diligent reader commented on my previous post that the implementation of nested functions in D for .NET was buggy for multi-threaded programs.

Indeed, in the code below asyncWork() never returned. That's because a by-value copy of the variable go was used in the closure.

void main()
{
bool go;

void asyncWork()
{
while (!go)
{
//busy wait
}
}
Threading.Thread t = new Threading.Thread(&asyncWork);
t.Start();

go = true;
}

I am trying to avoid generating unverifiable code in this compiler project: IL does not allow managed pointers as fields in a class; ILASM accepts unmanaged pointers but that yields unverifiable code. My first instinct for closures was to use a copy for all the referenced variables, but as observed by my reader that approach did not work for multi-threaded programs.

One way to solve the problem is to use unmanaged pointers in the closure; under this implementation the example above runs correctly. There is at least one other possible solution: wrap each referenced variable in an object, and have both the nested function and the surrounding context share the object reference; I found it too convoluted and pursued the unmanaged-pointers route instead.
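For the record, here is roughly what that rejected alternative would look like if it were written by hand at the D source level; the idea is to box the shared variable so that both sides hold the same heap reference, and no pointer into main's stack frame is needed (SharedFlag is a made-up name):

import System;

class SharedFlag
{
    bool go;
}

void main()
{
    SharedFlag flag = new SharedFlag;

    void asyncWork()
    {
        while (!flag.go)
        {
            // busy wait; every read goes through the shared heap object
        }
    }
    Threading.Thread t = new Threading.Thread(&asyncWork);
    t.Start();

    flag.go = true;
    t.Join();
}

The extra wrapper object is part of what I found too convoluted about this approach.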

This is how the generated IL looks for the D code in the example:

.module 'example'
.custom instance void [mscorlib]System.Security.UnverifiableCodeAttribute::.ctor() = ( 01 00 00 00 )


//--------------------------------------------------------------
// main program
//--------------------------------------------------------------
.method public hidebysig static void _Dmain ()
{
.entrypoint
.maxstack 3
.locals init (
[0] bool pinned 'go',
[1] class [mscorlib]System.Threading.Thread 't',
[2] class example.main.closure1 '$closure3'
)
newobj instance void example.main.closure1::.ctor()
stloc.s 2 // '$closure3'
ldloc.2 // '$closure3'
ldloca 0 // 'go'
stfld bool* 'example.main.closure1'::go2
ldloc.2 // '$closure3'
dup
ldvirtftn instance void example.main.closure1::'asyncWork' ()
newobj instance void class [mscorlib]System.Threading.ThreadStart::.ctor(object, native int)
newobj instance void [mscorlib]System.Threading.Thread::.ctor (ThreadStart)
stloc.s 1 // 't'
ldloc.1 // 't'
callvirt instance void [mscorlib]System.Threading.Thread::'Start' ()
ldc.i4 1
stloc.s 0 // 'go'
ret
}

.class private auto example.main.closure1 extends [dnetlib]core.Object
{

.method public virtual newslot hidebysig instance void 'asyncWork' ()
{
.maxstack 2
L1_example:
ldarg.0
ldfld bool* 'example.main.closure1'::go2
ldind.i1
ldc.i4 0
beq L1_example
ret
}
.field public bool* go2
// default ctor, compiler-generated
.method public hidebysig instance void .ctor()
{
ldarg.0
call instance void [dnetlib]core.Object::.ctor()
ret
}
} // end of example.main.closure1

Now there is only one more small problem to address: synchronization between threads...

Wednesday, February 11, 2009

Nested Functions and Delegates

My previous post missed one aspect of delegates in D: nested functions. Walter Bright gave me this example:

int delegate() foo(int i)
{
int bar() { return i; }
return &bar;
}

Function bar is nested inside foo; foo wraps bar into a delegate which is returned. My blog post is guilty of overlooking this use case for delegates; yet my compiler implementation is innocent: the example compiles and runs correctly.

The code example may look like a new use case at first, but is in fact similar to making a delegate from an object instance and a method:

class Foo {
int i;
int bar() { return i; }
}
...
Foo f = new Foo;
int delegate() dg = &f.bar;

The reason is that there is an invisible object in the nested function case. In the D programming language, nested functions have access to the surrounding lexical scope (note how function bar uses i which is declared as a parameter of foo); the .NET D compiler represents internally the lexical context of the nested function as an object. The fields of the context object are shallow copies of the variables in the "parent" scope. The IL class declaration for the context is synthesized by the compiler, which also instantiates the context. The context is populated on the way in (before calling the nested function) and used to update the variables in the parent scope on the way out (after the call has completed).

The constructor of a delegate object takes two parameters: an object reference and a pointer to a function; in the case of nested functions, the first parameter that the compiler passes under the hood is the context object. This is why constructing a delegate from a nested function is not different from using an object and one of its methods.

What if the nested function is declared inside a class method, you ask? In this case there is no need to synthesize a class declaration to model the context of the nested call. The class to which the method belongs is augmented with hidden fields that shadow the variables in the parent scope.
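For illustration, here is a made-up example of a nested function declared inside a method; accumulate can see both the total field and the locals of addAll:

class Accumulator
{
    int total;

    void addAll(int[] values)
    {
        void accumulate(int v) { total += v; } // nested inside a method
        foreach (v; values)
        {
            accumulate(v);
        }
    }
}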

Tuesday, February 10, 2009

Delegates in D for .NET

This past weekend I typed "Joe Newman" in Pandora and sat down for a couple of hours to implement delegates in my .NET back-end for the D compiler.

I began by studying the documentation on MSDN and noticed some differences in the way delegates work in .NET and D.

In .NET (and C#) delegates are objects that wrap pointers to functions so that they can be manipulated and invoked safely. The functions may be either standalone or members of a class. In D, the concept of delegates applies only to member functions. Delegates may be called asynchronously in .NET (I am not aware of a similar feature in the D programming language). The concept of delegates is thus simpler in D.

The implementation that I came up with is straightforward: classes that derive from [mscorlib]System.MulticastDelegate are generated for each delegate type. The classes are sealed, and each has a virtual Invoke method that matches the signature of the D delegate.

For the following D code snippet

class Test
{
void fun(int i)
{ ...
...
}
}
Test t = new Test;
void delegate(int) dg = &t.fun;

the generated IL looks like this:

.class sealed $Delegate_1 extends [mscorlib]System.MulticastDelegate
{
.method public instance void .ctor(object, native int) runtime managed {}
.method public virtual void Invoke(int32) runtime managed {}
}
...
...
.locals init (
[0] class Test 't',
[1] class $Delegate_1 'dg'
)
newobj instance void delegate.Test::.ctor ()
stloc.s 0 // 't'

ldloc.0 // 't'
dup
ldvirtftn instance void delegate.Test::'fun' (int32 'i')
newobj instance void class $Delegate_1::.ctor(object, native int)
stloc.1

One small (and annoying) surprise that I had was that although the IL standard contains code samples with user-defined classes derived directly from [mscorlib]System.Delegate, such code did not pass PEVERIFY and, more tragically, crashed at run-time. The error message ("Unable to resolve token", or something like that) was not helpful; but the ngen utility dispelled the confusion by stating bluntly that my class could not inherit System.Delegate directly. Replacing System.Delegate with System.MulticastDelegate closed the issue.

Once I got delegates to work for class methods, I realized that the code can be reused to support D pointers to functions as well. In D, pointers to functions are a concept distinct from delegates; in .NET, however, a delegate can be constructed from a standalone function simply by passing null for the object in the constructor. It is trivial for the compiler to generate code that instantiates .NET delegates in lieu of function pointers.

One nice side effect of representing pointers to functions as delegates is that they can be aggregated as class members, something that is not possible for pointers to other data types, because the IL forbids managed pointers as struct or class fields.
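As an illustration (a sketch of user code, not of the compiler's actual output), both of the following are representable under this scheme; the names are mine, for flavor only:

int twice(int x) { return x * 2; }

class Callbacks
{
    int function(int) transform;   // a function-pointer class member:
                                   // stored as a generated delegate object
}

void demo()
{
    int function(int) fp = &twice; // a D function pointer, no 'this' object;
                                   // lowered to a delegate built with a null object
    Callbacks c = new Callbacks;
    c.transform = fp;
    int y = c.transform(21);       // the call goes through the delegate's Invoke
}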

I hope that one day D decides to support asynchronous delegate calls. I have yet to imagine the possibilities for asynchronous, pure methods.

Until then, the .NET back-end is moving along getting closer and closer to a public release.

Sunday, February 08, 2009

D-elegating Constructors

The D programming language allows a constructor of a class to call another constructor of the same class, for the purpose of sharing initialization code. This feature is called "delegating constructors"; it is also present in C# and in the emerging C++ 0x.

C#'s syntax for delegating constructors resembles the initializer lists in C++, and strictly enforces that the delegated constructor is called before any other code in the body of the caller constructor; the feature is masterfully explained in More Effective C#: 50 Specific Ways to Improve Your C# (Effective Software Development Series).

D is more flexible: a constructor can be called from another constructor's body pretty much like any other "regular" method, provided that some simple rules are observed (for example, it is not permitted to call a constructor from within a loop).

A D compiler must detect constructor delegation and ensure that some initialization code is not executed more than once. Let's consider an example:

class Example
{
int foo = 42;
int bar;

this()
{
bar = 13;
}
this(int i)
{
foo = i;
this();
}
}

In the first constructor, before the field bar is assigned the value 13, some "invisible" code executes: first, the constructor of the base class is invoked. The Example class does not have an explicit base; but in D, similar to Java and C#, all classes have an implicit root Object base. It is as if we wrote:

class Example : Object
{ ...
}

After generating the call to Object's constructor, the compiler generates the code that initializes foo to 42. The explicit assignment as written by the programmer executes afterwards.

The compiler must be careful that the initialization steps described above happen only once in the second constructor. This is not simply a matter of efficiency; it is, more importantly, a matter of correctness. If the call to the base Object constructor and the initialization of foo were generated blindly inside the body of each constructor, then the following would happen in the second constructor's case:

  1. Object's ctor is invoked (compiler generated)

  2. foo = 42 (compiler generated)

  3. foo = i (programmer's code)

  4. constructor delegation occurs (programmer's code), which means that:

  5. Object's ctor is invoked

  6. foo = 42 (compiler generated)


This is obviously incorrect, since it leaves the Example object in a different state than the programmer intended.

Such a scenario is easily avoided by a native compiler. Object creation is translated into several distinct steps:

  1. memory for the object is allocated

  2. invocation of base ctor is generated

  3. initializers are generated (this is where foo = 42 happens)

  4. constructor as written by programmer is invoked


The important thing to note is that a native compiler leaves the constructors alone, as written by the programmer, and inserts its magic "pre-initialization" steps between the memory allocation and the constructor invocation.

When writing a compiler back-end for .NET, things are slightly different: the creation of an object is expressed in a single, compact line of MSIL (Microsoft Intermediate Language) assembly code:

newobj <constructor call>

In our example, that would be

newobj void class Example::.ctor()

and

newobj void class Example::.ctor(int32)

respectively. So the compiler-generated magic steps of calling the base constructor and so on have to happen inside the constructor body. To prevent the erroneous double-initialization scenario, I had to generate a hidden "guard" Boolean field for classes that use constructor delegation. The flag is set just before one constructor delegates to another, and each constructor checks it before calling the base constructor and running the field initializers. Here's what the generated IL looks like:

//--------------------------------------------------------------
// ctor.d compiled: Sun Feb 08 23:04:49 2009
//--------------------------------------------------------------
.assembly extern mscorlib {}
.assembly extern dnetlib {}
.assembly 'ctor' {}

.module 'ctor'


.class public auto ctor.Example extends [dnetlib]core.Object
{
.field public int32 foo
.field public int32 bar
.method public hidebysig instance void .ctor ()
{
.maxstack 3
ldarg.0
ldfld bool 'ctor.Example'::$in_ctor
brtrue L0_ctor
ldarg.0
call instance void [dnetlib]core.Object::.ctor()
ldarg.0
ldc.i4 42
stfld int32 'ctor.Example'::foo
L0_ctor:
ldarg.0 // 'this'
ldc.i4 13
stfld int32 'ctor.Example'::bar
ret
}
.method public hidebysig instance void .ctor (int32 'i')
{
.maxstack 3
ldarg.0
call instance void [dnetlib]core.Object::.ctor()
ldarg.0
ldc.i4 42
stfld int32 'ctor.Example'::foo
ldarg.0 // 'this'
ldarg.1 // 'i'
stfld int32 'ctor.Example'::foo
ldarg.0 // 'this'
ldc.i4 1
stfld bool 'ctor.Example'::$in_ctor
ldarg.0
call instance void ctor.Example::.ctor ()
ret
}
.field bool $in_ctor
} // end of ctor.Example
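Expressed as D-like pseudocode, the effect of the guard is roughly the following (in_ctor stands for the generated $in_ctor field; this is a sketch of the lowering, not code the compiler literally emits):

this()               // Example.this()
{
    if (!in_ctor) {
        // call Object's constructor
        foo = 42;    // field initializer
    }
    bar = 13;        // programmer's code
}

this(int i)
{
    // call Object's constructor
    foo = 42;        // field initializer
    foo = i;         // programmer's code
    in_ctor = true;  // guard set before delegating
    this();          // delegation: the guarded steps above are skipped
}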

As a side note, in the second constructor's case a small redundancy still exists: foo is assigned 42 only to be set to another value right away. I am hoping this isn't much of an issue, and that the JIT engine detects and optimizes it away. I'd be happy to hear any informed opinions.

Sunday, February 01, 2009

Stepping Over STL Code



When debugging C++ code written using the Standard Template Library (STL) it is not unusual to find yourself stepping through STL code. Most of the time, this is not very useful: The STL implementation typically comes bundled with the C++ compiler, and it has been thoroughly tested by the vendor; it is unlikely that the bug you are after is caused by the STL.

So when a statement such as myVector.push_back(x) is encountered while tracing with the debugger, you normally want to step over it, not into it. Most debuggers offer a "step over" and a "step into" function, so you would choose "step over".

But how about this? You want to debug a routine named my_func(size_t containerSize) and want to step into the body of my_func when this statement is hit: my_func(myVector.size()). If you select "step into", the debugger will first take you into the guts of STL's vector<T>::size() implementation before stepping into my_func.

The ZeroBUGS debugger allows you to avoid such annoyances. Once inside size(), you can right click, and select to "Always step over..." that function, all functions in that file, or all files in the same directory. The debugger will remember your option, and you don't have to see the guts of size(), or any other vector function, or any other STL function, respectively.

The functionality can be used not just with the STL but with any code. If you later change your mind, the "Manage" menu allows you to remove functions, files, or directories from the step-into blacklist.

To Destruct a D Struct

I wrote a while ago about similarities between D and .NET (and implicitly C#). My interest in mapping D features to .NET is driven by a research project that I took on a few months ago: a D 2.0 language compiler for .NET (D 2.0 is a branch of D that includes experimental features). I mentioned then that in both D and C#, structs are lightweight value types.

After working on struct support in more detail, I have come to the realization that D structs cannot be implemented as .NET value type classes. Rather, they have to be implemented as reference type classes.

The short explanation is that while in IL value classes do not participate in garbage collection, D expects the GC to reap structs after they are no longer in use.
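A small illustration of why this matters (a sketch of user code, independent of the compiler's output): a D struct can define a destructor, and that destructor is expected to run even for instances that live on the garbage-collected heap.

struct Resource
{
    ~this()
    {
        // release whatever the struct owns
    }
}

void demo()
{
    auto r = new Resource;   // heap-allocated struct; the GC is expected
                             // to run ~this() when it reaps the instance
}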

Interestingly enough, value types may be newobj-ed (not just created on the stack).

We can use a simple example to demonstrate the difference between value classes and reference classes. If we compile the following program using the IL assembler (ILASM) and run it, nothing gets printed on the screen:

.assembly extern mscorlib {}
.assembly 'test' {}

.class public value auto Test
{
.field public int32 i

.method public void .ctor()
{
ret
}
.method virtual family void Finalize()
{
ldstr "finalizing..."
call void [mscorlib]System.Console::WriteLine(string)
ret
}
}
//--------------------------------------------------------------
// main program
//--------------------------------------------------------------
.method public static void main ()
{
.entrypoint
.locals init (
class Test t
)
newobj instance void Test::.ctor()
stloc 't'
ret
}


But if we change the declaration of the Test class from a value type to a reference class, like this:

.class public auto Test

we could see "finalizing..." printed, a confirmation that the destructor (the Finalize method) is being invoked by the garbage collector. All it takes is removing "value" from the declaration.

In IL, value types have no self-describing type information attached. I suspect that the reason they are not garbage-collected is that, without type information, the system cannot possibly know which (virtual) Finalize method to call (note that although C# structs are implemented as sealed value classes, "sealed" and "value" are orthogonal).

D supports the contract programming paradigm, and class invariants are one of its core concepts.

The idea is that the user can write a special method named "invariant", which tests that certain properties of a class or struct hold. In debug mode, the D compiler inserts "probing points" throughout the lifetime of the class (or struct), ensuring that this function is automatically called: after construction, before and after execution of public methods, and before destruction.
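For example (a minimal sketch; in D 2.0 at the time, the invariant block is written with parentheses after the keyword):

class Account
{
    int balance;

    invariant()              // checked after construction, around public
    {                        // methods, and before destruction
        assert(balance >= 0);
    }

    this(int initial) { balance = initial; }

    void withdraw(int amount) { balance -= amount; }
}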

The natural mechanism for implementing the last of these checks (the one before destruction) is to generate a call to the invariant method at the top of the destructor body. But if the destructor is never called, then we've got a problem.

So having destructors work correctly is not just a matter of collecting memory after the struct expires, but it is also crucial to contract programming in D.

Implementing structs as reference-type classes may make struct assignment, and passing structs to and from functions, heavier-weight in D.NET than in the native Digital Mars D compiler (although this is something I still have to measure), but it is necessary in order to support important D language features.

Thursday, January 22, 2009

42

Last night, at the monthly NWCPP meeting, Walter Bright gave a presentation on metaprogramming in the D language. Once again, D put C++ to shame.

Because of transportation arrangements I could not accompany Walter and company to the watering hole after the lecture. Instead I went home and decided to test how my D.NET work-in-progress compiler handles templates, and what kind of code it generates.

I picked a variadic template for my test, which computes the maximum of an arbitrarily long list of numbers (adapted from a version written by Andrei Alexandrescu):

import System;

auto max(T1, T2, Tail...)(T1 first, T2 second, Tail args)
{
auto r = second > first ? second : first;
static if (Tail.length == 0) {
return r;
}
else {
return max(r, args);
}
}

void main()
{
uint k = 42;
auto i = max(3, 2, k, 2.5);
Console.WriteLine(i);
}


The program above prints 42 (of course), and here's what the generated IL looks like:


//--------------------------------------------------------------
// max.d compiled: Thu Jan 22 19:38:26 2009
//--------------------------------------------------------------
.assembly extern mscorlib {}
.assembly extern dnetlib {}
.assembly 'max' {}

.module 'max'

//--------------------------------------------------------------
// main program
//--------------------------------------------------------------
.method public hidebysig static void _Dmain ()
{
.entrypoint
.maxstack 4
.locals init (
[0] unsigned int32 'k',
[1] float64 'i'
)
ldc.i4 42
stloc.s 0 // 'k'
ldc.i4 3
ldc.i4 2
ldloc.0 // 'k'
ldc.r8 2.5
call float64 _D3max16__T3maxTiTiTkTdZ3maxFiikdZd (
int32 'first', int32 'second', unsigned int32, float64)
stloc.s 1 // 'i'
ldloc.1 // 'i'
call void [mscorlib]System.Console::'WriteLine' (float64)
ret
}
.method public hidebysig static float64 _D3max16__T3maxTiTiTkTdZ3maxFiikdZd (
int32 'first', int32 'second', unsigned int32, float64)
{
.maxstack 4
.locals init (
[0] int32 'r'
)
ldarg.1 // 'second'
ldarg.0 // 'first'
bgt L0_max
ldarg.0 // 'first'
br L1_max
L0_max:
ldarg.1 // 'second'
L1_max:
stloc.s 0 // 'r'
ldloc.0 // 'r'
ldarg.2 // '_args_field_0'
ldarg.3 // '_args_field_1'
call float64 _D3max14__T3maxTiTkTdZ3maxFikdZd (
int32 'first', unsigned int32 'second', float64)
ret
}
.method public hidebysig static float64 _D3max14__T3maxTiTkTdZ3maxFikdZd (
int32 'first', unsigned int32 'second', float64)
{
.maxstack 3
.locals init (
[0] unsigned int32 'r'
)
ldarg.1 // 'second'
ldarg.0 // 'first'
conv.u4
bgt L2_max
ldarg.0 // 'first'
conv.u4
br L3_max
L2_max:
ldarg.1 // 'second'
L3_max:
stloc.s 0 // 'r'
ldloc.0 // 'r'
ldarg.2 // '_args_field_0'
call float64 _D3max12__T3maxTkTdZ3maxFkdZd (unsigned int32 'first', float64 'second')
ret
}
.method public hidebysig static float64 _D3max12__T3maxTkTdZ3maxFkdZd (
unsigned int32 'first', float64 'second')
{
.maxstack 2
.locals init (
[0] float64 'r'
)
ldarg.1 // 'second'
ldarg.0 // 'first'
conv.r8
bgt L4_max
ldarg.0 // 'first'
conv.r8
br L5_max
L4_max:
ldarg.1 // 'second'
L5_max:
stloc.s 0 // 'r'
ldloc.0 // 'r'
ret
}
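The three IL methods above correspond, one per template instantiation, to something like the following hand-written expansion (a sketch; the function names are mine, while the mangled IL names encode the actual argument types):

// max!(int, int, uint, double)
double max_iikd(int first, int second, uint a, double b)
{
    auto r = second > first ? second : first;   // int
    return max_ikd(r, a, b);
}
// max!(int, uint, double)
double max_ikd(int first, uint second, double a)
{
    auto r = second > first ? second : first;   // uint
    return max_kd(r, a);
}
// max!(uint, double)
double max_kd(uint first, double second)
{
    return second > first ? second : first;     // double
}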

Edit: One more reason for loving D templates: pasting D code into HTML does not require replacing angle brackets with &lt; and &gt; respectively!

Friday, January 16, 2009

Deeper and D-per

I am very excited about the D for .NET compiler project because of all the things I am learning. Books on my lab's desk these days are: Compiling for the .NET Common Language Runtime (CLR), Distributed Virtual Machines: Inside the Rotor CLI, Concepts of Programming Languages (8th Edition), The Dragon Book.

As I am digging deeper and deeper I am discovering interesting design challenges.

1. Enum

In .NET, the base class for enumerated types is System.Enum, which does not allow non-integral (or char-based) underlying types.

D allows strings in enumerated types, like so:

enum : string
{
A = "hello",
B = "betty"
}

Possible solutions: make D.NET's enums based on [mscorlib]System.Enum and forbid non-integral types (the current implementation), or make my own [dnetlib]core.Enum base class. The problem with the latter is that it would preclude interoperation with other languages. Walter Bright's suggestion (so clever that I wish I had come up with it myself) is to use a combination of both: generate structures derived from System.Enum when possible, and base them on a custom class when not. The D.NET compiler will split enums into two groups: those with integral underlying types, and those without (plus the "char" case). This allows interoperability without crippling language features.
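For contrast with the string enum above, an ordinary enum with an integral underlying type falls into the first group and can be emitted as a structure derived from [mscorlib]System.Enum:

enum Color { Red, Green, Blue }   // int underlying type: maps onto System.Enum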

2. Strings

D strings are UTF-8; System.String is UTF-16. My attempt to cleverly use System.String under the hood (rather than represent D strings as byte arrays) created far more complications than I initially thought, and prompted a major refactoring effort that ate up almost my entire weekend.
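A one-line example of the mismatch (assuming the usual D 2.0 definition of string as an array of immutable UTF-8 code units):

string s = "héllo";   // 6 bytes of UTF-8 in D (s.length == 6),
                      // but 5 UTF-16 code units in a System.String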

An interesting wrinkle is that in D.NET associative arrays are implemented using Dictionary objects under the hood, which use the Equals method to compare keys. For System.String, Equals compares the string contents, as one would normally expect; for System.Array, the implementation of Equals simply compares object references.

D strings are now represented as arrays of bytes; in order to make associative arrays work correctly, extra work had to be done so that keys are hashed and compared by content rather than by reference.
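A sketch of the behavior that has to be preserved (plain D, nothing compiler-specific):

int[string] ages;
ages["betty"] = 42;
string prefix = "be";
string key = prefix ~ "tty";  // same contents as "betty", different array instance
assert(ages[key] == 42);      // lookup must compare key contents,
                              // not object references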

3. Pointers

Pointers cannot be used as members of classes or as elements of an array.

This stems from a restriction in the IL: managed pointers cannot be used as class fields or array elements.

Interesting consequences:
3.a) in D.NET a nested method cannot access variables of pointer type in the surrounding lexical context (because the implementation constructs a delegate under the hood, and the object part has all the accessed variables copied as its members)

3.b) can't pass pointers to variadic functions (because I pass the variable argument list in as an array -- for compatibility with System.Console.WriteLine)
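Two tiny illustrations of what these restrictions rule out in D.NET (hypothetical code, for flavor only; it is ordinary D otherwise):

class Holder
{
    int* p;              // 3: a pointer used as a class field -- ruled out,
                         // since the IL forbids managed pointers in fields
}

void report(...) { }     // a D-style variadic function

void demo(int* q)
{
    // report(q);        // 3.b: passing a pointer to a variadic function is
                         // rejected, because the arguments travel as an array
}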

How severe are these limitations? Are there any reasonable workarounds? I guess I'll have to dig D-per to find out.