Sunday, February 22, 2009

Reaching Closure...

A diligent reader commented on my previous post that the implementation of nested functions in D for .NET was buggy for multi-threaded programs.

Indeed, in the code below asyncWork() never returned. That's because the closure used a by-value copy of the variable go, so the assignment made in main() never reached the running thread.

void main()
{
    bool go;

    void asyncWork()
    {
        while (!go)
        {
            // busy wait
        }
    }
    Threading.Thread t = new Threading.Thread(&asyncWork);
    t.Start();

    go = true;
}

I am trying to avoid generating unverifiable code in this compiler project: IL does not allow managed pointers as fields in a class; ILASM accepts unmanaged pointers but that yields unverifiable code. My first instinct for closures was to use a copy for all the referenced variables, but as observed by my reader that approach did not work for multi-threaded programs.
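To make the failure concrete, here is a minimal Python sketch of a closure object that copies the captured variable when it is created. The class name CopyingClosure and the method would_keep_spinning are hypothetical; they only mimic the by-value scheme and are not what the D compiler generated.

```python
class CopyingClosure:
    """Hypothetical closure object that snapshots `go` by value."""

    def __init__(self, go):
        self.go = go  # private copy, detached from the caller's variable

    def would_keep_spinning(self):
        # Mirrors the loop condition `while (!go)` in asyncWork().
        return not self.go


go = False
closure = CopyingClosure(go)  # the copy is taken here
go = True                     # updates only the outer variable

print(closure.would_keep_spinning())  # prints True: the copy is still False
```

The sketch deliberately avoids starting a thread: with a by-value copy the worker loop would simply never end.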

One way to solve the problem is to use unmanaged pointers in the closure; under this implementation the example above runs correctly. There may be at least another solution: wrap each referenced variable into an object, and have both the nested function and the surrounding context share the object reference; I found it too convoluted and pursued the unmanaged pointers route instead.
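The wrapper-object alternative mentioned above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the D for .NET implementation: Box, run_demo, async_work, and the timeout value are all hypothetical.

```python
import threading


class Box:
    """Hypothetical wrapper that holds one captured variable."""

    def __init__(self, value):
        self.value = value


def run_demo(timeout=5.0):
    go = Box(False)  # the "local variable" now lives inside a shared object

    def async_work():
        while not go.value:  # reads through the shared reference
            pass             # busy wait, as in the original example

    t = threading.Thread(target=async_work, daemon=True)
    t.start()
    go.value = True          # the worker observes this write
    t.join(timeout)
    return not t.is_alive()  # True when the worker has left its loop


print(run_demo())
```

The price is one extra allocation and one extra indirection per captured variable. In exchange, the closure holds only ordinary object references, which are allowed as class fields in verifiable code.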

This is how the generated IL looks for the D code in the example:

.module 'example'
.custom instance void [mscorlib]System.Security.UnverifiableCodeAttribute::.ctor() = ( 01 00 00 00 )


//--------------------------------------------------------------
// main program
//--------------------------------------------------------------
.method public hidebysig static void _Dmain ()
{
    .entrypoint
    .maxstack 3
    .locals init (
        [0] bool pinned 'go',
        [1] class [mscorlib]System.Threading.Thread 't',
        [2] class example.main.closure1 '$closure3'
    )
    newobj instance void example.main.closure1::.ctor()
    stloc.s 2 // '$closure3'
    ldloc.2 // '$closure3'
    ldloca 0 // 'go'
    stfld bool* 'example.main.closure1'::go2
    ldloc.2 // '$closure3'
    dup
    ldvirtftn instance void example.main.closure1::'asyncWork' ()
    newobj instance void class [mscorlib]System.Threading.ThreadStart::.ctor(object, native int)
    newobj instance void [mscorlib]System.Threading.Thread::.ctor (ThreadStart)
    stloc.s 1 // 't'
    ldloc.1 // 't'
    callvirt instance void [mscorlib]System.Threading.Thread::'Start' ()
    ldc.i4 1
    stloc.s 0 // 'go'
    ret
}

.class private auto example.main.closure1 extends [dnetlib]core.Object
{
    .method public virtual newslot hidebysig instance void 'asyncWork' ()
    {
        .maxstack 2
    L1_example:
        ldarg.0
        ldfld bool* 'example.main.closure1'::go2
        ldind.i1
        ldc.i4 0
        beq L1_example
        ret
    }
    .field public bool* go2
    // default ctor, compiler-generated
    .method public hidebysig instance void .ctor()
    {
        ldarg.0
        call instance void [dnetlib]core.Object::.ctor()
        ret
    }
} // end of example.main.closure1

Now there is only one more small problem to address: synchronization between threads...
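One common way to approach that, sketched here in Python rather than D, is to replace the busy-wait flag with a proper signaling primitive. The sketch relies on threading.Event from the Python standard library. The names run_with_event and async_work are hypothetical, and nothing here reflects what the D for .NET compiler does.

```python
import threading


def run_with_event(timeout=5.0):
    go = threading.Event()  # thread-safe flag, initially unset
    finished = []

    def async_work():
        go.wait()              # blocks without spinning until the flag is set
        finished.append(True)  # record that the worker got past the wait

    t = threading.Thread(target=async_work)
    t.start()
    go.set()                   # wakes the worker
    t.join(timeout)
    return bool(finished)


print(run_with_event())
```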

3 comments:

Jonathan Allen said...

> There may be at least another solution: wrap each referenced variable into an object, and have both the nested function and the surrounding context share the object reference;

That is how C# and VB do it. So far it hasn't caused me too much pain.

The Free Meme said...

Jonathan, are you referring to "boxing"? That's not what I had in mind...

Krystal said...

No, it's not boxing.
C# has closures too, in the form of anonymous delegates since C# 2.0 and lambda expressions since C# 3.0. C# closures are similar to those in other languages in that they capture the enclosing block's environment.

The C# compiler implements closures in a few ways, depending on what the closure captures.
The compiler checks whether the closure captures "this" (that is to say, whether it uses any member fields), and/or whether it captures any local variables of the enclosing scope.

If the closure captures neither of the two, that is, if every variable used within the closure is bound inside it, then the compiler emits a private static method.
If it captures "this" but no local variables of the enclosing scope, then the compiler emits a private instance method.
If it captures any local variables of the enclosing scope, then the compiler builds a compiler-generated nested class (with a name something like "<>__CompilerGenerated1" to ensure no name collision is possible). Anything the closure captures gets stored in an instance of the generated class, and the nested function becomes a method of the generated class.