Tuesday, December 30, 2008

Jagging Away

The D programming language does not support multi-dimensional arrays.

Instead, multi-dimensional arrays can be implemented as arrays of arrays (a.k.a. jagged arrays), just as in C and C++.

When a static, multidimensional array needs to be initialized, in a statement such as:

int foo[3][4][5][6];

the native compiler back-end implicitly initializes the array by reserving the memory and filling it with zeros.

In the .NET back-end for the D compiler that I am working on, things are different: the back-end must emit explicit newarr calls, navigate the data structure, and initialize the individual elements.

And this is where it gets interesting. The array may have arbitrary rank, and thus the compiler needs to figure out the types of the nested arrays; for the example above, they are:

int32 [][][][]
int32 [][][]
int32 [][]
int32 []

My implementation uses a runtime helper function in the dnetlib.dll assembly; rather than trying to determine the rank of the array and the types involved, the compiler back-end simply generates a call to the runtime helper, which does the heavy lifting. This solution works for jagged arrays of any rank.

The helper code itself is written in C# and uses generic recursion; it appends square brackets [] to the generic parameter at each level of recursion, as shown below.

namespace runtime
{
    public class Array
    {
        // helper for initializing a jagged array
        static public void Init<T>(System.Array a, uint[] sizes, int start, int length)
        {
            if (length == 2)
            {
                uint n = sizes[start];
                uint m = sizes[start + 1];
                for (uint i = 0; i != n; ++i)
                {
                    a.SetValue(new T[m], i);
                }
            }
            else
            {
                --length;
                // call recursively, changing the generic parameter from T to T[]
                Init<T[]>(a, sizes, start, length);
                uint n = sizes[start];
                for (uint i = 0; i != n; ++i)
                {
                    Init<T>((System.Array)a.GetValue(i), sizes, start + 1, length);
                }
            }
        }

        // called at runtime
        static public void Init<T>(System.Array a, uint[] sizes)
        {
            Init<T>(a, sizes, 0, sizes.Length);
        }
    }
} // namespace runtime
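
For example, for the int foo[3][4][5][6] declaration above, the generated code boils down to allocating the outermost array and handing it to the helper. A rough C# sketch of the idea (illustrative only -- the actual output is IL, and the exact generic argument emitted by the back-end is a detail I am glossing over here):

class Demo
{
    static void Main()
    {
        // roughly what the generated code amounts to for: int foo[3][4][5][6];
        int[][][][] foo = new int[3][][][];                       // outermost newarr
        runtime.Array.Init<int>(foo, new uint[] { 3, 4, 5, 6 });  // helper builds the nested arrays
        System.Console.WriteLine(foo[2][3][4].Length);            // prints 6; elements are zero-initialized
    }
}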

Sunday, December 28, 2008

Is There A Point in Using Pointers?

A few people wrote back in response to a previous blog post on the D for .NET project, some asking, well, why .NET?

Part of the answer is that .NET and D seem to be made for each other:

A common fragrance imbues both designs; for example, in D structs are not objects, but value types -- same as in C#. In D all objects inherit from a root object, which has methods such as toString, toHash, and opEquals; in .NET, [mscorlib]System.Object sports ToString, GetHashCode, and Equals.

Still not convinced? How about array properties, then? In D there are properties such as sort, reverse, and dup; in .NET we have System.Array.Sort(), System.Array.Reverse(), and (tadaaa) System.Array.Clone(). Coincidence? Perhaps. Or maybe powerful memes were floating free in the air and found propitious hosts in both .NET and D (not unlike the idea of Python-scripting a debugger, which was pioneered by ZeroBUGS and is now being adopted by GDB).

But the cute metaphors have to stop somewhere (no honeymoon lasts forever) and so we come upon the thorny issue of pointers. D allows pointers, though it does not encourage them. But unmanaged pointers (and even managed pointer arithmetic) do not yield verifiable code in .NET. I have experimented with both managed and unmanaged pointers, and generated textual IL that compiles and runs; PEVERIFY, however, refuses to put its seal of approval on such code.
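
For what it's worth, the same verifiability problem is easy to reproduce from plain C# (my own repro below, not code from the D back-end): build it with csc /unsafe, then run PEVERIFY on the resulting executable, and the pointer access gets flagged as unverifiable.

class PointerDemo
{
    static unsafe int Deref(int* p)
    {
        return *p;  // unmanaged pointer access: compiles and runs, but is not verifiable
    }

    static unsafe void Main()
    {
        int x = 42;
        System.Console.WriteLine(Deref(&x));
    }
}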

And so I am very tempted to disallow pointers in class and struct members (in D, as in .NET, objects are manipulated via references anyway, so what's the point of a pointer?).

A Good Idea is Worth Stealing

As always, the freetards in the Open Source community are stealing good ideas. Scripting the debugger with Python, pioneered by my work in ZeroBUGS, is now being copied by GDB: http://sourceware.org/ml/gdb/2008-02/msg00140.html

Too bad their implementation is awfully buggy.

And too bad that back in 2006 I did not think the idea was patentable :)

Sunday, December 21, 2008

Hello .NET, D Here Calling

A piece of advice from someone who spent fifteen years writing software professionally: if some "experts" ever say "printf debugging" is a poor technique, tell them to get out of town.

Printf debugging is helpful in a great many situations, for example when you are writing a compiler. The debugger cannot be trusted, because the work-in-progress compiler may not output complete debug information just yet. But you can trust what's printed white on black on the screen.

There is a chicken-and-egg problem with printf though: how does one compile the implementation of printf (or writefln, as is the case with the D programming language) if the compiler itself is not there yet?

Luckily, in my implementation of D for .NET (a project that I plan to release under the BSD license in 2009) I was able to completely circumvent the problem. See, .NET already has a rich set of libraries and services. As a matter of fact, the very purpose of D.NET is to enable D programmers to take advantage of this state-of-the-art computing platform.

The obvious choice is to use one of the System.Console.WriteLine overloads instead of writefln. In order to do that, the front-end needs to know about System.Console. The good news is that Walter Bright's implementation of the D compiler front-end (which I am using) already handles imports.

In order to get this code to work

import System;

void main() {
    System.Console.WriteLine("hello D.NET");
}

I wrote a System.d file, containing the D version of the Console class declaration (not complete, but good enough to get me going):

public class Console
{
    static public void WriteLine();
    static public void WriteLine(string);
    static public void WriteLine(string, ...);

    static public void WriteLine(char);
    static public void WriteLine(bool);
    static public void WriteLine(int);
    static public void WriteLine(uint);
    static public void WriteLine(long);
    static public void WriteLine(float);
    static public void WriteLine(double);

    static public void Write(char);
    static public void Write(string);
    static public void Write(string, ...);

    static public void Write(bool);
    static public void Write(int);
    static public void Write(long);
    static public void Write(float);
    static public void Write(double);
}

In the future, I plan to write a program that produces this kind of declaration automatically from .NET assemblies, using reflection.
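
A rough sketch of what I have in mind (names and structure are hypothetical at this point, and mapping .NET type names to their D counterparts is glossed over):

using System;
using System.Reflection;
using System.Text;

class DeclGen
{
    static void Main(string[] args)
    {
        Assembly asm = Assembly.Load(args[0]);    // e.g. "mscorlib"
        Type type = asm.GetType(args[1], true);   // e.g. "System.Console"

        Console.WriteLine("public class " + type.Name);
        Console.WriteLine("{");
        foreach (MethodInfo m in type.GetMethods(BindingFlags.Public | BindingFlags.Static))
        {
            StringBuilder parms = new StringBuilder();
            foreach (ParameterInfo p in m.GetParameters())
            {
                if (parms.Length > 0) parms.Append(", ");
                parms.Append(p.ParameterType.Name);   // a .NET-to-D type map would go here
            }
            Console.WriteLine("    static public " + m.ReturnType.Name + " " + m.Name + "(" + parms + ");");
        }
        Console.WriteLine("}");
    }
}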

Okay, so at this point the front-end compiles the "hello D.NET" code happily, but ILASM cannot resolve the System.Console::WriteLine symbol. This is because in IL the class names need to be qualified by assembly name, so what ILASM expects is a statement like:

call void class [mscorlib]System.Console::WriteLine(string)

One of my design guidelines for this project is to modify the front-end as little as possible, if at all. So then how do I get the imports to be fully qualified by assembly names?

With a clever hack, of course. I added these lines at the top of System.d:

class mscorlib { }
class assembly : mscorlib { }

Then I tweaked my back-end to recognize the "class assembly" construct and to prefix the imported module names with the name of the assembly class's base class (mscorlib, in this example).

Sunday, December 07, 2008

Dee Dot

It does not seem that long ago that I thought .NET and C# were mere toys compared to "real" languages such as C++. Oh well. One lives and learns!

A couple of months ago I was tasked (at my day job) with making a legacy piece of C++ business logic inter-operate with another team's C# code. It was a small, one week project, but it forced me to pick up some .NET stuff along the way. Long story short, I was so intrigued with the maturity and elegance of the technology that I decided to push the exploration further, on my own time.

A friend and colleague of mine had expressed interest in writing a compiler back-end for the D language that would generate .NET assemblies. So we met, discussed the project again, and decided to give it a go. The D language features map onto .NET quite well: there is no multiple inheritance of implementation, but a class can implement any number of interfaces; structs are value types; the D language is garbage-collected, with provisions for explicit destructors (just implement IDisposable, as sketched below); and there are static constructors (.cctors in .NET).
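
To illustrate the destructor part of that mapping, a D class with a destructor could be lowered to something along these lines (a hand-written sketch of the idea, not actual compiler output):

using System;

class File : IDisposable
{
    public File()          // D: this()
    {
        // acquire resources
    }

    public void Dispose()  // D: ~this(), called for deterministic cleanup
    {
        // release resources
        GC.SuppressFinalize(this);
    }
}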

D has static destructors also, which I found really easy to implement by generating code that registers a ProcessExit handler, like this:

// register static dtor as ProcessExit event handler
call class [mscorlib]System.AppDomain [mscorlib]System.AppDomain::get_CurrentDomain()
ldnull
ldftn void dtor.Base::'_staticDtor1'(object, class [mscorlib]System.EventArgs)
newobj instance void [mscorlib]System.EventHandler::.ctor(object, native int)
callvirt instance void [mscorlib]System.AppDomain::add_ProcessExit(class [mscorlib]System.EventHandler)
ret
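
For reference, the C# equivalent of the IL above would be something along these lines (the class and method names mirror the snippet; where exactly the registration code is emitted -- here, a static constructor -- is a back-end detail):

using System;

class Base
{
    // body of the D static destructor, compiled to an ordinary static method
    static void _staticDtor1(object sender, EventArgs e)
    {
        // ... cleanup code from static ~this() ...
    }

    // registration code, run once at module initialization time
    static Base()
    {
        AppDomain.CurrentDomain.ProcessExit += new EventHandler(_staticDtor1);
    }
}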

Our approach so far is to generate IL assembly code and then invoke ILASM. This keeps us portable (the code so far works on Windows as well as with Novell's Mono) and makes the output easy to debug.

We got some basic flow control working, and exception handling and assertions are also supported. But there is still a long way to go to a fully working compiler.

We intend to make the project open source (as soon as we decide upon the license -- I kind of hate the GPL for being anti-business and anti-profit, but the jury is still out on this one).

I think this is not just a great opportunity to bring the D language to the .NET family's table, but also to give D programmers access to the wealth of libraries and utilities written for .NET to date.

Wednesday, September 24, 2008

Symbol Caches

OK, here I am back from a blogging sabbatical (summer in Seattle is so precious that it is sacrilegious not to spend it outside).

And the past few months have been quite eventful. I thought that everything out here in the blogosphere was going to be overshadowed by Real Life (kind of like Michael Phelps stealing the center stage away from Liam Tancock, a guy with a lot of bronze under his belt). Watching the news was a bit of a distraction, I must admit...

But I spent the few rainy summer nights working on an exciting new ZeroBUGS feature: caching symbol tables and other debug information from one session to another.

This allows the debugger to start up faster and use less memory (since the cached info is already indexed on disk, there is no need to build maps and hashes in memory).

I have deprecated an older implementation for caching symbol tables, which was based on memory-mapped files. The new code is built upon SQLite, and ensures data integrity with triggers that enforce foreign key constraints.

For the sake of stability, the commercial release does not feature the SQLite-based caches yet. The plan is to slowly burn in the new code through the beta builds, and have it officially released later this fall.

Thursday, July 10, 2008

RTFM

A number of people wrote to complain about ZeroBUGS not working properly, for example not stopping at main().

Guys, have you heard the phrase "the user is always wrong"? RTFM!

This is a known issue: http://www.zero-bugs.com/2.0/known_issues.html

And see:
http://www.granneman.com/blog/2006/06/13/how-virtual-machines-work/
and http://www.softpanorama.org/VM/index.shtml

as well as the wikipedia entry on VMware. You will find that "One must always rewrite; performing a simulation of the current program counter in the original location when necessary and (notably) remapping hardware code breakpoints".

Wednesday, June 25, 2008

Remote Attachment

In addition to being able to execute programs on a remote system from within the ZeroBUGS debugger, it is also possible to attach to remote processes.

When the remote proxy plug-in is detected, the UI automatically adds a "Target Parameters" field at the bottom of the "Attach to Process" dialog box.

After you type in the protocol ("remote://") followed by either the name or the IP address of the target system and press the "Refresh" button, the box will be populated with a list of processes running on the remote machine. Just select the desired process and click "Ok".

Sunday, June 22, 2008

May I Have the Remote, Please?

This past weekend I put the final touch on supporting remote debugging in my favorite C/C++ debugger for Linux, ZeroBUGS.

It is under test and will ship in the commercial version within the next five to ten days.

Currently, the remote debugger is not a cross-debugger, that is to say it only works between systems of the same architecture (adding support for cross-debugging is not terribly hard, I just do not want to spend rare Seattle summer days working on a feature, unless users ask for it).

"Well, then what's the point in supporting remote debugging", some may ask. "Can't we just ssh -X into the remote target and run the debugger there?"

Installing a full-fledged debugger, with a UI module that depends heavily on Gnome, may not be a feasible option for small (possibly embedded) systems where resources are scarce.

My solution for remote debugging is to install a thin, lightweight server on the target system, and have the debugger on a Linux workstation do the heavy lifting of building symbol tables, managing breakpoints, and so on.

In order to read debug information, the debugger needs access to the executable and shared objects on the target system. One possible solution is to copy them over to the workstation where the debugger runs, but it is not very practical: the number of files to be copied can quickly get out of control, by way of shared library dependencies.

I found that it is simpler to just mount the remote target onto the machine where the debugger runs. SSHFS is ideal for this job. Because the debug info may contain references to absolute paths, ZeroBUGS provides the ZERO_REMOTE_MAP environment variable, which creates an internal mapping between the mount point and the original paths.

Here are the steps for a remote debugging session, with examples from my own lab. The debugger runs on a 64-bit system (zulu) running Ubuntu 7.10, and the target executable(s) reside on another 64-bit machine (arnold) running Ubuntu 8.04.

1) Mount the filesystem of the target computer onto the debugger system. Example:

cristiv@zulu:~/workspace/sandbox$ sudo chown cristiv /dev/fuse
[sudo] password for cristiv:
cristiv@zulu:~/workspace/sandbox$ sshfs root@10.0.1.10:/ ~/workspace/remote/
root@10.0.1.10's password:
cristiv@zulu:~/workspace/sandbox$

2) Add mount point to remote map. Example:

export ZERO_REMOTE_MAP="10.0.1.10:/home/cristiv/workspace/remote;"

Note that each entry must be ended with a semicolon (even when there is only one entry in the map).

3) Start the ZeroBUGS server on the remote (debug target) system:

cristiv@arnold:~/workspace/zero$ zserver
*** ZeroBUGS Remote Debug Server V. 1.0 ***
*** Copyright (c) 2008 Zero Systems LLC ***
cristiv@arnold:~/workspace/zero$

4) Run a program on the remote host (10.0.1.10 in this example) and debug it, using the command line:

zero remote://10.0.1.10/home/cristiv/workspace/zero/a.out

The UI can also be used to execute remote targets:


As a final note, please remember that the debug server opens a security hole on the target system, since the client debugger can execute any program that the user who started the server can.

Monday, June 02, 2008

Take the Fork in the Road

A friend of mine (let's call him Andrei) who works on the D Programming Language sent me a bug report the other day. He was having trouble debugging this piece of D code with ZeroBUGS:

import std.stdio, std.process, std.string;

void main(string[] args)
{
    writeln("Started on ", chomp(shell("hostname --short")));
}

After a brief investigation, I figured out what was going on: the shell call (equivalent to system in C) was triggering a fork(), followed by an exec() call. The debugger automatically attached to the spawned shell. From that point on, every time my friend tried to step through the code, control would jump from one process to the other in an apparently non-deterministic fashion (I say apparently because the behavior is actually determined by how the kernel schedules the main, forked, and debugger processes).

I replied to the bug report with an explanation of what was going on, suggesting that the --no-trace-fork switch be passed in the command line (and thus avoid attaching to the forked shell).

My friend argued that this option should be more obvious (i.e. accessible from the graphical user interface) and, while you are at it, he said, why not add a "spawn on fork" option, so that the two processes can be debugged in separate windows that do not interfere with each other?

I thought that was a great suggestion, and added (albeit only in the commercially-supported version) two new check-buttons to the Language tab in the Options dialog.

Caveat: Currently, the buttons are grayed out while a program is being debugged (i.e. are enabled only when no target is loaded in the debugger).

(This shortcoming has to do with setting ptrace options down a tree of attached threads, and I hope to address it in a future release.)


New releases of ZeroBUGS will be available later this week, when I launch the revamped website.

Thursday, May 22, 2008

Any Updates?

The zero-bugs.com site is about to be revamped, and the side picture should give you a sense of the new look and feel.

I had tremendous fun indulging my artistic side and created the new design using Inkscape 0.45 and Gimp (except for the product box).

The new site's main goal is to better manage software releases. A new back end component allows users to check for software updates (the debugger UI now has a new menu entry: Help --> Check for Updates).

I have improved my uploading script (which I have been using for pushing the software to the site); meta information is updated on the server side each time a binary is uploaded. The meta information is used by the "check for updates" feature; it is also used by a CGI to dynamically generate the download pages, so that users always download the most up-to-date code.

The debugger program connects to the server every time a user clicks on the "check for update" menu, and consults the meta information. Because the format of the server files containing software meta information is dictated by my uploading script, I did not want to hard-code the way these are read. Instead, I created a new client-side Python program.

I wanted the update mechanism to be as generic as possible, and allow for plug-ins to be updated independently, should such a need arise (for example when plug-ins from third-parties are included in the distribution).

To this end, I added a couple of new interfaces which look something like this:

struct Update : public ZObject
{
    /**
     * URL of the package (DEB, RPM, etc.) that contains an update.
     */
    virtual const char* url() const = 0;

    /**
     * Description of changes in this update.
     */
    virtual const char* description() const = 0;

    virtual void apply() = 0;

    virtual Update* copy() const = 0;
};


struct Updateable : public ZObject
{
    virtual size_t check_for_updates(Enumerator<Update*>*) = 0;
};

(The ZObject and the Enumerator are building blocks from my ZDK -- Zero Developer Toolkit -- on which ZeroBUGS is built. They are not essential for the purpose of this post.)

The way the update works is very simple: the engine queries each plugin for the Updateable interface. If the interface is detected, the check_for_updates method is invoked; the Enumerator then populates a container of pointers to Update objects, and the URL of each update is displayed in a UI HTML control.

With this simple, generic mechanism, each pluggable component can implement the details for its own update.

My concrete implementation lives in the Python scripting module, which acts as a gateway between the bulk of the debugger code, written in C++, and extensions written in Python.

The client-side script is ridiculously simple:

from datetime import datetime
import httplib
import urllib
import zero

server = "www.zero-bugs.com"
published_dir = "/8001/published/"

def check_for_updates(sysid, date):
    print "checking updates for:", sysid, date
    conn = httplib.HTTPConnection(server)
    url = urllib.quote(published_dir + sysid)
    conn.request("GET", url)
    r = conn.getresponse()
    info = r.read().split("\n")
    conn.close()
    my_build_date = datetime.strptime(date, "%Y-%m-%d")
    build_date = datetime.strptime(info[0], "%Y-%m-%d")

    if my_build_date < build_date:
        url = "http://" + server + info[1]
        info.pop(0) # pop the date
        info.pop(0) # pop the url
        info[0] = info[0] + "<br/>"
        desc = "\n".join(info)
        #
        # the function returns a list of available updates
        #
        return [ zero.Update(url, desc) ]


I plan to launch the new site after Memorial Day. And after that I plan to release a new feature: support for remote debugging. I have written a minimalistic server and a companion plug-in that will allow users to debug machines where resource scarcity does not permit a full-fledged ZeroBUGS installation (you did not think that I took a three-week hiatus from my blog just to re-write a website, heh).

Sunday, April 27, 2008

Super! Delegates

I have added support for D programming language delegates to the ZeroBUGS debugger!

And I am in the process of rolling out beta binaries, and a new release for Ubuntu 8.04 (Hardy Heron).

Happy (Eastern Orthodox) Easter!

Tuesday, April 15, 2008

Is it the Blogosphere, Stupid?

Oh Lord, it is so hard and stressful being a blogger; and poor bloggers may die of stress. This is what some recent "breaking news" wants you to believe.

First of all, stress is bad for one's health in any activity (I almost wrote "line of work"), what's the big news?

Secondly, blogging may beat doing real work but it is not work. Who the fudge are we kidding? It's like saying that keeping a journal is work. No, you just jot down ideas or things that you find interesting or worth capturing (perhaps when you wind down at the end of the day). You write about a project you are working on, a hobby, or something that bothers you; or something meaningful in your life.

How many people do you know that are enjoying some level of success (politicians, writers, software engineers, doctors, lawyers, you pick) who blog several times a day, every day? Ok, politicians may be a bad example since they have minions blogging for them.

Web 2.0 turned blogging into a full time "job"; self-proclaimed experts "cover" the markets, the political scene, celebrity gossip, and technology. When have these people had any time to build any expertise when every single day they blog till they drop? Do they ever stop and think about what they are writing or merely smell each other's farts?


Instead of building things (and, why not, selling for a profit), modern society "generates content", begging for somebody's mouse click; and then everybody and their grandma complains (in a blog) about the economy being in decline.

And I bet that whatever few mouse clicks this blog entry has trapped, they are coming from hungry blogger dudes, avidly searching the internet for ideas to expertly comment on. Dude, put your flip-flops on and go get a job. A real one.

Thursday, April 10, 2008

An Exceptional Chain of Events

It was said of yore that throwing an exception from a C++ destructor is a verie wycked and wofull thyng.

Remember goto? Some distinguished gentlemen of the C++ programming trade look down on throwing from destructors as being eviler than using a goto. The basis for such a moral judgment is that "a throwing destructor makes it impossible to write correct code and can shut down your application without any warning".

Destructors in C++ are one essential ingredient of the Resource Acquisition Is Initialization (RAII) idiom, which can be summarized as follows: resources (memory, sockets, file handles, etc.) are acquired during the construction of an object. If an exception is thrown from anywhere in the code during the object's lifetime, after it has been fully constructed, the C++ language rules guarantee that the object's destructor will be summoned.

Thus resources can be released gracefully and graciously by said destructor, even in the distasteful and exceptional eventuality of... an exception. The Programmer is relieved from the burdensome duty of having to release resources on every single possible code path.

RAII relieves the programmer from manually preventing leaks.
(Notice how nicely relieves and leaks go together).

If a destructor, invoked as part of the automatic cleanup that follows the throwing of an exception, decides to throw its own exception, then the system has but two choices: go into an ambiguous state (now there are two outstanding exceptions, the original one plus the destructor-thrown one) or... throw the towel.

It is C++ Standard behavior to follow the latter course: the towel is thrown from around the waist, the user is mooned and the program calls terminate(). Hasta la Vista, Baby. And for this reason, they say your destructors should never throw.

But what if there's no way to recover?

Let's consider a generic Lock object which may look something like this:


#include <boost/noncopyable.hpp>

template<typename T>
class Lock : boost::noncopyable
{
    T& mx_;

public:
    ~Lock() throw()
    {
        mx_.leave();
    }
    explicit Lock(T& mx) : mx_(mx)
    {
        mx_.enter();
    }
};

The problem with the code above is that T::leave() may throw an exception (it may well not throw, but one really cannot tell, since T is a template parameter).

And so I come to the conclusion of this post. I assert that the code above is as good as it gets. Of course, T may be bound at compile time to a class that implements the leave method somewhat like this:

void Mutex::leave()
{
    if (int resultCode = pthread_mutex_unlock(&mutex_))
    {
        throw pthread_runtime_error(resultCode);
    }
}

If unlocking the resource fails, then a) something really bad must've happened (possibly memory corruption?) and b) there is little, if anything, that can be done to restore the system to a stable state.

I say let ye programme crash and burne.

What do You reckon?

Friday, April 04, 2008

Test Drive: GCC 4.4.0

I have finally gotten around to testing ZeroBUGS with a C++0x-supporting compiler.

Even though said support is labeled as experimental and is not activated by default (-std=c++0x in the command line does the trick), GCC 4.4.0 seems to be working just fine. Or I should rather say that Zero seems to be debugging GCC 4.4.0-generated code just fine.

I have installed the latest compiler snapshot on a Ubuntu 8.04 beta system, by simply typing apt-get install gcc-snapshot (apt is the awesomest).

Then I set the compiler to /usr/lib/gcc-snapshot/bin/g++ in my environment and fired up a battery of automated tests.

I also did a bit of manual testing for variadic templates, using code samples straight out of Wikipedia, and one quick test for rvalue references. Things look good so far.

Surprisingly, all unit tests having to do with floating point variables passed.

I may have to look deeper into these tests, since as stated here,
"Starting from GCC 4.3.1, decimal floating point variables are aligned to their natural boundaries when they are passed on stack for i386."

I will keep you posted.

Thursday, April 03, 2008

Random Log Entries

Ten years ago I used to carry a bulky day planner, and like in a Seinfeld episode almost bought a man-purse for it.

My handwriting sucks. I hear some schools are dropping cursive from their curricula.

Keyboards.

Instant messaging (thumbs up for opposable thumbs!)

Digital scrapbooks. Cloud(ed) computing memories.

Virtual social networking.

What will future historians and archaeologists think of us if the decoder ring for the Internet is lost?

Songs about programming that I would like to hear:
  • In a software continuum (you are my Heisenbug)
  • Perpetuum beta mobile
  • The universal constant is volatile (in C flat)
  • Debuggers don't step in the same river twice
  • Flossing bugs at dawn
  • Recombobulate (your shattered assertions)

Sunday, March 30, 2008

The Final Bugs

The commercial version of ZeroBUGS for Ubuntu Linux has been updated with a new custom look and a couple of minor bug fixes.

Coming up next in the pipeline are a professional edition (bundling the 32- and 64-bit versions) and an enterprise edition (with a multiple-seat license).

And I will continue experimenting in the beta branch with support for the D language. In fact, I have just received a test build of DMD 2.014, which should fix the DW_TAG_module issue that I discussed in a previous post.

Thursday, March 27, 2008

Wishing for Some German Engineering

After giving a try today to OpenSuse 11 alpha 3, which its EULA calls (optimistically?) a BETA, it hurts me to say that I am disappointed.

Some minor cosmetic issues aside (fonts that do not align properly on my laptop screen) I could not install the darn thing.

The graphical installer kept saying that there are 0 packages selected for install, which will take just about 0% of my drive space. Really? No kidding.

And some lists and menus displaying in stylish white on white enhanced my experience (read: confusion).

It is however a big improvement over Ubuntu 8.04's beta, since Suse at least boots up on my laptop. The main reasons I would love to have a working OpenSuse installation are:
  • currently, the lappy runs Mandriva 2007.0 and I love French engineering like a dead escargot in my coffee;
  • Suse 11 is supposed to come with Gnome 2.22 and I want to check out the API improvements in gtksourceviewmm 2 (since the gnome 2.0 version left out important features such as markers);
  • Suse 11 promises GCC 4.3, which has exciting features such as variadic C++ templates;
  • I love the Green Graphics!
But I guess bullets two and three are also feasible with Hardy Heron (although GCC 4.3 is not the default compiler, and you have to invoke gcc-snapshot to get to it).

But then again, Ubuntu hates my DV6000z laptop. Or vice versa.

Tuesday, March 25, 2008

Debugging D Unit Tests

I had one slide in my presentation at the D Programming Language Conference last year that demonstrated how to automatically insert breakpoints at all unit test functions in a D program (using, of course, the ZeroBUGS debugger for Linux).

The trick mainly consists in invoking a small Python script at startup:
zero --py-run=dunit.py
where the contents of dunit.py may look something like this:

# dunit.py
import os
import re
import zero

def on_table_done(symTable):
    extension = re.compile(".d$")
    process = symTable.process()
    module = symTable.module()

    for unit in module.translation_units():
        if unit.language() == zero.TranslationUnit.Language.D:
            name = os.path.basename(unit.filename())

            i = 0
            while True:
                symName = "void " + extension.sub('.__unittest%d()' % i, name)
                print symName
                matches = symTable.lookup(symName)
                if len(matches) == 0:
                    break
                for sym in matches:
                    process.set_breakpoint(sym.addr())
                i += 1

(see http://zero-bugs.com/python.html for more information on the scripting support in the debugger).

This contortion is painful but necessary with D, because unit tests have the interesting property of running before anything else in the program.

Without this bit of gymnastics, by the time the debugger (trusted hunting companion) is wagging its tail at the main function, the unit test functions have long completed (or failed).

The line in the script above that constructs the symbol name of the unit tests is now broken because of a change in the D 2.0 language. The name is now prefixed by the module name, as explained here: http://www.digitalmars.com/d/2.0/hijack.html

What happens is that the script is looking for functions named void Blah.__unittest0(), void Blah.__unittest1(), etc., whereas they are now called void mymodule.Blah.__unittest0(), void mymodule.Blah.__unittest1(), and so on.

Before, I could synthesize the symbol names to look for, but now this is hardly possible since there's no place to infer mymodule from, unless the compiler produces the (standard) DW_TAG_module DWARF debug info entry.

Unfortunately the DMD compiler does not generate a DW_TAG_module, and so it is almost impossible for the debugger to determine the module name. One possible hack would be to sniff other symbols in the module and look at their prefixes.

Or we could change the script like so:

import os
import re
import zero

def on_table_done(symTable):
    extension = re.compile('.d$')

    process = symTable.process()
    module = symTable.module()

    for unit in module.translation_units():
        if unit.language() == zero.TranslationUnit.Language.D:
            matches = symTable.lookup("_moduleUnitTests")
            if len(matches) == 0:
                break
            for sym in matches:
                process.set_breakpoint(sym.addr())


This stops the program in the debugger a couple of function calls before the unit test functions are executed. However it is not ideal since now we have to step over a bunch of assembly code.

Ideally, the D compiler should produce DW_TAG_module info.

Or the debugger could look up symbol names by regular expressions, but I have to punt on the performance implications of such an approach.

Sunday, March 23, 2008

yywrap it up

This weekend I decided to download the beta version of the highly anticipated Ubuntu 8.04.

After installing it on a virtual machine I promptly proceeded to build ZeroBUGS on it.

And I ran into a small roadblock: The debugger has a built-in C/C++ interpreter, for evaluating simple expressions and function calls. This code failed to link, because in this new Hardy Heron version of Ubuntu the yyFlexLexer class (see /usr/include/FlexLexer.h) has a virtual method called yywrap(). In older versions of flex yywrap used to be a standalone, extern "C" function.

To work around this issue in a portable manner I defined a new macro in my configure.ac file.

It attempts to compile a small C++ program that takes the address of the method and assigns it to a pointer-to-member. If the compilation fails, I assume that yyFlexLexer::yywrap is not defined.

Here's the code:

AC_DEFUN([AC_YYFLEXLEXER_YYWRAP],
[AC_CACHE_CHECK([for yyFlexLexer::yywrap],
[ac_cv_yyFlexLexer_yywrap],
[ac_cv_yyFlexLexer_yywrap=false
cat > tmp.cpp << ---end---

#include <FlexLexer.h>
int main()
{
int (yyFlexLexer::*method)() = &yyFlexLexer::yywrap;
return 0;
}
---end---
if g++ tmp.cpp 2>/dev/null; then
ac_cv_yyFlexLexer_yywrap=true
rm a.out
fi
rm tmp.cpp])

if test "$ac_cv_yyFlexLexer_yywrap" = true; then
AC_DEFINE([HAVE_YYFLEXLEXER_YYWRAP], [1],
[Define if yyFlexLexer::yywrap is declared])
fi
])


AC_YYFLEXLEXER_YYWRAP


And you can test drive a pre-release of ZeroBUGS for Ubuntu 8.04 here: http://zero-bugs.com/8001/releases/zero_1.12-heron-032308_i386.deb

The md5sum is: 5b80e9f0f867b858620c740096e432dd

Saturday, March 22, 2008

__attribute__((language_safety("on")))

Musing over Bartosz Milewski's Safe D article, I was a bit surprised to learn how much effort programming language designers are putting into building safe language subsets.

A safe subset aims to prevent users from committing such unspeakable atrocities as buffer overruns and dereferencing invalid pointers.

It is also surprising that a vast majority of C++ developers (and ex-C++ developers who defected to C#) consider C++ to be an unsafe language. From my few data points, these are folks who picked up the language in the mid-'90s and never updated their skills (they are still running their brains on Windows 95, so to speak).

I see developers grouped in two main schools: the low level C guys (encompassing the Linux crowd) and the Java / .NET corporate code monkeys.

Not a productive bug writer myself (on a good week I turn out about five), I may not be qualified to speak. It has been a long time since I last overran a memory buffer. These days most of my bugs have to do with multi-threading.

From the day I switched from C habits to the C++ style of using STL containers and smart pointers, my buffer overruns have dropped to zero Kelvin.

Shared_ptr (in either its boost or TR1 incarnation) is another useful friend.

The Northwest C++ Users Group hosted a very good presentation on Wednesday: Stephan T. Lavavej did an impressive tour of shared_ptr, useful both to beginners and to the more seasoned developers.

As an event coordinator, I had the chance to chat with Charles Zapata of Ontela, a hot up-and-coming downtown Seattle startup (which generously sponsored the Wednesday night meeting).

After the talk, Bartosz, Andrei Alexandrescu, Walter Bright, and a loose knit of groupies (including yours truly) and NWCPP guests such as prominent Advanced Windows Debugging author Dan Pravat went to a local watering hole to discuss the future of the D Programming language and shoot the breeze over a pint. Cheers!

CC... Clippings

"All these features make C++ an ideal language for writing operating systems." -- Bartosz Milewski, on SafeD
"C++ is a horrible language. It’s made more horrible by the fact that a lot
of substandard programmers use it, to the point where it’s much much
easier to generate total and utter crap with it. Quite frankly, even if
the choice of C were to do *nothing* but keep the C++ programmers out,
that in itself would be a huge reason to use C." -- Linus Torvalds
According to Time Magazine's latest edition, Linus Torvalds is officially a hero! Linus is cited in the "Rebels & Leaders" category along with Nelson Mandela, Margaret Thatcher, Mikhail Gorbachev and others. In the article 60 Years of Heroes, Linus Torvalds was selected as one of the heroes of the past 60 years. -- Softpedia
"Heroes become imbued with virtues we wish on them" -- James Burke, American Connections
"As usual, this being a 1.3.x release, I haven't even compiled this kernel yet. So if it works, you should be doubly impressed." -- Linus Torvalds, announcing kernel 1.3.3 on the linux-kernel mailing list.
"Whereas most people find programming in C++ a chore, most Objective-C programmers find the language to be a joy." -- Steve Jobs and the History of Cocoa

Friday, March 21, 2008

Mostly For Fun

The past weekend I worked on setting up the e-commerce bits for my ZeroBUGS Linux debugger.

While the commercial build is compiled with optimizations turned on, and has some extra UI goodies, the beta builds are still available for free.

I was doing some marketing research and I came across an interview bit where the main, revolving question was "What's so special about your product that big system vendors do not offer?"

The person being interviewed went on and on trying to explain the great virtues of his company's software, but missed the simple answer: there are classes of products where a small ISV has a much better chance of coming up with good software than a large corporation.

Take my example of a debugger for the C++ language. Unlike a browser or an OS, it is designed for a relatively small yet highly specialized niche. It is not a mass consumer product; it will never be a household name. Its users are few, but extremely picky (since they are very sophisticated software developers themselves). This spells: low margin. Not what a large corporation would bet its resources on.

A big company may however assign a team to slapping something together quickly, most likely to bundle it with another product (such as a compiler) so that they can check a box in a feature list; or maybe just give it away "for free" so that they can sell you their pricey consulting services.

It is rather the kind of product a geek would do primarily just for fun, and only secondarily to make money (so that he can eat, buy more hardware, and do more of the same).

And that's why in the long run the geek's code is better.

Saturday, March 08, 2008

D Compiler Released as DEB Package

Walter Bright and his collaborators released a new version of the Digital Mars D Compiler: http://www.digitalmars.com/d/download.html

A DEB package for Ubuntu, generated with the kit that I put together and announced in a previous post, is available for download.

Monday, March 03, 2008

Buy New!

I have no idea whether my old buddies at Amazon have a purposeful sense of humor, but at any rate: buying new toilet paper sounds like a good idea, considering the alternatives...

Watch Expressions

This past weekend I added a couple of new features to the ZeroBUGS debugger.

The user can now monitor arbitrary expressions in the "Watch Variables" window (not just variables, as was the case up until now). For example, you can watch things such as argv[1] or x * y. (Function calls are supported, but I would recommend extreme care in using them; calling arbitrary code in the debugged target may yield some rather interesting effects.)

Another new feature is the ability to "map" source and library paths. For example, let's assume that you built your code in /home/joe/cool_project but later moved the source code to /home/joe/cool_source. The debug information inside the binary will still point to /home/joe/cool_project. ZeroBUGS allows you to work around this by setting an environment variable like this:

export ZERO_PATH_MAP="/home/joe/cool_project:/home/joe/cool_source;"

When the debugger encounters source file information prefixed by /home/joe/cool_project it will replace it with /home/joe/cool_source.

You may specify an unlimited number of such pairs, separated by semicolons. I have uploaded new builds for Ubuntu (32- and 64-bit) and Fedora Core 8 (32-bit) this morning, and will update the others over the next week or two.

And the total number of code lines has dropped from 142,992 as reported at the end of December, down to:

Total Physical Source Lines of Code (SLOC) = 139,824
Development Effort Estimate, Person-Years (Person-Months) = 35.80 (429.61)
(Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months) = 2.09 (25.03)
(Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule) = 17.16
Total Estimated Cost to Develop = $ 4,836,180
(average salary = $56,286/year, overhead = 2.40).
SLOCCount, Copyright (C) 2001-2004 David A. Wheeler
SLOCCount is Open Source Software/Free Software, licensed under the GNU GPL.
SLOCCount comes with ABSOLUTELY NO WARRANTY, and you are welcome to
redistribute it under certain conditions as specified by the GNU GPL license;
see the documentation for details.
Please credit this data as "generated using David A. Wheeler's 'SLOCCount'."



The decrease is mainly due to code refactoring and to removal of unsuccessful experimental features.

Thursday, February 28, 2008

Zero G

Wow. It has been quite a while since my last post. In my defense I can claim that I have been hugely busy, both at my day job and with Zero.

I fixed a few crashes (thanks to all of you who took the time to submit bug reports) and added a new layout for organizing the debugger window. The user could previously choose between a "classic" and a "tabbed" view; now there is a third option that allows local variables and stack traces to be viewed at the same time.

Another cool, new feature consists of displaying a description of the current errno value in a tool-tip (as shown in the screen shot).


When I do not work on software I spend time with my family. And when I manage to escape their love I read science news magazines (or at least that's what they label themselves) just to see how deep the conspiracy runs. What conspiracy? You are kidding, right?

In the last issue of Seed, Will Self and Spencer Wells agree that "we'll probably fast forward by changing our own DNA". The human race will leap forward, using technology to pick up where evolution left off... Wait. Isn't this what Ray Kurzweil was saying all along? That we will transcend our biology?

Whether or not the ultimate goal of evolution is for God's Debris to put itself back together, I believe that the purpose of spreading "news" like this is to get more people to watch The Sarah Connor Chronicles.

Okay, maybe I am kidding. Or maybe someone sits up there in the control room as we speak, fabricating more news on, say, Global Warming (be afraid, vewy vewy afraid), or on Emmanuelle finding the G spot! No, seriously, what kind of news is this? (Even the New Scientist covered it at length.) Sorry to break it to you, guys: Emmanuelle found the G spot in the seventies; this is old news.

Oh, oops. Just in: Doctor Emmanuelle is a man.

Monday, February 04, 2008

The New ZeroBUGS Python Console


This is the latest feature that I have been working on: the Python Console, which allows users to interactively script the debugger.

The scripting part is not new, but I always felt that it was largely underutilized. It seems that no matter how cool, a feature that is not obvious to the user is simply wasted. Hopefully this new UI element will remedy the situation.

Saturday, January 26, 2008

Package Generator for the D Compiler

Last week or so Andrei asked me if I could take a look at making a deb file for Walter Bright's D Compiler, since the current distribution format (.zip) is hard to install on Linux.

I have no idea how to make debs, and I have no desire whatsoever to learn, thank you, I would rather spend my time reading the Beer Advocate website. I learned a new phrase, by the way: experienced beer drinker. I think it sounds awesome.

I crafted instead a quick project to generate RPMs for DMD, which then get converted automatically into a deb package via alien (same process I use for ZeroBUGS).

Here it is, enjoy.