Sunday, March 30, 2008

The Final Bugs

The commercial version of ZeroBUGS for Ubuntu Linux has been updated with a new custom look and fixes for a couple of minor bugs.

Coming up next in the pipeline are a professional edition (bundling 32 and 64 bit versions) and an enterprise edition (with a multiple seat license).

And I will continue experimenting in the beta branch with support for the D language. In fact, I have just received a test build of DMD 2.014 which should fix the DW_TAG_module issue which I discussed in a previous post.

Thursday, March 27, 2008

Wishing for Some German Engineering

After giving a try to OpenSuse 11 alpha 3 today, which its EULA calls (optimistically?) a BETA, it hurts me to say that I am disappointed.

Some minor cosmetic issues aside (fonts that do not align properly on my laptop screen) I could not install the darn thing.

The graphical installer kept saying that there are 0 packages selected for install, which will take just about 0% of my drive space. Really? No kidding.

And some lists and menus displaying in stylish white on white enhanced my experience (read: confusion).

It is however a big improvement over Ubuntu 8.04's beta, since Suse at least boots up on my laptop. The main reasons I would love to have a working OpenSuse installation are:
  • currently, the lappy runs Mandriva 2007.0 and I love French engineering like a dead escargot in my coffee;
  • Suse 11 is supposed to come with Gnome 2.22 and I want to check out the API improvements in gtksourceviewmm 2 (since the gnome 2.0 version left out important features such as markers);
  • Suse 11 promises GCC 4.3, which has exciting features such as variadic C++ templates;
  • I love the Green Graphics!
But I guess bullets two and three are also feasible with Hardy Heron (although GCC 4.3 is not the default compiler, and you have to invoke gcc-snapshot to get to it).

But then again, Ubuntu hates my DV6000z laptop. Or vice versa.

Tuesday, March 25, 2008

Debugging D Unit Tests

I had one slide in my presentation at the D Programming Language Conference last year that demonstrated how to automatically insert breakpoints at all unit test functions in a D program (using, of course, the ZeroBUGS debugger for Linux).

The trick mainly consists of invoking a small Python script at startup:
zero --py-run=dunit.py
where the contents of dunit.py may look something like this:

# dunit.py
import os
import re
import zero

def on_table_done(symTable):
    # Matches the ".d" source file extension (the dot is escaped on purpose)
    extension = re.compile(r"\.d$")
    process = symTable.process()
    module = symTable.module()

    for unit in module.translation_units():
        if unit.language() == zero.TranslationUnit.Language.D:
            name = os.path.basename(unit.filename())

            # Unit tests are numbered from zero; stop at the first missing one
            i = 0
            while True:
                # Construct the symbol name of the i-th unit test
                symName = "void " + extension.sub('.__unittest%d()' % i, name)
                print(symName)
                matches = symTable.lookup(symName)
                if len(matches) == 0:
                    break
                for sym in matches:
                    process.set_breakpoint(sym.addr())
                i += 1

(see http://zero-bugs.com/python.html for more information on the scripting support in the debugger).

This contortion is painful but necessary with D, because unit tests have the interesting property of running before anything else in the program.

Without this bit of gymnastics, by the time the debugger (trusted hunting companion) is wagging its tail at the main function, the unit test functions have long completed (or failed).

The line in the script above that constructs the symbol name of the unit tests is now broken because of a change in the D 2.0 language. The name is now prefixed by the module, as explained here: http://www.digitalmars.com/d/2.0/hijack.html

What happens is that the script is looking for functions named void Blah.__unittest0(), void Blah.__unittest1(), etc., whereas they are now called void mymodule.Blah.__unittest0(), void mymodule.Blah.__unittest1(), and so on.

Before, I could synthesize the symbol names to look for, but now this is hardly possible since there's no place to infer mymodule from, unless the compiler produces the (standard) DW_TAG_module DWARF debug info entry.

Unfortunately the DMD compiler does not generate a DW_TAG_module, and so it is almost impossible for the debugger to determine the module name. One possible hack would be to sniff other symbols in the module and look at their prefixes.
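The prefix-sniffing hack could be sketched roughly as follows. This is only an illustration under stated assumptions: it works on a plain list of demangled symbol names rather than on the debugger's symbol table, and the helpers infer_module_prefix and unittest_symbol are hypothetical names, not part of the ZeroBUGS scripting API.

```python
import re
from collections import Counter

def infer_module_prefix(demangled_names, unit_name):
    """Guess the prefix (e.g. 'mymodule') that precedes unit_name
    in demangled D symbol names; return None if nothing matches."""
    # One or more dotted components, followed by the unit name and a dot
    pattern = re.compile(r"(?:^|\s)((?:\w+\.)+)" + re.escape(unit_name) + r"\.")
    votes = Counter()
    for name in demangled_names:
        match = pattern.search(name)
        if match:
            votes[match.group(1).rstrip(".")] += 1
    if not votes:
        return None
    # The most frequently seen prefix wins
    return votes.most_common(1)[0][0]

def unittest_symbol(prefix, unit_name, index):
    """Build the D 2.0 style unit test symbol name (hypothetical helper)."""
    return "void %s.%s.__unittest%d()" % (prefix, unit_name, index)

names = [
    "void mymodule.Blah.foo()",
    "int mymodule.Blah.bar(int)",
    "void other.Thing.baz()",
]
prefix = infer_module_prefix(names, "Blah")
print(unittest_symbol(prefix, "Blah", 0))
```

The guess is only as good as the symbols it gets to see: a module that contains nothing but unit tests gives it nothing to sniff.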

Or we could change the script like so:

import os
import re
import zero

def on_table_done(symTable):
    extension = re.compile(r'\.d$')

    process = symTable.process()
    module = symTable.module()

    for unit in module.translation_units():
        if unit.language() == zero.TranslationUnit.Language.D:
            # Break on the runtime function that drives the unit tests
            matches = symTable.lookup("_moduleUnitTests")
            if len(matches) == 0:
                break
            for sym in matches:
                process.set_breakpoint(sym.addr())


This stops the program in the debugger a couple of function calls before the unit test functions are executed. However it is not ideal since now we have to step over a bunch of assembly code.

Ideally, the D compiler should produce DW_TAG_module info.

Or the debugger could look up symbol names by regular expressions, but I have to punt on the performance implications of such an approach.
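To make the regular expression idea concrete, here is a minimal sketch. It assumes the symbol names are available as a plain Python list; lookup_by_regex and UNITTEST_PATTERN are hypothetical names, and nothing here is an existing debugger feature. Note that it visits every symbol once, which is exactly where the performance worry comes from on large symbol tables.

```python
import re

# Unit test names with any number of dotted prefixes (module, file, ...)
UNITTEST_PATTERN = r"void (?:\w+\.)*__unittest\d+\(\)"

def lookup_by_regex(symbol_names, pattern):
    """Return every symbol name that fully matches pattern (linear scan)."""
    regex = re.compile(pattern)
    return [name for name in symbol_names if regex.fullmatch(name)]

symbols = [
    "void Blah.__unittest0()",
    "void mymodule.Blah.__unittest1()",
    "int mymodule.Blah.helper(int)",
]
found = lookup_by_regex(symbols, UNITTEST_PATTERN)
print(found)
```

A pattern like this finds the unit tests whether or not the compiler prefixes them with the module name, so the script would not need to know mymodule at all.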

Sunday, March 23, 2008

yywrap it up

This weekend I decided to download the beta version of the highly anticipated Ubuntu 8.04.

After installing it on a virtual machine I promptly proceeded to build ZeroBUGS on it.

And I ran into a small roadblock: The debugger has a built-in C/C++ interpreter, for evaluating simple expressions and function calls. This code failed to link, because in this new Hardy Heron version of Ubuntu the yyFlexLexer class (see /usr/include/FlexLexer.h) has a virtual method called yywrap(). In older versions of flex yywrap used to be a standalone, extern "C" function.

To work around this issue in a portable manner I defined a new function in my configure.ac file.

It attempts to compile a small C++ program that assigns a pointer to method. If the compilation fails, I assume that yyFlexLexer::yywrap is not defined.

Here's the code:

AC_DEFUN([AC_YYFLEXLEXER_YYWRAP],
[AC_CACHE_CHECK([for yyFlexLexer::yywrap],
    [ac_cv_yyFlexLexer_yywrap],
    [ac_cv_yyFlexLexer_yywrap=false
cat > tmp.cpp << ---end---
#include <FlexLexer.h>
int main()
{
    int (yyFlexLexer::*method)() = &yyFlexLexer::yywrap;
    return 0;
}
---end---
    if g++ tmp.cpp 2>/dev/null; then
        ac_cv_yyFlexLexer_yywrap=true
        rm a.out
    fi
    rm tmp.cpp])

if test "$ac_cv_yyFlexLexer_yywrap" = true; then
    AC_DEFINE([HAVE_YYFLEXLEXER_YYWRAP], [1],
        [Define if yyFlexLexer::yywrap is declared])
fi
])


AC_YYFLEXLEXER_YYWRAP


And you can test drive a pre-release of ZeroBUGS for Ubuntu 8.04 here: http://zero-bugs.com/8001/releases/zero_1.12-heron-032308_i386.deb

The md5sum is: 5b80e9f0f867b858620c740096e432dd

Saturday, March 22, 2008

__attribute__((language_safety("on")))

Musing over Bartosz Milewski's Safe D article, I was a bit surprised to learn how much effort programming language designers are putting into building safe language subsets.

A safe subset aims to prevent users from committing such unspeakable atrocities as buffer overruns and dereferencing invalid pointers.

It is also surprising that a vast majority of C++ developers (and ex C++ developers who defected to C#) consider C++ to be an unsafe language. From my few data points, these are folks who picked up the language in the mid 90's and never updated their skills (they are still running their brains on Windows 95, so to speak).

I see developers grouped in two main schools: the low level C guys (encompassing the Linux crowd) and the Java / .NET corporate code monkeys.

Not a productive bug writer myself (on a good week I turn about five) I may not be qualified to speak. It has been a long time since I overran a memory buffer. These days most of my bugs have to do with multi-threading.

Since the day I switched from C habits to the C++ style of using STL containers and smart pointers, my buffer overruns have decreased to zero Kelvin.

Shared_ptr (in either its boost or TR1 incarnation) is another useful friend.

Northwest C++ Users Group hosted a very good presentation on Wednesday: Stephan T Lavavej did an impressive tour of shared_ptr, useful both to beginners and to the more seasoned developers.

As an event coordinator, I had the chance to chat with Charles Zapata of Ontela, a hot up and coming downtown Seattle startup (who generously sponsored the Wednesday night meeting).

After the talk, Bartosz, Andrei Alexandrescu, Walter Bright, and a loose-knit band of groupies (including yours truly) and NWCPP guests such as prominent Advanced Windows Debugging author Dan Pravat went to a local watering hole to discuss the future of the D Programming language and shoot the breeze over a pint. Cheers!

CC... Clippings

"All these features make C++ an ideal language for writing operating systems." -- Bartosz Milewski, on SafeD
"C++ is a horrible language. It’s made more horrible by the fact that a lot
of substandard programmers use it, to the point where it’s much much
easier to generate total and utter crap with it. Quite frankly, even if
the choice of C were to do *nothing* but keep the C++ programmers out,
that in itself would be a huge reason to use C." -- Linus Torvalds
According to Time Magazine's latest edition, Linus Torvalds is officially a hero! Linus is cited in the "Rebels & Leaders" category along with Nelson Mandela, Margaret Thatcher, Mikhail Gorbachev and others. In the article 60 Years of Heroes, Linus Torvalds was selected one of the heroes of the past 60 years -- Softpedia
"Heroes become imbued with virtues we wish on them" -- James Burke, American Connections
"As usual, this being a 1.3.x release, I haven't even compiled this kernel yet. So if it works, you should be doubly impressed." -- Linus Torvalds, announcing kernel 1.3.3 on the linux-kernel mailing list.
"Whereas most people find programming in C++ a chore, most Objective-C programmers find the language to be a joy." -- Steve Jobs and the History of Cocoa

Friday, March 21, 2008

Mostly For Fun

The past weekend I worked on setting up the e-commerce bits for my ZeroBUGS Linux debugger.

While the commercial build is compiled with optimizations turned on, and has some extra UI goodies, the beta builds are still available for free.

I was doing some marketing research and I came across an interview bit where the main, revolving question was "What's so special about your product that big system vendors do not offer?"

The person being interviewed went on and on trying to explain the great virtues of his company's software, but missed the simple answer: there are classes of products where a small ISV has a much better chance to come up with good software than a large corporation.

Take my example of a debugger for the C++ language. Unlike a browser or an OS, it is designed for a relatively small yet highly specialized niche. It is not a mass consumer product; it will never be a household name. Its users are few, but extremely picky (since they are very sophisticated software developers themselves). This spells: low margin. Not what a large corporation would bet its resources on.

A big company may however assign a team to slapping something together quickly, most likely to bundle it with another product (such as a compiler) so that they can check a box in a feature list; or maybe just give it away "for free" so that they can sell you their pricey consulting services.

It is rather the kind of product a geek would do primarily just for fun, and only secondarily to make money (so that he can eat, buy more hardware, and do more of the same).

And that's why in the long run the geek's code is better.

Saturday, March 08, 2008

D Compiler Released as DEB Package

Walter Bright and his collaborators released a new version of the Digital Mars D Compiler: http://www.digitalmars.com/d/download.html

A DEB package for Ubuntu, generated with the kit that I put together and announced in a previous post, is available for download.

Monday, March 03, 2008

Buy New!

I have no idea whether my old buddies at Amazon have a purposeful sense of humor, but at any rate: buying new toilet paper sounds like a good idea, considering the alternatives...

Watch Expressions

This past weekend I added a couple of new features to the ZeroBUGS debugger.

The user can now monitor arbitrary expressions in the "Watch Variables" window (not just variables, as was the case up until now). For example you can watch for things such as argv[1], or x * y. (Function calls are supported but I would recommend extreme care in using them; calling arbitrary code in the debugged target may yield some rather interesting effects).

Another new feature is the ability to "map" source and library paths. For example, let's assume that you built your code in /home/joe/cool_project but later moved the source code to /home/joe/cool_source. The debug information inside the binary will still point to /home/joe/cool_project. ZeroBUGS allows you to work around this by setting an environment variable like this:

export ZERO_PATH_MAP="/home/joe/cool_project:/home/joe/cool_source;"

When the debugger encounters source file information prefixed by /home/joe/cool_project it will replace it with /home/joe/cool_source.

You may specify an unlimited number of such pairs, separated by semicolons. I have uploaded new builds for Ubuntu (32 and 64 bit) and Fedora Core 8 (32bit) this morning, and will update the others over the next week or two.
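For the curious, the path mapping described above boils down to a prefix substitution. The sketch below shows one way such a map could be parsed and applied; it is not the debugger's actual implementation, and parse_path_map and map_path are hypothetical names chosen for illustration. Only the "old:new;" format of ZERO_PATH_MAP comes from the example above.

```python
def parse_path_map(value):
    """Parse 'old:new;old2:new2;' into a list of (old, new) pairs."""
    pairs = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate the trailing semicolon
        old, sep, new = entry.partition(":")
        if not sep or not old or not new:
            continue  # skip malformed entries
        pairs.append((old.rstrip("/"), new.rstrip("/")))
    return pairs

def map_path(path, pairs):
    """Replace the first matching prefix; leave other paths untouched."""
    for old, new in pairs:
        # Match whole path components only, so cool_project2 is not rewritten
        if path == old or path.startswith(old + "/"):
            return new + path[len(old):]
    return path

pairs = parse_path_map("/home/joe/cool_project:/home/joe/cool_source;")
print(map_path("/home/joe/cool_project/src/main.cpp", pairs))
```

Matching on whole path components is a design choice of this sketch; a plain string prefix test would be simpler but would also rewrite sibling directories that merely share the same beginning.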

And the total number of code lines has dropped from 142,992 as reported at the end of December, down to:

Total Physical Source Lines of Code (SLOC) = 139,824
Development Effort Estimate, Person-Years (Person-Months) = 35.80 (429.61)
(Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months) = 2.09 (25.03)
(Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule) = 17.16
Total Estimated Cost to Develop = $ 4,836,180
(average salary = $56,286/year, overhead = 2.40).
SLOCCount, Copyright (C) 2001-2004 David A. Wheeler
SLOCCount is Open Source Software/Free Software, licensed under the GNU GPL.
SLOCCount comes with ABSOLUTELY NO WARRANTY, and you are welcome to
redistribute it under certain conditions as specified by the GNU GPL license;
see the documentation for details.
Please credit this data as "generated using David A. Wheeler's 'SLOCCount'."



The decrease is mainly due to code refactoring and to removal of unsuccessful experimental features.
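The estimates in the SLOCCount report above can be reproduced from the Basic COCOMO formulas it quotes. Here is a small sketch of that arithmetic, using only the constants printed in the report (the function name basic_cocomo is my own); the results should come out very close to the reported figures, with small differences due to rounding.

```python
def basic_cocomo(sloc, salary=56286.0, overhead=2.40):
    """Return (person_months, schedule_months, developers, cost)
    using the Basic COCOMO formulas quoted in the SLOCCount report."""
    ksloc = sloc / 1000.0
    person_months = 2.4 * ksloc ** 1.05            # development effort
    schedule_months = 2.5 * person_months ** 0.38  # calendar time
    developers = person_months / schedule_months   # average team size
    cost = (person_months / 12.0) * salary * overhead
    return person_months, schedule_months, developers, cost

effort, schedule, developers, cost = basic_cocomo(139824)
print("effort (person-months):  %.2f" % effort)
print("schedule (months):       %.2f" % schedule)
print("average developers:      %.2f" % developers)
print("estimated cost:          $%.0f" % cost)
```

Plugging in the December figure of 142,992 lines instead shows how little the estimate moves for a refactoring of this size.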