Wednesday, March 25, 2009

Another Perl in the Wall

The most successful computer languages out there were born out of concrete problems: Perl in the beginning was nothing more than a Reporting Language; C came out of the need to write OSes in a portable fashion; PHP emerged from somebody's need to express dynamic web content as server-side macros. C# solves the problem of doing component programming without intricate COM knowledge and two PhDs per developer.

Typically, after the problem is solved and the engineers have scratched their itch, written the code, and shipped the (working!) products, some rather academic type decides: "Now I am going to redo it the right way" (and that's how UNIX got rewritten as Plan 9, a great OS that nobody uses).

Interestingly enough, the redesigned products rarely end up being as successful as the originals. People have been trying for years to replace things such as Windows, Office and the C programming language with "better" rewrites, but the market does not seem to care much. If it was good enough the first time around, it got adopted and used. Who cares how neat (or messy) the internal plumbing is?

The C and C++ languages are great for doing low-level systems programming; they may be harder to use for constructing applications, and I would definitely advise against using C++ for web development. The D programming language fancies itself a better C++, and I think that is true in the application development area. But I do not see D as a systems language. I will never write a task scheduler in a garbage-collected language. When I write system-level stuff, I want the good old WYSIWYG behavior: no garbage collector thread running willy-nilly and no strange array slices (which are like arrays, except when they aren't). And thanks, but no thanks: I want no memory fences, no thingamajig inserted with parental care by the compiler to protect me from shooting my own foot on many-core systems. That is the point of systems programming: I want the freedom to shoot myself in the foot.

I have been trying (unsuccessfully) to argue with some folks on the digitalmars.d newsgroup that the new D 2.0 language should not worry much about providing support for custom allocation schemes. D is designed to help productivity: it relieves programmers from the grueling task of managing memory, and it encourages good coding practices, but a systems language it is not. We already have C, C++ and assembly for low-level tweaking when we need it.

Sadly, some of the people involved with the design of D 2.0 are aiming for the moral absolute, rather than focusing on shipping something that works well enough. I think it is a bad decision to allow mixing the managed and unmanaged memory paradigms; it is even worse that there are no separate pointer types to disambiguate between GC-ed and explicitly managed objects. Managed C++ went down that route in its first incarnation, and it wasn't long before people realized that it was really hard to keep track of which objects live on the garbage-collected heap and which objects are explicitly managed. A new pointer type had to be invented (the one denoted by the caret in C++/CLI) to solve the problem.

If folks really want to use D to program OSes and embedded devices and rewrite the code that controls the brakes in their cars, they should at least make a separate D dialect and name it for what it is: Systems D, Embedded D, or something like that. The garbage collection and other non-WYSIWYG features should be stripped out of such a dialect.

The ambition of making D 2.0 an all-encompassing, morally absolute language may cause 2.0 to never ship, never mind gain wide adoption. Perl started out with more modest goals and ended up enjoying a huge mind share.

So the ship-it-today-if-it-works hacker in me feels like dedicating a song to the language purists:
We don't need no education
Perl's best language of them all,
We don't need no education
Thanks a bunch to Larry Wall!

7 comments:

Anonymous said...

I agree completely. Use the right tool for the job. You don't mercilessly hack away at a tool until it does what you want it to; although effective in the short term, long-term use can often yield unpredictable results. That's the very reason new languages get created: the tool to do what they want, how they want, doesn't exist, so they make it a reality.

When it comes down to it, interoperability between languages is probably one of the better solutions for this. Where you want the power to handle system-level tasks, the suggested Systems D would be preferable; but when you're working on individual user applications, you'd want more safety from the language, so you don't 'shoot yourself in the foot' in something a lot of people really use, and so that a breakage won't be system-crippling.

Interoperability comes into play when you need solutions that have to live in both worlds: a managed or GC'ed architecture might be appropriate for parts of game development, but the engine and other aspects are better suited to a language where you can leverage every inch you can muster. In cases where libraries are used by both, a layer between the two can be built where they both exist, so one can talk to the other.

It sounds like your associates at Digital Mars are forgetting one of the basics of programming: the perfect language doesn't exist; if it did, we would all use it. There are reasons for the different degrees of separation between languages. Let's hope you can bring that to their eyes and prevent them from making an 'all-in-wonder' so convoluted, due to the syntactic flair peculiar to each feature they decide would be cool, that no one uses it; thus it becomes yet another language for the 'cool, but unwieldy' heap.

Cristache said...

Thanks for supporting my views. For the record: I do not work for Digital Mars; I do this project on my own, after hours, so that I can learn more about the CLR and .NET.

Anonymous said...

Right, I figured as much; however, I merely meant 'associates' in the friendly, you-know-them sense of the word, not the business sense.

Anonymous said...

I can't disagree with you more. The more I deal with C++, the more I want to use a language that makes things a little easier on me.

I'm a game programmer, and I'm coming to the conclusion that C++ really isn't a great fit for the problem domain (although it definitely beats C).

D directly addresses most of the issues I have with C++ while letting me do the same tasks - tasks which require things like a custom allocator.

C++ was never really meant to be a great applications language, so it's no wonder D is much better in that area, but I don't see a good reason *why* it can't be as flexible as C++. If you made the argument that custom allocators somehow hurt other language features, then maybe I could see your point.

Cristache said...

Anonymous, you may have a point and I would like to understand your argument better.

Could you please expand on what motivates your need for a custom allocator? I presume it is performance; if so, do you have any numbers that show the impact of a specialized memory allocator over the default, built-in behavior?

Anonymous said...

It's not just performance; it's also dealing with a small, fixed amount of memory.

For example, if you're writing for a game console, you don't have the luxury of virtual memory, which means fragmenting the heap is a very serious concern: if an allocation fails due to fragmentation, the game crashes.

If you can write your own heap, which you access through a custom allocator, you can allocate memory in fixed-size blocks, or use other strategies (which I'm not sure I can go into here) to avoid fragmentation. You can also have multiple heaps with different block sizes for different types of data.
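A minimal sketch of the fixed-block idea in C++ (illustrative only — the class name, sizes and free-list strategy are made up for this comment, not taken from any shipped engine):

```cpp
#include <cstddef>
#include <vector>

// A toy fixed-size block pool: carves one contiguous slab into
// equal-sized blocks and hands them out through a free list.
// Because every block is the same size, any freed block can satisfy
// any future request, so the slab can never fragment.
class FixedBlockPool {
public:
    FixedBlockPool(std::size_t block_size, std::size_t block_count)
        : storage_(block_size * block_count), block_size_(block_size) {
        // Thread every block onto the free list up front.
        for (std::size_t i = 0; i < block_count; ++i)
            free_list_.push_back(storage_.data() + i * block_size_);
    }

    void* allocate() {
        if (free_list_.empty()) return nullptr;  // pool exhausted: caller must cope
        void* p = free_list_.back();
        free_list_.pop_back();
        return p;
    }

    void deallocate(void* p) { free_list_.push_back(static_cast<char*>(p)); }

    std::size_t blocks_free() const { return free_list_.size(); }

private:
    std::vector<char> storage_;     // the single slab backing the pool
    std::size_t block_size_;
    std::vector<char*> free_list_;  // blocks currently available
};
```

A real console version would also want alignment guarantees and several pools with different block sizes, as described above, but the fragmentation argument is already visible in this toy.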

And yes, there is a performance benefit as well. I haven't actually done much work with my company's memory allocation (other than deciding which type to use for what I'm writing), but I've been told that the console OS's allocator can be quite slow.

Cristache said...

Anonymous, I think that the best solution for the problem space you are describing is to replace the memory allocator in the run-time libraries for that specific platform (the moral equivalent of providing your own version of malloc for a C/C++ application).
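In C++ terms, what I have in mind looks roughly like this sketch; std::malloc and a bare counter stand in for a real platform allocator:

```cpp
#include <cstdlib>
#include <new>

// Counter so the replacement is observable in this sketch.
static std::size_t g_allocs = 0;

// Replacing the *global* operator new/delete routes every allocation
// in the program through one allocator, transparently to application
// code. A real console port would call the platform allocator here.
void* operator new(std::size_t size) {
    ++g_allocs;
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept { std::free(p); }
```

The point is that this is a whole-program decision: no individual class or call site has to know (or gets to choose) where its memory comes from.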

This I have nothing against. My issue is with overriding operator new in select classes (or class hierarchies), resulting in an application where some objects live on the default, garbage-collected heap while others are instantiated on a user-managed heap.
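To make the objection concrete, here is what the per-class scheme looks like in plain C++ (the class names are invented for illustration, and std::malloc stands in for a private arena):

```cpp
#include <cstdlib>
#include <new>

// A class that opts out of the default heap by overriding its own
// operator new/delete; its instances come from a private allocator.
class Particle {
public:
    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);  // stand-in for a private arena
        if (!p) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) { std::free(p); }
    float x = 0, y = 0;
};

// An ordinary class: its instances come from the default heap.
class Score {
public:
    int value = 0;
};

// Both kinds of object are reached through plain pointers. Nothing in
// the type of `Particle*` versus `Score*` records which heap owns the
// object, so the programmer must track that by hand -- exactly the
// bookkeeping problem the caret pointer was invented to make explicit.
```
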

Your use case is very clean and simple: you want all allocations to go through a console-specific allocator; this should not even be the game developer's concern, it should be handled by the D implementation for that OS.

I believe the Tango library allows for pluggable allocation strategies, so that the GC mechanism can be bypassed, but I believe it is an all-or-nothing deal, in the sense that you can't have some classes use allocator A and others allocator B. I'll have to refresh my memory on that topic.