Friday, September 28, 2007

What Would Fake Steve Do

Microsoft has been going hard after the search and advertising business for a while now, and they are getting more and more serious. Ed: see here.

I have not tried the features of Live Spaces yet, but it surely looks nice. Ed: Importing contacts from gmail into live spaces requires some gymnastics (maybe I will post about it later).

Google is firing back with their free applications, trying to get the Borg to duck. As Joel Spolsky explains,
"in infantry battles [...], there is only one strategy: Fire and Motion. You move towards the enemy while firing your weapon. The firing forces him to keep his head down so he can't fire at you"

I cannot tell why, but for the last couple of months I have been having this strange fantasy: I imagine that I am Steve Ballmer, and I scheme to bury big G.

Here are some ideas that I came up with:

  • give away infinite storage space in Hotmail, ad-free

  • ad blockers built right into Internet Explorer

  • use other apps (that Microsoft controls) as advertising channels (how about 3D ads in Halo?)

If somehow the boys in Redmond managed to do a better job at blocking spam, and gave Hotmail to everyone, ad-free, then you can kiss gmail good-bye. Of course, anything that is ad-free (such as all content displayed by IE, should it block all ads, period) would also cut into Microsoft's own advertising revenue channels. But they do not have to do it forever; they have other revenue streams, so they just have to block ads only long enough to put G to rest. Sure, it would be really sweet to block just the competitors' ads, but something in my gut is telling me that will not fly well with the other big G (ze Government).

And what about Firefox, then? Well, if 80% or so of the market sets a no-ads standard, Mozilla will have to follow suit. But who cares about a browser that is being embraced by freetards, who are not likely to click on ads and buy stuff anyway? (As pointed out by this blog, free software is for poor people.)

I think 3D games are excellent contenders to web browsers for showing advertising; it would not be much different from real life. Picture yourself playing a shoot-'em-up match in a decor resembling Times Square. The only difference is that the content of the billboards (in the game) would be controlled by the mother ship. Ed: I was told today that it is already being done. That shows how much I know about video games.

The main problem may be getting businesses that are used to advertising in the browser space to transition to other media (game consoles, mobile devices, etc.). But in the long run, it will force the googlers to develop more client apps (so that they can show ads in them, rather than inside of the browser).

So it is all about reframing the game. If Google wants to distract Microsoft from the software business, fine: Redmond should create diversions that force Mountain View away from search.

It is an arms race, and I hope the Soviets will lose.

Ubuntu: Sex Sells

I cannot believe this. Shocking! How low is Ubuntu going to stoop? Edit: More tragically, I cannot believe some people flamed me for using this picture that "objectifies women"... Gee, wow!

Convinced that these stories of shameless sexploitation are but a dream of FSJ, I went and conducted my own independent investigation.

What I have dug out is unbelievable. (Get more offended by other outrageous stuff here and here).

Solid products should not rely on sex to sell... Oh wait. Ubuntu is free, there's nothing to sell to begin with. So what are they saying then? That Linux is for boobs, Apple is for asses?

As Monty Python would put it, my nipples explode with delight!

Sunday, September 23, 2007

Boost Your Python

One of the many Good Things that entailed my going on to college was that I personally met some of the most brilliant individuals of my generation, and that I was influenced by them.

In the early 90's Sorin Surdu Bob pushed me to learn C and, as I started humming Let it Be, assembly language hacking was out and Hello World was in.

By a symmetrical twist of fate, in the late 90's I got to work on a couple of projects with Andrei Alexandrescu, who kicked me out of my C preprocessor habits and taught me the noble art of C++ templates.

I have been using C and C++ for quite some time now, taking on jobs where performance was so critical that no other language would've fit the bill. Ranging from code for portable mp3 players to handling thousands of e-commerce transactions per second, none of my projects could've been done in an interpreted language such as, say, Python.

But in the last couple of years I found myself in need of a quick prototype (of a graphical user interface). I had made the mistake in the past of developing a complete user interface in C++, using gtkmm. It took a long time, and it was boring. Besides, the speed of C++ was not required, but something to speed up the development would've been appreciated. Of course, there's Glade, but I set out to see if I could do even better.

So I went shopping for a language to allow me to build a prototype fast, and hopefully have some fun while at it.

The main application was already written, about 60 000 lines of C++ or so at the time; all that I needed was to redesign the UI. Whatever language I was going to settle on, it had to play well with C++.

Enter Python: a fun, no-nonsense object-oriented programming language with good support libraries for Gtk and Glade. And there is a library in Boost that makes integration with C++ a breeze.

I would like to share with you the wonderful experience I had hybrid-programming in C++ and Python.

Imagine that you have an e-commerce application, written in C++ (why C++ is out of scope here; imagine that someone else developed it, and that it was their best decision at the time they wrote it). One of the subsystems of this application deals with users, and you would like to quickly add some scripting capabilities to it.
The scripting feature would allow users to quickly extend the functionality of the main application. For example: someone may want to write a script that prints a report of all the newly added users; or write a graphical interface for administering the users in the system; and so on.

Say the central artifact in the subsystem is the User class:

#ifndef USER_CLASS_DEFINED
#define USER_CLASS_DEFINED

#include <string>

class User
{
public:
explicit User(const std::string& email);
virtual ~User();

const char* email() const { return email_.c_str(); }
void set_email(const std::string& email) { email_ = email; }

//
// etc
//
private:
std::string email_; // use email account to login
std::string encryptedPassword_;
};
#endif // USER_CLASS_DEFINED


Here's all the C++ code that you need in order to export the User class to Python:


// file ecommerce.cpp
#include <boost/python.hpp>
#include "user.h"

using namespace std;
using namespace boost::python;

// ecommerce is the name of the Python module that interfaces
// to the e commerce system -- you can name it whatever makes
// most sense to you
BOOST_PYTHON_MODULE(ecommerce)
{
// init<string> simply means that the constructor
// takes a string as its parameter

class_<User>("User", init<string>())
.def("email", &User::email)
.def("set_email", &User::set_email)
;
}

Done! Now before you go off and try this at home, there is one detail that needs to be fleshed out.


Embedding Versus Extending

There are two ways that C/C++ functionality can be made visible to Python. Which one you choose depends on the legacy C++ code that you have in place. If the C++ code is structured as a set of dynamic libraries (aka shared objects) then you can extend Python by building the above sample into a module, say ecommerce.so (the file name has to match the name passed to BOOST_PYTHON_MODULE, otherwise the import will fail).

But if the system is a big monolithic blob that needs to be run as a standalone application, then you need to embed the Python interpreter in it.

The first approach, extending, should be preferred, because then you can combine your module with other Python modules and attain a higher level of versatility.

All you need to do is build your module with a command line like this (python-config --includes prints the include flags for the Python headers; substitute the path by hand if your installation does not have it):

g++ -fPIC -shared ecommerce.cpp $(python-config --includes) -lboost_python -o ecommerce.so

As you probably figured out, you need the boost_python library. The good news is that you may not even have to build boost_python yourself: installing the boost-devel package, which is readily available for most mainstream Linux distributions, should be enough. At any rate, detailed instructions on how to download and install Boost are available at http://www.boost.org.

Once the module is built, you may import it into Python the usual way:

>>> import ecommerce

and access user objects like this:

>>> user = ecommerce.User('nobody@nowhere.org')
>>> user.email()
'nobody@nowhere.org'

If you need to embed rather than extend Python, you need to add this code somewhere in your existing C++ program:

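#include <cerrno>
#include <cstdio>
#include <cstring>
#include <stdexcept>
#include <string>
#include <boost/python.hpp>

using namespace std;
using namespace boost::python;

// Generated by BOOST_PYTHON_MODULE(ecommerce), which must be
// compiled into this program (Python 2 naming convention)
extern "C" void initecommerce();

bool run_python_script(const string& filename, int argc, char* argv[])
{
    // built-in modules must be registered before Py_Initialize()
    if (PyImport_AppendInittab("ecommerce", initecommerce) == -1)
    {
        fprintf(stderr, "could not register module ecommerce\n");
        return false;
    }
    FILE* fp = NULL;
    bool success = true;

    Py_Initialize();

    try
    {
        object mainModule = object(handle<>(borrowed(PyImport_AddModule("__main__"))));
        object mainNamespace = mainModule.attr("__dict__");

        PySys_SetArgv(argc, argv);

        fp = fopen(filename.c_str(), "r");
        if (!fp)
        {
            throw runtime_error(filename + ": " + strerror(errno));
        }

        // PyRun_File returns a new reference, or NULL if the script
        // raised; handle<> releases the former and throws on the latter
        handle<> result(PyRun_File(fp, filename.c_str(), Py_file_input,
            mainNamespace.ptr(), mainNamespace.ptr()));
    }
    catch (const error_already_set&)
    {
        PyErr_Print(); // report the pending Python exception
        success = false;
    }
    catch (const exception& e)
    {
        fprintf(stderr, "Exception caught: %s\n", e.what());
        success = false;
    }
    if (fp)
        fclose(fp);
    Py_Finalize();
    return success;
}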

Inheritance

Say the User class presented above has a derived class, for example GroupAdmin:

class GroupAdmin : public User
{
private:
int groupID_;

public:
GroupAdmin(const string& email, int groupID)
: User(email), groupID_(groupID)
{ }

int get_group_id() const { return groupID_; }
//
// etc
//
};

A GroupAdmin is also a User, and it would be nice to preserve the relationship in Python as well.

BOOST_PYTHON_MODULE(ecommerce)
{
class_<User>("User", init<string>())
.def("email", &User::email)
.def("set_email", &User::set_email)
;

class_<GroupAdmin, bases<User> >("GroupAdmin", init<string, int>())
.def("get_group_id", &GroupAdmin::get_group_id)
;

}

Voila. GroupAdmin auto-magically inherits the methods of User when exposed to Python. You can now write extensions to your e-commerce system, manipulating User and GroupAdmin objects in Python.
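A short script exercising that relationship might look like the sketch below. It assumes the module was built and can be imported as ecommerce. So that the snippet can be tried without it, it falls back on pure-Python stand-ins (hypothetical, written only to mirror the exported methods).

```python
try:
    # the real extension module, if it has been built and is on sys.path
    from ecommerce import User, GroupAdmin
except ImportError:
    # hypothetical stand-ins that mirror the interface exported above
    class User(object):
        def __init__(self, email):
            self._email = email

        def email(self):
            return self._email

        def set_email(self, email):
            self._email = email

    class GroupAdmin(User):
        def __init__(self, email, group_id):
            User.__init__(self, email)
            self._group_id = group_id

        def get_group_id(self):
            return self._group_id


admin = GroupAdmin('admin@nowhere.org', 42)

# set_email and email are only defined on User, yet work on a GroupAdmin
admin.set_email('root@nowhere.org')
print(admin.email())
print(admin.get_group_id())
print(isinstance(admin, User))
```

The isinstance check is the interesting part: because of bases<User>, any Python code written against User should accept a GroupAdmin as well.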

More Advanced Features

The Boost Python library has built-in support for exposing standard STL containers to Python, and support for handling smart pointers.

Let's imagine that there is a static method inside the User class, that queries user objects by the domain part of their email, like this:

class User {
public:
// ...
static std::vector<boost::shared_ptr<User> > load_users(const std::string& domain);
};

As you can see, load_users returns a vector of smart pointers to User objects, rather than a vector of Users. This design minimizes the overhead of copying User objects around.

It takes three steps to expose this method to Python scripts:

First, register the smart pointer to User objects:

BOOST_PYTHON_MODULE(ecommerce) {
// ...
register_ptr_to_python<boost::shared_ptr<User> >();

Second step, expose the vector:
// you need to include:
// #include <boost/python/suite/indexing/vector_indexing_suite.hpp>
class_<vector<boost::shared_ptr<User> > >("UserVec")
.def(vector_indexing_suite<vector<boost::shared_ptr<User> >, true>())
;

Finally, expose the method itself as a standalone function:

def("load_users",
&User::load_users,
"load users by email domain" // documentation string
);
}
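On the Python side, the exposed vector behaves like an ordinary sequence, so the user report mentioned at the beginning takes only a few lines. This is a sketch under assumptions: the report function is made up for illustration, and when the ecommerce module is not available a hypothetical stand-in for load_users (with invented sample addresses) is used instead.

```python
try:
    # the real extension module, if it has been built and is on sys.path
    from ecommerce import load_users
except ImportError:
    # hypothetical stand-in: same call shape, made-up sample data
    class _User(object):
        def __init__(self, email):
            self._email = email

        def email(self):
            return self._email

    _SAMPLE = ['alice@nowhere.org', 'bob@nowhere.org', 'carol@example.com']

    def load_users(domain):
        return [_User(e) for e in _SAMPLE if e.endswith('@' + domain)]


def report(domain):
    """Return one numbered line per user in the given email domain."""
    return ['%d. %s' % (i + 1, user.email())
            for i, user in enumerate(load_users(domain))]


print('\n'.join(report('nowhere.org')))
```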

You can see the techniques described above used to hide the complexity of a C++ debugger here.

C++ is not always the best tool for the job. Thinking hybrid may save you many hours of tedious coding.

Monday, September 17, 2007

Debugger Tip #1: Leaner Binaries

Suppose that you are building a C or C++ Linux program that is going to be installed on tens or hundreds of your production machines. Since this software is not shipped to customers, you may as well leave the debug information in, to help you later with troubleshooting.

For complex programs the size of the debug information (especially for C++ programs) may be considerable, and it may impact your deployment time.

Hopefully you will not need the debug symbols very often. What if you could store the debug information on only one server instead of N?

Turns out you can pull this trick easily with the following bash script (which you can include in your Makefile as a post-build step):

#!/bin/bash
# usage: pass the path of the freshly built binary as $1
DBGFILE="DebugInfoServerNetworkMountedPath/$1.dbg"
if objcopy --only-keep-debug "$1" "$DBGFILE"; then
    #strip -d "$1" # strip debug info only, or strip everything:
    strip "$1"
    objcopy --add-gnu-debuglink="$DBGFILE" "$1"
fi

That's it.

"But how is the debugger going to know how to locate the debug information, since we stripped it out?" one may ask.

Simple. The objcopy --add-gnu-debuglink step creates a special section (.gnu_debuglink) inside the ELF executable, which records the name of the file that holds the debug information, plus a checksum used to verify that the file matches the binary. Both GDB and ZeroBUGS know how to handle it transparently.
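If your build is driven from Python rather than from a Makefile, the same three steps can be sketched as below. The function names are made up for illustration, and unlike the bash script this version names the debug file after the base name of the binary; adjust to taste.

```python
import os
import subprocess


def split_debug_commands(binary, debug_dir):
    """Build the three command lines used by the bash script above."""
    dbgfile = os.path.join(debug_dir, os.path.basename(binary) + '.dbg')
    return [
        # 1. copy only the debug sections into the separate file
        ['objcopy', '--only-keep-debug', binary, dbgfile],
        # 2. strip the binary that gets deployed
        ['strip', binary],
        # 3. record a link to the debug file inside the stripped binary
        ['objcopy', '--add-gnu-debuglink=' + dbgfile, binary],
    ]


def split_debug(binary, debug_dir):
    """Run the steps in order, stopping at the first one that fails."""
    for command in split_debug_commands(binary, debug_dir):
        subprocess.check_call(command)


for command in split_debug_commands('build/server', '/mnt/debuginfo'):
    print(' '.join(command))
```

Keeping the command construction separate from the execution makes it easy to print the commands for a dry run before touching any binaries.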

Wednesday, September 12, 2007

Has RMS Gee-Pee-L-ed Himself?

It just struck me today: the Internet seems awash in Fake Secret Diaries and Blogs.

There is a Fake Steve Jobs, and look: a fake Billy G! We even have a (grin) fake Ballmer (throws fake chairs, and occasionally, some stool).

But guess what. As of today, there is no Fake Diary of Richard Stallman! Wow. Is this because the guy is irrelevant and / or not funny? Shame on you if that's what you think.

My guess is that there is no fake RMS out there because he has placed his rotund image under the GPL (v3).

Someone must have told him "Hey Dick, go GPL yourself".

Saturday, September 08, 2007

It's a Sin!

I am spending some of my spare time these days revisiting algorithms: the computer scientists' bread and butter. (Which we tend not to use explicitly in our daily work, since most useful algorithms are part of one standard library or another).

But I am secretly hoping to lure my son (when the time comes) into the sciences by showing how rather abstract stuff such as "the shortest path in a weighted, directed graph" applies to computer games, artificial intelligence, and whatnot. So I sat down last night to play with Dijkstra's algorithm, and I wrote this short Python program.
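For readers who have not met the algorithm, here is a minimal sketch of it (not the program linked above). It assumes the graph is given as a dictionary mapping each node to a list of (neighbor, weight) pairs with non-negative weights; the sample graph is made up.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]  # (distance so far, node)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float('inf')):
            continue  # stale entry: a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float('inf')):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# made-up weighted, directed graph
graph = {
    'a': [('b', 1), ('c', 4)],
    'b': [('c', 2), ('d', 6)],
    'c': [('d', 3)],
    'd': [],
}
print(dijkstra(graph, 'a'))
```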

It was most frustrating that I spent one hour working on the bulk of the program, and three hours on solving the trigonometry problem of drawing those little arrow bitches at the end of the graph edges. The F-word came up so many times, I was happy that my son was asleep.

But then again, being five months of age he does not know what that means anyway.

"Let Daddy show you how easy trigonometry is. It's all about f***ing sin and cos".

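For what it is worth, the arrowhead problem boils down to one atan2 call, plus a sin and a cos per barb. A minimal sketch, not the code from the program mentioned above; the function name and the default length and spread are made up for illustration:

```python
import math


def arrowhead(x1, y1, x2, y2, length=10.0, spread=math.pi / 6):
    """Return the two barb end points of an arrowhead with its tip at (x2, y2).

    The edge runs from (x1, y1) to (x2, y2); each barb is `length` long
    and is rotated by `spread` radians away from the edge direction.
    """
    angle = math.atan2(y2 - y1, x2 - x1)  # direction of the edge
    barbs = []
    for side in (-spread, spread):
        # walk back from the tip along the rotated direction
        barbs.append((x2 - length * math.cos(angle + side),
                      y2 - length * math.sin(angle + side)))
    return barbs[0], barbs[1]


first, second = arrowhead(0, 0, 100, 0)
print(first, second)
```

Drawing the arrow is then a matter of two line segments, from the tip to each of the returned points.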
Tuesday, September 04, 2007

Studying D Programming Language...

I have recently decided to re-hash (sic) my algorithms, and to study the D Programming Language at the same time. Below you can see a cool heapsort implementation.

As I used the ZeroBUGS debugger to step through the code, I noticed that the DMD compiler does not generate debugging info for the parameters of the template functions (a and b in swap, a in siftdown, and so on). I hope this glitch (and other related bugs) will be fixed before the next D Programming Language conference.


import std.stdio;

void swap(T)(inout T a, inout T b)
{
scope tmp = a;
a = b;
b = tmp;
}

void siftdown(T)(inout T a, int begin, uint end)
{
uint root = begin;
while (root * 2 + 1 <= end)
{
scope child = root * 2 + 1;
if (child < end && a[child + 1] > a[child])
{
++child;
}
if (a[root] < a[child])
{
swap(a[root], a[child]);
root = child;
}
else
{
break;
}
}
}

void heapify(T)(inout T a, uint length)
{
int start = length / 2 + 1;

while (start >= 0)
{
siftdown(a, start, length - 1);
--start;
}
}

void heapsort(T)(inout T a)
{
uint count = a.length;
heapify(a, count);

--count;
while (count > 0)
{
swap(a[0], a[count]);
--count;
siftdown(a, 0, count);
}
}

void main()
{
long[] a = [ 5, 4, 1, 2, 100, 10, 42, 5, 10 ];
writefln(a);

writefln("----- heapsort -----");
heapsort(a);
writefln(a);
}