Skip to content
May 15, 2012

Loebner Contest makes Turing Roll in his Grave

I got up early this morning to watch a live streaming of the Loebner Prize contest, a formal enactment of a Turing test where human judges must hold chat conversations with humans and chat bots simultaneously, and try to determine which one is the human and which one is the bot. Hosted at Bletchley Park this year, the contest was divided into four rounds where a total of 14 entries were to be judged.

Unfortunately, there were several technical issues. The Loebner website was poorly laid out and I had a hard time finding the live streaming URL to begin with. Once I got there, I found the web streaming interface was rather ineffective. Conversations between one judge, one human and one bot were divided into panels of four unlabeled windows where characters appeared as they were being typed, making it difficult at first glance to see what part of which conversation was happening where.

There were also connection and text formatting issues, with some text never appearing, connections being broken, and some chat bots never producing newlines, which gave them away pretty quickly. Apparently, the people behind the Loebner contest chose to design their own chat protocol, probably not a wise choice. They would likely have been much better served by a private IRC server.

The bigger problem, of course, is that most of the entries were absolutely terrible. It seems that none of the judges were “fooled” for more than a minute. Most bots were completely unable to follow a conversation in any way. Some kept asking random questions over and over. Most couldn’t answer any questions from the judges, to the point where some judges decided to simply ignore the bots. It’s rather appalling to see that even today, in 2012, several of these bots do worse than plain old Eliza. I’m fully aware that programming a competent chat bot is no easy feat, but it seems like this competition simply hasn’t attracted the cream of the crop.

After the thrill of watching Watson spank humanity’s ass on Jeoparty, I’m convinced that with modern technology we could do much better than this, if only someone just tried. I wish that the people at Cycorp would try and program a chat bot and show us what they can do with that fancy knowledge base and NLP software of theirs. I guess the truth is that there is too little market value for good chat bots, and the people who have the knowledge to implement a serious contender are busy doing something that earns them money. Turing is rolling in his grave right now.

May 8, 2012

We Should be DNA Sequencing Endangered Species

Many animal and plant species alive today are at risk of soon becoming extinct. Global warming, excessive hunting and destruction of natural habitats are largely to blame for this. Some efforts are already underway to save some of these species, but sadly, it may be too little too late. Taking into consideration the current state of political affairs, it seems very unlikely we’ll be able to stop global warming within the next 20 to 40 years, and by then, many species will be going extinct. Some have already disappeared. This is a sad and likely irreversible loss to the biodiversity of this planet.

It seems that, at this point in time, there isn’t enough good will to save the many species that will likely become extinct soon. Doing so would require fundamental changes to the way our economic and political systems function, the kind of changes that likely won’t occur during this century. It’s unlikely we’ll be able to protect the natural habitats of all endangered species, and not all of them can be kept in a zoo, but I still believe we should be doing what we can to save these species anyway.

Genetic sequencing is becoming increasingly affordable and will soon even be within the reach of average citizens.  As such, it may be wise to begin efforts to sequence the DNA of endangered animals and plants. Doing so would allow scientists to study these animals, but it would also ensure that even if all members of the said species were to go extinct, something of that species would survive. It’s very likely that in the not-so-distant future, we may have the technology to bring extinct species back to life based purely on DNA sequences.

Practically speaking, this may be the simplest, safest and most economically realistic way to preserve endangered species. All that is required would be enough budget to allow scientists to sample the DNA of a few representatives of endangered species and digitally publish this material.

April 23, 2012

MusicToy – Music Made Simple

This weekend, I have been working on MusicToy, a minimalist grid sequencer app. This is a very basic music editor based on an editable grid, with each grid cell corresponding to a specific note or sample being played at a given time. It has drum samples and notes in the minor pentatonic scale. The idea was to make a tool that would encourage musical experimentation with no prior musical knowledge required, so that anyone can use the tool, no training required. The pentatonic scale is easy to compose with, as most sequences of notes in the scale sound musical. It’s been used in children’s songs, pan flute music and Darude’s sandstorm, among other things.

Interesting features include the ability to edit patterns as they are looping and the possibility of sharing links to patterns you create. MusicToy runs 100% on the client-side, and as such, it isn’t possible to save patterns on the server side. However, I discovered that it’s actually possible to generate a unique URL for each pattern by modifying the anchor tag part of the address on the fly to encode the said patterns. This is a wonderful trick I will surely use again in future web apps I make!

The app is implemented using JavaScript/HTML5. It uses the canvas element and the Web Audio API. It should work in recent versions of Chrome and possibly Safari (let me know).  Go ahead and try playing with MusicToy or check out some patterns made using it:

Example 1
Example 2
Example 3
Example 4
Example 5
Example 6
Example 7
Example 8
Example 9

Feedback about bugs and possible improvements is welcome :)

April 20, 2012

The Indirection Problem

In this post I’m going to talk about a performance disadvantage inherent to most (if not all) current implementations of dynamic languages and, more generally, languages that make use of garbage collection. This problem affects JavaScript, but also Python, Lua, Scheme and even Java to some extent. Surprisingly, this performance issue is one that is seldom talked about. When you ask someone why JavaScript code is slower than C, they will mention dynamic typing, run-time checks, number types and other sources of overhead, but they will seldom mention indirection.

What do I mean by indirection? Why is it a problem? In JavaScript (and most other dynamic languages), objects, arrays, functions and strings are manipulated through references values. Concretely speaking, all JavaScript objects are allocated on the heap and manipulated through pointers. This means that accessing object properties necessarily means going through one or more levels of indirection. You might think that this isn’t such a big problem. Most objects in C/C++ are also heap-allocated, after all. There is one big difference, however: in C and C++, the programmer can intentionally nest (inline) objects inside of each other. In JavaScript, this cannot be done manually.

As a simple example, suppose you wanted to define and instantiate an object representing a car and its 4 wheels, with each wheel having angles of direction and rotation. In C, you could do the following:

struct Car
{
    char makeModel[128]; // String for the car's make and model
    struct Wheel
    {
        double direction; // Wheel direction angle
        double rotation;  // Current angle of rotation
    } wheels[4];
} car;

In JavaScript, this could be done as follows:

// Car object literal
var car =
{
    makeModel: 'make and model string',
    wheels: [
        { direction:0, rotation:0},
        { direction:0, rotation:0},
        { direction:0, rotation:0},
        { direction:0, rotation:0},
    ]
}

Both of these accomplish the same goal. In either language, one can write car.wheels[0].direction = 0.32, for example. In practice, there are several differences as to how this will be implemented, however:

  1. In C, the car object is explicitly allocated on the stack. In JavaScript, it is implicitly allocated on the heap.
  2. The C object, including the wheel objects and the rotation angles, is one contiguous piece of memory. This is not the case in JavaScript. Modern JS engines will allocate a car object, a string object, an array object to store the wheels, and individual objects for each wheel. The JS car object only contains pointers.
  3. In C, the rotation angles are contained within the wheel objects, in the same contiguous piece of memory.  Under Google V8, this would not be so, the rotation angles, if they are floating-point, would each be stored in objects of their own.

These differences have a performance impact. In C, accessing a wheel rotation angle only requires accessing the stack pointer at some fixed offset. There is only one level of indirection. In JavaScript, there is indirection from the car variable to the car object, from the car object to the wheel array, from the wheel array to the wheel objects, and potentially from the wheel objects to the rotation angles. Concretely, this means:

  1. Allocating all of the needed JavaScript objects will be slower. There may need to be multiple calls to the allocator. Object headers and pointers may need to be initialized.
  2. Accessing object properties will also be slower in JavaScript. This is not so relevant in this toy example, but you could imagine that if we were updating a list of car objects in a game engine loop there could have a significant impact on performance.
  3. The JavaScript representation will use more memory. In addition to storing the wheel objects, we have to store a header for these objects and pointers to the said wheels. Memory alignment constraints might impose additional memory waste as well.
  4. Because the JavaScript representation is larger, it will not fit in as few cache lines as the C representation, this can cause further performance losses when dealing with large sets of objects.
  5. In Google V8, incrementing a wheel rotation angle would cause a new floating-point object to be allocated.

I’m well aware that there are ways to deal with some of these issues in JavaScript. Perhaps the objects could be represented differently (we could have avoided the array). Perhaps we could have used the new Float64Array extension for rotation angles. The point is simply that often, the intuitive JavaScript representation performs worse due to the various reasons I’ve outlined. The programmer can do some things to recoup part of the performance, but this is often at the cost of readability. Ideally, one would want the (JIT) compiler to be able to mitigate or eliminate these performance issues without the code having to be changed.

One of the main reasons why I wrote this post is because I know that compilers can do more in this regard, and more would probably be done if compiler writers were more aware of these issues. The Java VM can do escape analysis and automatically allocate some objects on the stack when this is found to be safe. Papers have also been published, on the topic of automatic object inlining, that is, automatically allocating child objects inside of parent objects when possible. Such optimizations can help eliminate indirection and memory usage overhead. However, these techniques are not widespread in modern JIT compilers, and more research could be used to refine them.

April 11, 2012

Why LISP Never Took Off

lambda

LISP is a programming language designed by John McCarthy in 1958 while he was at MIT. It’s one of the oldest high-level programming languages. The name LISP is an acronym for LISt Processing, because linked lists are the core data structure of the language. After its inception, the language quickly gained traction in AI research circles because of its simplicity, its mathematical elegance and its expressiveness. Today, the term LISP usually refers to languages derived from LISP, the two most famous of those being Scheme and Common Lisp.

I’ve been exposed to Scheme and other LISP variants on several occasions during my studies. I will freely admit that I find LISP languages interesting and conceptually elegant. The Scheme philosophy is one of “less is more”. The language, at its core, is very simple, but provides powerful facilities for expanding the language itself, in the form of macros. Scheme is also very consistent and logical. Its design has clearly been given careful thought.

Scheme programmers seem to love the language. They praise its many qualities: its elegant syntax, its conciseness, its functional design, its expressiveness, the productivity gains it provides. It’s a language so powerful, it allows you to redefine the language itself through the use of macros. Scheme programmers love the language so much, they almost make it sound like strong AI would have been achieved years ago, if only everyone programmed in Scheme.

The question when it comes to Scheme and other LISP variants, however, is how come no LISP language ever really caught on? If everyone who gives the language any serious consideration loves it, how come so few people use it? Why are we stuck with so many flawed, inelegant, unnecessarily complex mainstream languages when there’s a much cleaner, elegant and expressive alternative right there in the form of Scheme?

Rudolf Winestock attempted to answer this question in his essay entitled The Lisp Curse. His answer: LISP is just too powerful for its own good. Its expressive power is its ultimate drawback. I don’t believe expressiveness is much of a downside. I agree with Winestock, however, that one of the real issues with LISP is the fragmentation of the community. In this post, I intend to look at some more concrete reasons why LISP languages, in particular Scheme, never caught on, despite their many strengths.

Fragmentation

When it comes to programming in C, there’s only a few big name compilers. GCC and Clang for Mac and Linux. Visual C++ on Windows. These are all fully-featured and competent compilers. Choosing a C compiler isn’t very difficult. They’re all rather good and usually rather straightforward to install and use. When it comes to programming in Scheme, however, the choice isn’t so clear. There’s many Scheme implementations out there: Bigloo, Larceny, Chicken Scheme, Guile, Gambit Scheme, MIT/GNU Scheme, Chez Scheme, Scheme48, Hop and more. None of these stands out as the ideal choice. They have various degrees of support for the language, widely varying performance, and installing some of these on your platform might well force you to take a stroll through dependency hell.

Newcomers might easily be scared away from the language before they even get to try it. They might try to find a Scheme implementation and find it’s unmaintained and broken, or that it’s a pain to install on their platform. They might also discover that the implementation they chose doesn’t fully support the latest Scheme specification, but rather some more-or-less conformant dialect of Scheme with undocumented features. You might think I’m just making up these scenarios, but they are actually part of my initial contact with Scheme. Although I haven’t tried Common Lisp myself, I believe it suffers from a similar problem.

Less is Not More

Scheme follows a minimalist design philosophy. It provides a small set of constructs as well as macros, and the programmer is expected to extend the language to fit his or her needs. In theory, this sounds very nice. A minimalist, “less is more” type of design should mean less room for incompatibilities among implementations. It should mean the language is easier to implement and perhaps help it remain more semantically consistent.

The problem is that Scheme provides perhaps too little. The C++ standard library provides you with standard containers, such as hash maps and binary trees, as well as algorithms operating over those containers, such as sorting algorithms. It also provides basic support for multithreading. The latest Scheme standard (R6RS) doesn’t specify any of this. It doesn’t even specify basic functions for operating on strings or even getting the integer code for a character value. These are very basic features that many, if not most, programmers will need at some point.

Most Scheme implementations have chosen to extend the language to compensate for what the standard lacks. The issue here, however, is that different implementations may implement these unspecified features in different ways. This causes further problems. Those who want to write code in Scheme will often be forced to use non-standard features specific to whichever implementation they are working with. Sharing code then becomes problematic, because other implementations may or may not support the features needed. The limitations of the Scheme standard make it difficult to implement portable Scheme libraries.

Batteries Not Included

One of the biggest frustrations I’ve had when trying to write Scheme code is the poor documentation. It’s not that easy to learn what you need to know about the language. Google searches will often yield irrelevant results, or results that are specific to one Scheme implementation. The official language specification document is probably the best source of documentation, which is rather sad. What if you need to use implementation-specific features? Well, the Gambit Scheme page has a long list of undocumented extensions.

Perhaps the Scheme community suffers from a lack of desire to document. From what I’ve seen, it’s fairly typical for Scheme code to have very few comments (if any) and be full of one-letter variable name. There seems to be a prevalent assumption that the code is self-evident and doesn’t need any explaining. This assumption is rather strange, because Scheme is probably the language that most needs documentation out there. In a language where you can use macros to (re)define language constructs, you should never assume that anything is self-evident.

The (Lack of) Speed

Scheme is dynamically typed. It’s very much a dynamic language. Just like Python and JavaScript, Scheme allows you to dynamically change the type of variables at run-time. This makes it rather tricky to compile Scheme code efficiently. There’s been lots of research in the past few decades on how to implement efficient JIT compilers for dynamic languages. From the work on SELF, to Mozilla’s work on Tracing JITs, to all the work done on PyPy, to all the work that’s been done in the Java world.

I’ve got bad news for you. There are no JIT compilers for Scheme. Okay, there may be some experimental Scheme JIT out there, but as far as I know, all the major Scheme compilers compile code ahead of time. This is sad, because JIT compilers are an ideal choice for dynamic language. They allow you to take advantage of type information that’s only available at run-time. The Scheme community seems to think it knows better, and has decided to stick with static compilation, disregarding the last 20 years of research into dynamic languages.

By default, most Scheme and Common Lisp use numerical towers to represent numbers with arbitrary precision. This makes it difficult to optimize Scheme code, because overflow checks need to be inserted into the generated code.  Non-integer numbers are represented by arbitrary precision rationals if possible, which make for better precision, but terrible performance. In order to get around this, special operators need to be used to operate on machine-precision integers and floating-point numbers. Scheme has taken the view that everyone should pay for the performance cost of arbitrary precision arithmetic, rather than only those who need it, because this is more mathematically elegant.

Elitism

Some Scheme programmers seem to believe that Scheme is God’s gift to the world of programming and that it’s the right tool for every use. Don’t get me wrong, I like many things about Scheme (and other LISPs), but I’m a pragmatist, and I just don’t find it all that convenient to use in practice. I also fail to understand why anyone would think Scheme is a good first language to teach a new programmer. It may be mathematically elegant, but it’s certainly not the most intuitive language out there, or the best choice for new programmers, when there’s alternatives like Python.

My first exposure to LISP was in a first year undergraduate class on programming language semantics. In this class, we had to do the classic experiment of implementing an interpreter for a simplistic LISP dialect in Scheme. Most of my classmates, who had little programming experience, were rather confused. I was mostly confused because I didn’t really get the point. In order to understand the interpreter we were writing, you needed to already understand the level of language semantics the class was teaching. In the end, the interpreter in question was trivially simplistic and predictably inefficient. They essentially taught us how you should never implement an interpreter, if you care about performance to any degree. Talk about unrewarding programmer masturbation.

April 2, 2012

Inefficient Numerical Operators

My Ph.D. work involves the design and implementation of a JavaScript JIT compiler. As a result, I’m quite familiar with the intricacies (and the many inefficiencies) of the language. The ECMAScript 5 specification specifies, among other thing, how each JavaScript primitive operator should be implemented. That is, it provides algorithms, in English prose, to specify the behavior of addition, subtraction, multiplication, division, and all the others. Unfortunately for JavaScript (and all its users), these algorithms are long, complex and unintuitive. The algorithms in question involve dynamic dispatch, conversions from strings to integers and other nasty surprises.

As a programmer, I truly wish JavaScript’s operators had simpler semantics. All JavaScript arithmetic operators can take any type as input, they will never throw an exception, and will possibly convert strings to integers. Any Vulcan would surely frown when realizing that in JavaScript:

'4' + '4' is '44'
But...
'4' * '4' is 16

null + 1 is 1
But...
undefined + 1 is NaN

{} + [] is 0
But...
[] + {} is NaN

The rules behind JavaScript’s operators are unintuitive and unnecessarily complex. Let’s all be honest here. There is absolutely no need for arithmetic on values that aren’t numbers, an exception should be thrown if this ever occurs, and JavaScript would have been better off with a dedicated string concatenation operator, as Perl has. However, lack of intuitiveness isn’t the only problem here.

As a compiler writer, and someone who studies type analyses (i.e.: type inference), I find that the bigger problem is inefficiency. More complex algorithms behind primitive operators means using these operators is, in general, more expensive. Dynamic type checks need to be added to handle multiple possible cases, which is inefficient. Some of these tests can be optimized away, but some can’t. We’re all paying a cost for JavaScript features we hardly ever use or need.

Furthermore, the fact that all JavaScript operators can take any type as input and never throw exceptions means that type inference of JavaScript is slightly harder. If JavaScript operators were restricted to numbers, a type inference engine could take advantage of the fact that an expression such as x + y would imply that x and y must be numbers. Hence, subsequent uses of those variables could be optimized. This is obviously not the case: in JavaScript, x + y doesn’t guarantee anything about x and y, because the programmer could be performing an addition on booleans or objects or arrays or strings or any other types.

Another gripe I have with JavaScript is that all numbers are assumed to be 64-bit floating-point values (doubles). Integer arithmetic is still faster than floating-point arithmetic, and so all modern JavaScript JITs, behind the scenes, will try to represent numbers as integers when possible, and fall back to floating-point values when an overflow occurs or a fractional value is needed. This incurs performance costs because it means much of JavaScript’s arithmetic operations need to include conditional tests checking for overflow. Again, some of these tests can be optimized away, but many can’t. In general, proving that numerical values won’t overflow in a type analysis only works in a limited number of cases.

Why did the creator of JavaScript make this choice? Probably because he presumed it would be simpler for new programmers. It’s a common belief that integer types, integer overflow and modular arithmetic in C are hard concepts to understand. Floating-point numbers are of course necessary, but they are certainly not easier to understand than integers. Integer values wrapping around on overflow might seem a little puzzling to some people, but I don’t see how dealing with NaN, infinities, negative zero, rounding and precision errors is more intuitive.

I believe that a better design would have been to have distinct integer and floating-point numerical types in the language. The better choice, for performance, would be to have integers operations be closed over integers. That is, adding, subtracting, multiplying and dividing integer values should only ever yield integers. If a conversion to floating-point is needed (which I believe is relatively rare), it should be done explicitly. Here are some examples of what this could look like:

var x = 1;   // Integer
var y = 2;   // Integer
var z = 4.0; // Floating-point

var v0 = x + y        // v0 === 3
var v1 = x + y + z;   // v1 === 7.0
var v2 = x / y        // v2 === 0
var v3 = x / z        // v3 === 0.25
var v4 = Float(x) + y // v4 === 3.0

Such a design would give programmers access to both 64-bit integers and 64-bit floating-point values while making it possible for a type inference analysis to eliminate most type checks on arithmetic operations. Some might claim this would make the language more confusing, but I don’t think this is true. This system wouldn’t be any more complex than Python or Scheme’s numerical tower. It would actually give programmers more control over what their code is doing, without requiring unwieldy type annotations.

The more I learn about JavaScript, the more I think I want to design my own dynamic language someday. I strongly believe that there is a way to design a dynamic language that is efficient, practical, and suitable for both low-level systems programming as well as rapid prototyping. In the end, it all comes down to compromises in the design of the language. In my opinion, simpler semantics often means more intuitive and easier to optimize.

March 30, 2012

Teaching Computers about Music

I recently took an interest in music. More specifically, algorithmic composition, that is, designing computer programs that can compose music. I made some attempts at implementing such programs, based on the idea that it should be possible to come up with a set of rules for generating music that sounds pleasant. What’s pleasant to one person might not be pleasant for everyone, but I thought I could easily conjure rules for generating music that was pleasant to me.

Unfortunately, the more I looked into music theory, the more problematic this became. There are some known, seemingly loose and imprecise, “rules” associated with classical music, but it seems to me these “rules” are in no way sufficient to derive an algorithm. Furthermore, I find that music theory largely fails to account for modern music. Some people might tell you that pop and dance music are trivial once you know about classical music. I would argue that this point of view is arrogant and uninformed. Modern electronic music has a much stronger focus on rhythm and unpitched sounds. It has integrated musical elements from non-european cultures and features instruments and composition techniques that didn’t exist before the 1960s.

A few days ago, I started thinking about ways to flip the algorithmic composition problem on its head. I’m unable, at this point, to write out elaborate rules to compose musical pieces by hand. It seems like an overly complex, tiresome process. I do, however, have my own musical tastes. If you show me a snippet of music, I can tell you whether it sounds musical or dischordant to me. My idea then, is to have an algorithm that composes random musical patterns, and to have a human judge tell the algorithm what sounds best.

I was able to prototype a simple version of this idea in about 6 hours of programming time. The program I implemented essentially “evolves” a simple musical grammar based on the user’s tastes and renders the sounds using a synthesized bass patch. My girlfriend and I both tried teaching the program about our tastes.

Here are some samples from a single run I did last night. The training took about 15 minutes:

Example 1
Example 2
Example 3

Here is one sample from a run supervised by my girlfriend:

Example 4

The first 3 examples sound similar because they are based on the same grammar derived during one run. The algorithm is capable of generating rather different sounds, however, based on the choices made during a given run and your own personal preferences.

So far, I find the  results very encouraging. I think there is clearly potential in the idea that someone can teach an algorithm how to compose based on their own tastes, even if they themselves do not have formal knowledge of music theory. I plan to explore this concept further. I will be trying to generate more complex music that integrates a drum kit and variable-length notes.

Follow

Get every new post delivered to your Inbox.

Join 249 other followers