Hungarian Ducks
Posted by Rick DeNatale Mon, 09 Apr 2007 21:07:00 GMT
It’s interesting how events flow sometimes.
One of my major interests outside of software development is human spaceflight. Today I opened the local paper to the coming events section and discovered that the latest ‘rich cosmonaut’ to buy a trip up to the International Space Station was arriving there today, and that his name was Charles Simonyi. As I’m writing this, he has just become the first Hungarian in space, or at least that’s what I heard him say over NASA TV.
Update: Actually that honor went to Bertalan Farkas who flew on Soyuz 36 in May of 1980.
Now that’s a familiar name. So I spent a little time this morning looking at his article on Wikipedia, and following some links. Thereby hangs an interesting tale which relates to duck-typing.
Charles Simonyi
Simonyi was the software architect at the Microsoft Applications Division back in the early days of Excel and Word. Before coming to Microsoft he was at Xerox PARC, at the same time as folks like Alan Kay, Butler Lampson, Robert Metcalfe, and Dan Ingalls. One of his most famous inventions was so-called Hungarian Notation, a way of encoding ‘type’ information into the names of variables. This became very popular among C programmers following the Microsoft programming bibles.
Now I never much cared for Hungarian Notation (which I’ll refer to simply as Hungarian for the rest of this article). As normally practiced it led to variable names which looked like they’d grabbed onto part of the C compiler’s symbol table entry. The variables would look like iX, and iY (for two different integer variables), or something like pudwGorp for a pointer to an unsigned doubleword.
Eventually Hungarian fell out of favor as more and more programmers realized that it didn’t contribute enough to pull its weight in ugly names.
Hungarian Dialects
What I learned today was that this form of Hungarian is actually a corruption of Simonyi’s original Hungarian which, instead of encoding the machine representation of the variable, encodes the intentional use of the variable. The Hungarian which most people saw was that used by the Microsoft Systems Division. Those in the know refer to this as Systems Hungarian to distinguish it from Simonyi’s original notation which is known by insiders as Apps Hungarian.
Here, from Simonyi’s original explanation of Hungarian Notation, is how variables (or, as he called them, quantities) should be named:
1. Quantities are named by their type possibly followed by a qualifier. A convenient (and legal) punctuation is recommended to separate the type and qualifier part of a name. (In C, we use a capital initial for the qualifier as in rowFirst: row is the type; First is the qualifier.)
2. Qualifiers distinguish quantities that are of the same type and that exist within the same naming context. Note that contexts may include the whole system, a block, a procedure, or a data structure (for fields), depending on the programming environment. If one of the “standard qualifiers” is applicable, it should be used. Otherwise, the programmer can choose the qualifier. The choice should be simple to make, because the qualifier needs to be unique only within the type and within the scope—a set that is expected to be small in most cases. In rare instances more than one qualifier may appear in a name. Standard qualifiers and their associated semantics are listed below. An example is worthwhile: rowLast is a type row value; that is, the last element in an interval. The definition of Last states that the interval is “closed”; that is, a loop through the interval should include rowLast as its last value.
3. Simple types are named by short tags that are chosen by the programmer. The recommendation that the tags be small is startling to many programmers. The essential reason for short tags is to make the implementation of rule 4 realistic. Other reasons are listed below.
4. Names of constructed types should be constructed from the names of the constituent types. A number of standard schemes for constructing pointer, array, and different types exist. Other constructions may be defined as required. For example, the prefix p is used to construct pointers. prowLast is then the name of a particular pointer to a row type value that defines the end of a closed interval. The standard type constructions are also listed below.
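To make those rules a little more concrete, here is a minimal sketch in C of how the names rowFirst, rowLast and prowLast from the quote might be used. The ROW typedef, the two function names and the plain counter names are my own assumptions for illustration, not anything taken from Simonyi’s paper.

```c
#include <assert.h>
#include <stddef.h>

/* "row" is the type tag: a spreadsheet row number. That it happens to
   be stored as an int is an assumption made for this sketch. */
typedef int ROW;

/* Count the rows in the closed interval [rowFirst, rowLast].
   The qualifier Last means the loop must include rowLast itself.
   CountRows is a hypothetical name. */
int CountRows(ROW rowFirst, ROW rowLast)
{
    int count = 0;
    ROW row;

    for (row = rowFirst; row <= rowLast; row++)
        count++;
    return count;
}

/* The prefix p constructs a pointer type, so prowLast points at a row
   value that marks the end of a closed interval.
   ExtendInterval is a hypothetical name. */
void ExtendInterval(ROW *prowLast, int extraRows)
{
    assert(prowLast != NULL);
    *prowLast += extraRows;
}
```

Nothing in these names says ‘int’. What they say is that the values are rows, and that rowLast is inside the interval rather than one past it.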
State Your Intentions
Note that Simonyi here is really talking about the intentional use of the variable rather than its representation as a bag of bits.
While this isn’t really a duck-typing system, it relates in that it puts the usage of a variable ahead of its machine representation. Since the host language here is C, the notation still has to talk about things like pointers, and whether or not the pointer is being used as an array.
So in Apps Hungarian, instead of iX and iY, those variables would likely just be x and y if they were being used for x and y positions, or row and col for spreadsheet coordinates. Other names might be wrx for an x position relative to the window coordinate system, or wrxWidget1 for a particular window-relative x coordinate.
And the two dialects are incompatible. As explained by Rick Schaut:
For those of us who have grown accustomed to Simonyi-style HN, the whole “i” means “int” prefix thing is doubly horrible, because the “i” prefix in Simonyi-style HN is commonly understood to be an index into an array. If I put a character into an int, I’m not going to call it iCh. I’m just going to call it ch. To me, an ich is an index to a character in an array of characters.
In comparison to Systems Hungarian, this seems to be much more useful information than a shorthand encoding of the underlying C datatype.
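Schaut’s example can be sketched in a few lines of C. This is only an illustration under my own assumptions: the function name IchFind and the parameter name text are hypothetical, while ch and ich follow the usage he describes.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index (ich) of the first occurrence of the character ch
   in the NUL-terminated array text, or -1 if it does not occur.
   ch is an int that holds a character, yet it is still named ch,
   not iCh: the name records the role, not the C datatype. */
int IchFind(const char *text, int ch)
{
    int ich; /* index to a character in an array of characters */

    assert(text != NULL);
    for (ich = 0; text[ich] != '\0'; ich++) {
        if (text[ich] == ch)
            return ich;
    }
    return -1;
}
```

In Systems Hungarian both ch and ich would likely have ended up with an i prefix, since both are ints, and the distinction between a character and a position would have been lost.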
This schism between the two dialects came about because Simonyi used the word type where a word such as usage or role would have better represented the idea. Perhaps this came from the fact that Simonyi’s native language was, not surprisingly, Hungarian rather than English. It was actually the technical writers in the Systems Division who created Systems Hungarian by editing earlier documentation of the concept without fully understanding it.
It seems to me that this is the same confusion which confounds so many discussions of duck-typing in dynamic languages.
Further Reading
Another Microsoft document on Hungarian naming conventions by Doug Klunder.
More explanations of the difference between Apps Hungarian and Systems Hungarian have been given by Larry Osterman and Rick Schaut.
Joel Spolsky also wrote about this. Although I often disagree with Joel on specific topics, he makes some interesting points. One thing that struck me in this article was his use of the Hungarian prefixes ‘s’ and ‘us’ to represent safe and unsafe variables in a web application, to indicate which variables should be distrusted and escaped before being used to render HTML. On the surface this seems like a nice idea if applied consistently, although I’d still want to make use of mechanisms like Ruby’s tainting of unsafe objects to give real safety.
You see, I just don’t think that source-code-time checking is powerful enough to trust completely.
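For what it’s worth, here is a rough sketch in C of how the ‘s’ and ‘us’ convention might look. The function EscapeHtml, its signature and the small set of escaped characters are my own assumptions for illustration; this is not code from Joel’s article, and it is not a complete HTML escaper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the unsafe string usText into sOut, escaping a minimal set of
   HTML-significant characters. sizeOut is the capacity of sOut in
   bytes. Returns 0 on success, -1 if sOut is too small.
   EscapeHtml is a hypothetical helper written for this sketch. */
int EscapeHtml(const char *usText, char *sOut, size_t sizeOut)
{
    size_t used = 0;

    assert(usText != NULL && sOut != NULL);
    for (; *usText != '\0'; usText++) {
        const char *piece;
        char single[2];
        size_t len;

        switch (*usText) {
        case '&': piece = "&amp;";  break;
        case '<': piece = "&lt;";   break;
        case '>': piece = "&gt;";   break;
        case '"': piece = "&quot;"; break;
        default:
            single[0] = *usText;
            single[1] = '\0';
            piece = single;
            break;
        }
        len = strlen(piece);
        if (used + len + 1 > sizeOut) /* keep room for the NUL */
            return -1;
        memcpy(sOut + used, piece, len);
        used += len;
    }
    if (used + 1 > sizeOut)
        return -1;
    sOut[used] = '\0';
    return 0;
}
```

With this in place, a line that passes a us-prefixed value straight to the code that renders the page looks wrong on sight, while a value that has been through EscapeHtml into an s-prefixed buffer looks right. The compiler enforces none of it, which is exactly my reservation.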

As an avid Microsoft fan, computer programmer and space aficionado…naturally I have been following the recent pursuits of billionaire Dr. Charles Simonyi, one of my greatest techie idols. For those of you who do not know, Charles Simonyi is one of the founding fathers of Microsoft and the author of Microsoft office. He lives on a 90 million dollar yacht, and seems to date the sexy lifestyle guru Martha Stewart. Recently, Charles paid the Russian Space Agency a rumored 25 million dollars to become the world’s 5th space tourist. I was curious to watch the build up to this event. This is when I realized that Charles had made a serious blunder on his way to race to the stars. He hired an opportunistic, phony by the name of Susan Hutchinson to run his foundation and behave as his spokesperson. Why might you ask would I care? Well, after listening to her for 2 seconds and seeing her picture, I became disappointed with the fact that Dr. Simonyi, a brilliant scientist, would higher a non-nerdy person to speak for him, I doubt this lady can even turn on a computer. Delving into her background, I uncovered that previously, she had been a local Seattle newscaster (I have a feeling she was a Weather Girl). She somehow found herself in politics recently and began exploring running for Senate. It seems that the voters of Washington State dismissed her in an instant and caller her out as an “Uninformed candidate” (another term for “Moron”). If I have to listen to this women speak on his behalf again, I will stop following his journey. I can’t sit back and watch an unintelligent person who has no idea what a CD-ROM drive is, let alone space travel try to pretend that she has a clue. Dr. Simonyi, stick with your own kind! Be Proud! Next time, higher a geek.
Not much on ducks, but interesting example of MS completely changing anything they touch but not for any obvious reasons other than to make it look like MS had been there.
I tried a pseudo-Hungarian notation when I was studying C (oh the pain). In the end, it really did just make a bunch of useless sigils that decreased code legibility. Longer, descriptive names work better, ducks or no.
I’m liking the ducks in Ruby. The PHP way was almost pointlessly marking all variables with $ while still ducking. Perl has a bit of the too much sigil action too. Ruby is freeing me from too many sigils.
@Celeb Geeker Wow, an elitist nerd who can’t spell to save his life and actually likes MS (wtf?!). Next time, hire a geek who doesn’t proclaim his or her own intelligence.