You Can't Judge a Book... Some Mental Traps in Learning Ruby

Posted by Rick DeNatale Wed, 17 Oct 2007 17:01:00 GMT
Quick, in the Ruby expression:
a[10]

What's the class of a?

Many beginning, and intermediate, Rubyists would instinctively say Array! The real answer is that, given just this code, you don't know.

Recently, on the rails-talk mailing list there was a discussion of this idea from Jamis Buck. Jamis uses alias_method in ActiveRecord::Base so that he can use shortcuts like:

Person[1]

as a shortcut for:

Person.find(1)

This is particularly nice when playing with ActiveRecord in the Rails console, and many seem to think that this should be made part of Rails core. I don't see a real objection.
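To make the idea concrete, here is a minimal sketch of that kind of alias. It does not use ActiveRecord at all: the Person class, its RECORDS hash, and its RecordNotFound error are hypothetical stand-ins, and this is not necessarily how Jamis wired it up.

```ruby
# Hypothetical stand-in for an ActiveRecord model; no database involved.
class Person
  # Pretend table: the row with id 2 has been "deleted".
  RECORDS = { 1 => "Alice", 3 => "Carol" }

  class RecordNotFound < StandardError; end

  # Look a record up by id, raising if it does not exist.
  def self.find(id)
    RECORDS.fetch(id) { raise RecordNotFound, "Couldn't find Person with id=#{id}" }
  end

  # Make Person[id] another name for Person.find(id).
  class << self
    alias_method :[], :find
  end
end

puts Person[1]
puts Person.find(3)

begin
  Person[2]
rescue Person::RecordNotFound => e
  puts e.message
end
```

Person[2] fails in exactly the same way Person.find(2) would, because after the alias they are the same method.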

Now one objection raised by the folks who answer Array to the opening question is that, since ActiveRecord rows can be deleted, Person[2] might fail even though Person[1] and Person[3] work. The reasoning is that the [] makes it "look like an array" and arrays don't act like that.

To me this is a case of judging a book by looking at its cover, something which I strongly suggest needs to be resisted in order to achieve Ruby mastery.

Let's have a look at a couple of "can't see past the cover" traps in Ruby.

Trap Number 1: Confusing Calling Interface with Class

Seeing a[2] and thinking Array is an example of this. Just because something "indexed" by an integer is an Array 99% of the time, that doesn't mean it has to be one. It might be an Array, but it could be a Hash, or an ActiveRecord class, or anything which implements some form of [] method. This is looking at the method invocation as a 'cover.'
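As a quick illustration, here is a sketch in which the same a[2] is sent to five unrelated core objects. The RECEIVERS constant is just a name made up for this example.

```ruby
# Five unrelated receivers that all respond to [] with an Integer argument.
RECEIVERS = [
  [10, 20, 30],          # Array: the element at index 2
  { 2 => "two" },        # Hash: the value stored under the key 2
  "abcdef",              # String: the character at index 2 (on Ruby 1.9 and later)
  lambda { |n| n * n },  # Proc: [] is just another way to call it
  5                      # Integer: bit 2 of its binary representation (0b101)
]

RECEIVERS.each do |a|
  puts "#{a.class}: #{a[2].inspect}"
end
```

Nothing in the loop body tells you which of these you are holding; only the receiver knows what [] means.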

Ruby method signatures are quite a bit more flexible than, or at least different from, those of other languages, something which can trip up someone trying to see Ruby from the perspective of their 'native' language. Consider the varieties of parameters which can be passed to Array#[] or String#[]: rather than being just indexing methods, these can slice and dice the receiver in various ways.
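A few of those varieties, as a sketch (LIST and TEXT are names invented for the example; the comments show what each call returns on Ruby 1.9 and later):

```ruby
LIST = [10, 20, 30, 40, 50]
TEXT = "hello world"

p LIST[1]       # a single index               => 20
p LIST[-1]      # negative: count from the end => 50
p LIST[1, 2]    # start and length             => [20, 30]
p LIST[1..3]    # a Range of indexes           => [20, 30, 40]
p LIST[10]      # past the end: no error       => nil

p TEXT[0, 5]    # start and length             => "hello"
p TEXT[6..-1]   # a Range of indexes           => "world"
p TEXT[/w\w+/]  # first match of a Regexp      => "world"
p TEXT["lo"]    # the substring, if present    => "lo"
p TEXT["xyz"]   # an absent substring          => nil
```

The Regexp and substring forms exist only on String, which is the point: the two [] methods overlap without being the same interface.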

Although different classes might all implement a [] method, and while there are likely similarities or common subsets of functionality, each has its own quirks. This can drive folks expecting type signatures nuts, but it's actually one of the things which gives Ruby its power and efficiency of expression.

Some folks go to great efforts trying to pin down Ruby and make it 'behave' like other languages, but that's a fool's errand which works against Ruby's strengths. I'd recommend that these folks redirect their efforts towards improving their testing skills.

Trap Number 2: Ignoring the Contents

For better or worse, many Ruby classes provide methods which may not work for all instances. For example, the Range (1..10) can be enumerated while (1.0..10.0) cannot. This is because some of the methods in Range expect the beginning and ending values of the range to implement methods such as #succ in order to determine the next value. Further, they implicitly expect that applying #succ successively will get to the end value.

So does this mean that making the range (1.0..10.0) is illicit or bad? Not if you don't need to enumerate it and only want to do things like check whether or not a value is included in the range.
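Here is a small sketch of the difference. The constant names and the enumerable_in_practice? helper are invented for illustration; they are not part of Range.

```ruby
INTS   = (1..10)
FLOATS = (1.0..10.0)

# Integer implements #succ, so the range can be walked.
p INTS.to_a

# Membership only needs comparison, so the Float range is still useful.
p FLOATS.include?(3.5)   # => true
p FLOATS.include?(11.0)  # => false

# Hypothetical helper: try to start iterating and report whether it worked.
# Range#each raises TypeError when the first element has no #succ.
def enumerable_in_practice?(range)
  range.each { |_| break }
  true
rescue TypeError
  false
end

p enumerable_in_practice?(INTS)    # => true
p enumerable_in_practice?(FLOATS)  # => false
```

Both objects are instances of the same class; which methods work depends on what they contain.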

Some folks on ruby-talk have expressed the opinion that the documentation for Range implies that (1.0..10.0) is not a valid range, because the scope of the requirement for elements to implement #succ is ambiguous. Pragmatically, I find the ability to construct enumeration-challenged ranges useful.

Further, if you take the philosophy which leads to restricting ranges to elements which make the ranges enumerable, consider these things:

%w(Now is the time).sort      # Works fine
[:fee, :fie, :foe, :fum].sort # Fails in Ruby 1.8, where Symbol has no <=>
['fred', 42].sort             # Fails: a String and an Integer can't be compared

Now do we really want to outlaw arrays of symbols, or heterogeneous arrays from Ruby usage because they don't sort?
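As a sketch of why such arrays are worth keeping, the following shows a mixed array that cannot be sorted directly but is otherwise perfectly usable. MIXED and the sortable? helper are names made up for this example.

```ruby
MIXED = ['fred', 42]

# Hypothetical helper: report whether Array#sort succeeds.
# Incomparable elements raise ArgumentError; elements with no <=> at all
# (such as Symbols in Ruby 1.8) raise NoMethodError.
def sortable?(array)
  array.sort
  true
rescue ArgumentError, NoMethodError
  false
end

p sortable?(%w(Now is the time))  # => true
p sortable?(MIXED)                # => false

# The array still maps, selects and so on; it can even be ordered
# once you say what to compare.
p MIXED.map { |x| x.class }
p MIXED.sort_by { |x| x.to_s }    # => [42, "fred"]
```

The failure belongs to one method applied to one particular set of contents, not to the array itself.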

What do you think?


Comments

  1. labria about 5 hours later:

    I do use the Person[] in the Rails console, through .irbrc config, but I surely would not use it in the app itself. When someone looks through my code, he expects Name[] to be a constant array, defined somewhere else (I would!). I wouldn’t like to confuse people to save a bit of typing. Name.find() is 4 characters longer, but much more readable.

  2. Jeremy 1 day later:

    The objection that it’s too similar to an array is sort of silly. You could just as easily have a Hash that has a numeric key…

  3. Bob 1 day later:

    “just because 99% of the time when you see something “indexed” by an integer, that thing doesn’t have to be an Array.”

    And just because 99% of the time turning the ignition switch in your car starts it, doesn’t mean you can’t have it turn on the radio every once in a while.

    There is something to be said for clarity and consistency in an interface. Just because you can do it, doesn’t mean that you should.

  4. Pat Maddox 1 day later:

    Another brief example:

    foo = lambda {|a| a.succ}

    foo["c"] # => "d"

  5. Piers Cawley 1 day later:

    Personally, I’d make @Model#[]@ be something that would make @Model[params[:model_id]]@ return either the specified model or throw a RecordNotFound exception if no such model exists.

    Most of the time that’s going to be the same as find, but in cases like @/songs/my-sharona@, not so much.

    But then, I loathe URLs that expose primary keys.

  6. Ryan Bates 2 days later:

    I think you lose much of the convenience of the square brackets in a real app because you’re rarely accessing the id directly. Instead it’s more like:

    Person[params[:id]]

    Nested square brackets, ick. I prefer the original way:

    Person.find(params[:id])

    Also, making it easier to search by id directly like this might encourage it more. So you’d see more code like this:

    Person[project.person_id]
    

    Instead of this:

    project.person
    

    Just a thought.

  7. jmoiron 3 days later:

    I think that there are things that are useful as a convenience (like $() from prototype) that, because of their ubiquity and their vastly improved readability, can bend the rules a bit.

    I strongly believe that abuse of the indexed/associative accessors rarely falls into this category.

    I think it’s important for an API in particular not to reveal half-done abstractions for normal use. As a shortcut for an interactive session; by all means. But if you are implementing an index accessor for use inside program text, then your object better mimic an array to the fullest extent possible; it better have a length, iteration, etc.

    If it doesn’t, then making your object look like an array is misleading, and you introduce more complexity by re-using that syntax than was originally present: your own example proves it.

    Person.find(1)
    

    What’s a person?

    Person[1]
    

    Now what’s a person?

    A reader of your program (or the writers using your API) will have to think “Oh, is @a[10]@ a list or an ActiveRecord object?” every time that indexed accessor is used, and it isn’t always obvious (because it isn’t really always important). If I am implementing methods like these, I usually stop and think: “Am I going to be able to treat this like the type it looks like without having to think about its implementation?” If the answer is no, I don’t do it.

    Writing an interface to functionality that doesn’t require you to think about the implementation of that interface when you use it is real expressiveness. Anything else is just “sugar abuse”.