Our Kind of Ducks, Odd Ducks, and Trained Ducks

Posted by Rick DeNatale Mon, 30 Apr 2007 20:01:00 GMT

A couple of days ago, someone asked on ruby-talk how to implement a cyclic version of Enumerable#each_cons. In words, he was looking for a method:

class Array
   def each_cycle(size, start=0)
      #insert implementation here
   end
end

# So that this:
%{a b c d e}.each_cycle(2) {|cyc| p cyc}

#would print:
["a", "b"]
["b", "c"]
["c", "d"]
["d", "e"]
["e", "a"]

As usual, several helpful rubyists responded with suggestions, one of which leads me to YADTBE (Yet Another Duck Typing Blog Entry)

Here’s one suggested solution:

module Enumerable
  def each_cycle(window, start=0)
    (start...length+start).each do |i|
      yield((i...i+window).map {|n| self[n % length]})
    end
  end
end

Nice solution, but as soon as I saw it, I saw a problem.

While this will work if the method were defined in the Array class, it doesn’t work in general for Enumerables. Let’s write a Kernel method to try some different Enumerables.

def try(enum)
  puts "Trying this #{enum.class} ==> #{enum.inspect}"
  begin
    enum.each_cycle(2) {|cyc| p cyc}
  rescue => ex
    puts ex
  end
  puts
end

Is it a duck?

Okay, now that we’ve done that, let’s try some enumerables:

try(%w{and a one and a two})

# Results in
Trying this Array ==> ["and", "a", "one", "and", "a", "two"]
["and", "a"]
["a", "one"]
["one", "and"]
["and", "a"]
["a", "two"]
["two", "and"]

So this works if the receiver is an Array.

Not the kind of duck we’re looking for

Ranges are Enumerable, so let’s try one.

try(1..5)
# Results in
Trying this Range ==> 1..5
undefined local variable or method `length' for 1..5:Range

Aye, there’s the rub, the each_cycle method assumes that the receiver implements both a length method, and a [] method.

This is the kind of requirement which leads some “strong ducktypers” to use a respond_to? test. But does that really work?

Odd ducks

Well, Hash implements both of those methods, so it would pass the ‘responds_to?’ test. Lets try one:

try(Hash[*(%w{zero, one, two, three}.zip((0..3).to_a)).flatten])
# Which prints:
Trying this Hash ==> {"two,"=>2, "three"=>3, "one,"=>1, "zero,"=>0}
[nil, nil]
[nil, nil]
[nil, nil]
[nil, nil]

Hmmm, not exactly what we are looking for. So it’s not enough to just be an enum with those two methods. We’ve just exposed another requirement of Enumeration#each_cycle, that the receiver be indexed by integers. We’re getting all nils because all of those nicely generated indices don’t have corresponding values in the hash, and hash is lenient about indices (i.e. keys) which aren’t keys.

So what happens if we try a hash which does have integer keys:

try(Hash[*((1..3).to_a.zip(%w{one, two, three}).flatten)])
#now we get
Trying this Hash ==> {1=>"one,", 2=>"two,", 3=>"three"}
[nil, "one,"]
["one,", "two,"]
["two,", nil]

Now, that sort of works, but it exposes yet another requirement, that the indices of the enum be in the range (0…enum.size).

I dub thee Sir Duck

If we use a slightly different hash:

try(Hash[*((0..3).to_a.zip(%w{zero, one, two, three}).flatten)])
# We get
Trying this Hash ==> {0=>"zero,", 1=>"one,", 2=>"two,", 3=>"three"}
["zero,", "one,"]
["one,", "two,"]
["two,", "three"]
["three", "zero,"]

So for a hash which looks enough like an Array, the code in Enum actually works.

Teaching a range to be our kind of Duck

One final example. Let’s revisit the case of a range. We can easily train the range class, or using a singleton class an individual instance of range how to support the each_cycle method:

class Range
  def length
    (last - first) + (exclude_end? ? 0 : 1)
  end

  def [](i)
    first + i
  end
end

# Now if we try this again:
try(1..5)

# We get:
Trying this Range ==> 1..5
[1, 2]
[2, 3]
[3, 4]
[4, 5]
[5, 1]

Okay Rick, what’s your point?

Although this exercise might seem a little contrived, it points out that in a dynamically ‘typed’ language like Ruby, whether or not a particular object is suitable for a particular use can depend not only on it’s class, or which modules are mixed in, or whether or not it responds to a particular set of messages, but on how it responds to those messages.

Using my metaphor of casting for a role in a play, finding the right actor isn’t just a matter of finding someone who went to a particular acting school, (i.e. has (had) the right class or classes), but it’s a matter of fitting the individual to the requirements. It’s akin to being interviewed for a job, for the employer doing a successful ‘type-check’ is a subtle process, and often leads to a probation period (i.e. testing).

Strong-typing advocates will no doubt see this as a flaw in dynamic systems. I see it as a strength. If you think about it honestly, there are a wide variety of requirements which aren’t expressed in strongly typed language, either because the type system doesn’t choose to express subsets of types, such as positive numbers, prime numbers, even numbers, etc.; or because state-oriented requirements such as an open file, don’t really fit into the notion of a static type.

In my experience, I’ve gotten to the point where I rarely make the kind of errors which are caught by static types, and the flexibility of dynamic languages far outweighs the relatively minor benefit of early static type checking. I’d rather rely on a growing set of test cases to fight bugs.


Comments

  1. Rob Biedenharn about 17 hours later:

    But you’re just falling into the same trap with your @Range@. If you really want to work for all enumerables, then you need to do something which only depends on @Enumerable#each@ so any enumerable object will work with @each_cycle@. You could use @each_with_index@ (which is provided by @Enumerable@) like:

    module Enumerable
      def each_cycle(window, start=0)
        wrap_start = []
        cache = []
        each_with_index do |e,i|
          cache << e
          if i >= start + (window - 1)
            yield cache[start, window]
            cache.shift
          else
            wrap_start << e
          end
        end
        wrap_start.each do |e|
          cache << e
          yield cache[start, window]
          cache.shift
        end
      self
      end
    end
    

    Then you leave it up to the @Enumerable@ contract. If you want to add something to @Array@, then by all means open the class and code away. However, if you want to do something for all classes that are @Enumerable@, then adjusting individual classes is always going to risk having a new class come along that breaks.

  2. Rick DeNatale about 23 hours later:

    I’m afraid that you missed my point.

    I wasn’t trying to show how to add this for all enumerables, but that duck types are more subtle than just inheritance hierarchies or sets of messages to be responded to.

  3. Rob Biedenharn 8 days later:

    I get your point, but when you say:

    bq. Is it a duck?
    Okay, now that we’ve done that, let’s try some enumerables:

    You imply that you expect that all things enumerable should work (or that it is desirable for them to). My point is that you need to understand the kind of feathers your ducks have. If you start running around and adding behavior to all the classes that include a given module (like @Enumerable@ in this case), then you might as well go straight for the module itself and take advantage of the contract that already exists.

    For something to be @Enumerable@, it has to have an @each@ method. When you are looking at @Array#each_cycle@, depending on other instance methods of an Array is fine, but as soon as the focus shifts to @Enumerable#each_cycle@, the semantics of @[]@ and @length@ need to be reconsidered.

    I think you do a good job of highlighting the pitfalls of blindly duck-typing, but your conclusion seems to fall a bit short. Perhaps in addition to how an object responds to a set of methods, you also need to consider why it’s the particular set of methods in the first place.