Dirty Rotten Scoundrels

Posted by Rick DeNatale Wed, 03 Feb 2010 17:45:32 GMT

I had an interesting debugging session the other day.

The Rails app I'm working on for a client uses oauth to get user info from one of the popular professional social networking sites.

In order to keep from hitting that site with an oauth request too often, yesterday I worked on a story to add throttling. The client wanted to refresh the data no more often than an application configurable refresh period.

Easy Peasy

Easy enough! I already had a model responsible for keeping the oauth token and returning the relevant data. I added a text field to the model where I could cache the data as a serialized object. Then I added code to the method which returned the data which checked to see if the refresh period had elapsed since the record was updated, and if so, make the oauth request to refresh the data, and save the updated record.

Hang on!

Then I started trying it out, and I got a surprise. The first time I asked for the data, it got fetched via oauth, and the model got saved. I asked again, expecting that the cached data would be returned, but once again the oauth request went out.

After playing with it some more in script/console, and adding some debugging printouts, I finally realized that the problem was that the updated_at field in the record wasn't changing. I saw that the sql query generated by the save was inserting the data value, but not a new value for updated_at, strange!

Alternatives

I'd been using the update_attribute method to change the data value. Just for giggles, I decided to see if setting the value and then explicitly saving the record would fix it. Nope! Exactly the same SQL was generated, with updated_at left unchanged.

Digging deeper

This is the point where I resort to reading code. This project has Rails vendored, so I had no qualms about inserting some debugging alterations to the code. Actually I'm not immune to temporarily monkeying with non-vendored gems to debug arcane stuff like this either.

So the first thing was to find the code in ActiveRecord which writes the updated_at attribute. I found it in activerecord/lib/activerecord/timestamp.rb

  def update_with_timestamps(*args) #:nodoc:
    if record_timestamps && (!partial_updates? || changed?)
      current_time = current_time_from_proper_timezone

      write_attribute('updated_at', current_time) if respond_to?(:updated_at)
      write_attribute('updated_on', current_time) if respond_to?(:updated_on)
    end

    update_without_timestamps(*args)
  end

Those familiar with Rails conventions will realize this as an override to the ActiveRecord::Base#update method using alias_method_chain.

So, in order for the timestamps to be written, record_timestamps has to be truthy, partial_updates? has to be falsy or changed? has to be truthy.

Now record_timestamps is a class_inheritable attribute on ActiveRecord::Base, which allows timestamp recording to be turned on or off for all or particular ActiveRecord classes, and that was true by default. The method partial_updates? determines whether or not the partial update feature has been turned on or off for the model, again this is true by default

This leaves changed? a method added by dirty tracking which indicates whether anything in the record has changed and therefore needs to be written back to the database as part of the update.

So facing this code, I inserted a puts to show me what those three values were.

And the culprit turned out to be the result of the changed? method, which was returning false!

Active Record Dirty Attribute Tracking

The partial update feature, along with. dirty attribute tracking, was introduced in Rails 2.1. Dirty attribute tracking keeps track of the value of attributes when a model is fetched, and partial update only writes changed attributes to minimize the amount of data send to the SQL adapter. They use an instance variable @changed_attributes, a hash mapping the name of any attributes whose value has been changed maintained by the write_attribute method. The changed? method indicates whether any of the attributes have changed. There's also a will_change method which alerts ActiveRecord that a an attribute might change directly, say with something like Book.title.capitalize!

So in this case my model fetched the data via an oauth request, and since the value hadn't changed, dirty tracking determined that the model object hadn't changed, so now updated_at timestamp was written.

I'd run afoul of dirty tracking/partial update before. Back in 2008, I got hired by a startup with a sizable Rails app they'd been working on, which used lots of ActiveRecord callbacks and observers. One of my assignments was to upgrade it from Rails 2.0 to 2.1, but the introduction of partial update made some subtle changes to things like the ordering of those callbacks and observed events which badly broke the application. The CTO quickly came to the conclusion that the upgrade was going to cost more than he deemed it to be worth. So, in our case, the jump from Rails 2.0 to 2.1 was a much bigger one than from Rails 1.x to 2.0.

The Fix

In this case, the fix was considerably simpler, use update_attributes(:data => the_date, :updated_at => Time.now). This worked and ensured that the updated_at attribute gets updated whenever the data is fetched.

One More Little Mystery

After fixing this, I moved on, but one little thing nagged at me. Since the data attribute hadn't actually changed its value, why hadn't the partial update code kept it from being written in the first place?

So I went back and looked a little harder at the active record code and found this:

  def update_with_dirty
    if partial_updates?
      # Serialized attributes should always be written in case they've been
      # changed in place.
      update_without_dirty(changed | (attributes.keys & self.class.serialized_attributes.keys))
    else
      update_without_dirty
    end
  end

Since the data attribute was serialized, it always gets written!

So Is This a Bug in ActiveRecord?

Actually I could make the case that there's one of two bugs here:

  1. Serialized attributes shouldn't be written under partial update if they haven't actually changed, or
  2. The updated_at attribute should be changed if there are any serialized attributes even if the values of those serialized attributes haven't changed.

To be honest, I'm not sure which if either of these is a bug. In my use case, I'd prefer that AR used option 2. On the other hand I can see cases where it would be preferable if updated_at reflected the last time the record actually changed.

On the other hand, I'm not sure that the code right now updates updated_at if the only change to a record is in serialized attributes at least one of whose values has changed.

But, I've got other things to worry about right now, so I guess I'll follow Scarlett O'Hara's advice and "think about that tomorrow!"


Holy CoW!

Posted by Rick DeNatale Wed, 16 Dec 2009 14:38:00 GMT

Last week the Phusion guys, gave a talk at Google about the implementation of Ruby Enterprise Edition, and it's now available on YouTube.

It's good stuff if you are a VM geek, like me. The cover two major topics

  1. How they made the Ruby 1.8 garbage collector Copy on Write (CoW) friendly, which allows multiple Ruby processes to share memory for unchanged objects. This greatly reduces the footprint of a typical Rails Deployment for example. They talks about a series of attempts to change the Ruby GC to move the mark bits from the objects themselves to a separate memory structure. The result is significant memory savings, a slightly slower GC, but overall faster Ruby performance because they replaced malloc with a faster allocator (from Google apparently) which more than offsets the GC performance
  2. How a contribution from the Event Machine developers improves thread context switching for Ruby 1.8's user space threads. Ruby 1.8 handles thread context switches by copying the execution stack to and from the heap, which can result in significant time being taken up by memcopy calls. Ruby Enterprise edition now has an optional feature which instead switches the base stack pointer on a thread switch. This is processor specific, and is only available right now for Intel 32 and 64 bit processors.

So if this kind of stuff interests you, I'd recommend spending the half hour or so that it takes to watch the video


Snow Leopard, remapping Caps-Lock to Ctrl

Posted by Rick DeNatale Tue, 24 Nov 2009 16:25:00 GMT
I'd just noticed the other day that my MacBook was no longer mapping the caps-lock key to control. Since I, always one to jump on the latest fad, had just been playing with zsh, I thought it was involved. However, it turned out to be a little Snow Leopard Feature. The Keyboard preference pane "modifier keys" dialog now has a drop-down selection list for which keyboard, and you now have different settings for each keyboard. In other words for a laptop you can have different settings for the internal keyboard and an external keyboard (maybe even different for a usb vs. bluetooth external keyboard) Not only that but it would appear that the settings for the internal keyboard can be different depending on whether or not the external keyboard is attached.

Site Upgrade

Posted by Rick DeNatale Fri, 13 Nov 2009 02:22:00 GMT
Just a brief article to announce that I've just upgraded this blog to Typo 5.3 to 5.2. I'd run into troubles with upgrading back when my development machine was running Leopard. I just got around to trying it again, since I'd completely rebuild ruby/gems etc. I'm not sure it will, but I've got my fingers crossed that this upgrade will solve the intermittent problem of the rss/atom feeds occasionally going berzerk and ignoring article boundaries.

Karmic Koala, Bind9, and Apparmor

Posted by Rick DeNatale Mon, 02 Nov 2009 01:36:00 GMT

Today was the day to upgrade the server running this blog to Ubuntu 9.10 "Karmic Koala".

Late in the upgrade, during the process of installing and configuring the Debian packages, the upgrade hung. I had just tried to stop bind9 and given an series of error messages indicating that rndc was unable to communicate with bind9.

After waiting a while I hit control-c and was warned that this might leave the system in an inconsistent state, but I figured I had little to lose, so I clicked ok. The upgrade continued, but didn't reboot, saying that there had been a problem

I tried to start bind9 manually, both with rndc start, and /etc/init.d/bind9 start but only got bad news.

So I then tried to start it while tailing the syslog, and noticed that I was getting an error about a failure to open /etc/ssl/openssl.cnf for read. This was strange since the permissions on that file allowed reading by anyone.

A little googling eventually revealed that I was running afoul of AppArmor, something I hadn't encountered before. It turns out that Ubuntu doesn't automatically include an AppArmor profile for bind9, not sure why.

The solution seems to have been to manually install the apparmor-profile package with apt-get.

After verifying that I could manually start bind9 and talk to it with rndc, I rebooted, and as they say "so far so good."


It's the Little Things

Posted by Rick DeNatale Tue, 27 Oct 2009 14:26:00 GMT
$ rvm use 1.8.7
$ ruby -e'puts [1, "a"].to_s'
1a
 $ rvm use 1.9
$ ruby -e'puts [1, "a"].to_s'
[1, "a"]

I've been working on converting my client's Rails App to Ruby 1.9.

It's been fairly painless, but there have been a few stumbling blocks. One has been the use of arrays within string interpolations, and the difference in the result

Prior to Ruby 1.9 Array#to_s was a synonym for Array.join which resulted in the concatenation of the results of sending to_s to each element. In Ruby 1.9 Array#to_s is the same as Array#inspect.

This has had a tendency to produce subtle problems which are at time hard to track down. Once they are found, the solution is to do something like changing:

"Whatever #{some_array}"

to:

"Whatever #{some_array.join}"

A Rubyist's Guide to Dating

Posted by Rick DeNatale Mon, 19 Oct 2009 13:46:00 GMT

I'm speaking about the "RiCal experience" tomorrow evening 20 October 2009, at the October meeting of the Raleigh Ruby Brigade. As usual the meeting is at Red Hat HQ in Raleigh and starts at 7:00 pm.

I plan to talk about calendars, time, time zones, camels, icalendar, Ruby implementation of complex "stuff", among other things.

I hope to see you there!


Oh the Irony

Posted by Rick DeNatale Sat, 10 Oct 2009 17:18:00 GMT
PinkMugFront

There's a bit of a buzz on the interwebs, that Microsoft's project to compete with the iPhone is having some problems.

What I find ironic is that they decided to call this project "Pink". Apple once had a project with the same name, which was to be the 'new' operating system for the Macintosh, as a replacement for System 7. Pink was to be an operating system built around an application framework which was to provide the API. This was a popular 'strategic' idea in those days, Apple, IBM and, yes even Microsoft were pursuing it.

Apple's Pink attempted to take the ideas from MacApp which had just made the transition from Object Pascal to C++, and grow them into an operating system. Larry Tesler had advocated this and John Sculley, then Apple's presidency during the "inter-regnum" period, had sunk lots of money into the project which was headquartered in the original building where the Macintosh had been developed, when it was the lair of the "Dread Pirate" Job's merry band. The project had gotten bogged down, and Sculley was on the verge of killing it, unless other 'investors' could be found. Since IBM was enamored with such things back then, a task-force was dispatched to Cupertino to look at Pink. I was tapped for the task-force because of my expertise in Object Oriented programming and visibility due to my connections with VisualAge and Smalltalk, Despite whatever technical misgivings which might have resulted, in other words they didn't listen to me, IBM bought in. The result was an alliance which formed two jointly companies which were jointly owned by IBM and Apple, Kaleida which focused on multi-media, and Taligent led by IBM Executive Joe Gugliemi, where Pink died a slow death. The longest term impact of the alliance was that it got Apple to agree to switching from the Motorola 68K family of microprocessors, which had been used in Macs until then, to the IBM PowerPC. But, since Motorola was a large IBM customer, IBM licensed the PowerPC to them, so that most Macs used IBM architected chips actually produced by Motorola until the "Dread" Pirate came back and eventually switched to using Intel chips.

Of course Microsoft's "Pink" has no relation to Apple's other than the name, but I can't help but finding this all ironic.


Total World Domination through Storyboards, Wire-Framing, and Balsamiq Mockup

Posted by Rick DeNatale Sun, 04 Oct 2009 22:25:00 GMT

I admit it, I'm a much better programmer than graphics designer.

But getting the overall UI structure and flow of an application "right" requires a certain amount of "right brain" activity. And one of the best approaches is to create a "wire-frame" story board.

I've worked in situations where the "chief designer" would hand out web pages carefully crafted in PhotoShop, and ask for "pixel-perfect" renditions. UI discussions would be held using these images.

The problem is that this can lead to quite a bit of yak shaving, particularly of the bikeshedding variety.

I'm a big movie fan, in my dreams I've wanted to be a movie director if I couldn't be a programmer, or a Rock and Roll star, and among my heroes I count Hitchcock, Scorsese, Peter Jackson, and many others. I think that designing user interfaces is quite similar to designing movies. As someone who has watched a lot of that supplemental material on DVDs of recent an not so recent movies, I know how often directors use story boards to work out the flow and pace of a movie.

Storyboards are a rapid prototyping technique. It's a lot cheaper to work with easily replaceable "low-fidelity" artists drawings than to shoot the actual movie. Ideas can be discussed with little regard to the cost of the story-boarding.

Henchmen Small

User Interface design can, and should make use of a similar technique often called wire-framing, in which the sequence of presentations, the "scenes" if you will, which will take the form of things like web pages and dialog boxes, are sketched out and annotated as to what actions cause which transitions.

A few weeks ago I became aware of a product called Balsamiq Mockups which does a nice job of wire-framing.

Mockups exists both as an online web application and as an Adobe Air desktop version.

One way to think of Mockups is as a specialized version of PowerPoint or Keynote, although that's missing the point a bit. The similarity is that you can create storyboard pages, which are similar to pages in a presentation. There are some differences that make it a specialized wire-framing tool. First Mockups comes with lots of built-in user-interface components for web applications, desktop applications, and even iPhone apps. Second, rather than presenting the pages sequentially, you set up transitions between the pages based on user interactions, like clicking on buttons.

I find the cartoonish feeling of the pages quite nice. They get the idea across without inviting unnecessary quibbling about pixel-perfection. This is a tool for getting the big ideas down, after that a competent web-designer can polish things.

I've been looking for something like this for a while. I know quite a few Rails developers and designers who are pretty pumped up about this tool. It's well worth having a look.


RSpec/Textmate Pro-Tip

Posted by Rick DeNatale Wed, 23 Sep 2009 12:34:00 GMT

I often need to output debugging info in specs when something just isn't working. No sweat, just put in a puts or two, and run the spec.

Of course I'm usually doing this in Textmate using the RSpec bundle, which displays the results in a nice window formatted with html.

RSpecFormattedResults

But that causes problems with my debugging output

Bad Puts Format

The problem is that the puts output is just raw text so the HTML formatter ignores line breaks.

At one of the Radiant hack fests, either John Long or Adam Williams taught this old dog a simple new trick. The suggestion was to define a new Kernel#rputs method as a wrapper for puts which 'wraps' the arguments in an html <pre> tag. Then just call rputs (the r is for rspec of course) in the spec and or code under test.

Good Rputs Format

This works great as long as you run the specs under RSpec. But if you run them in the shell you see something like this:

$ spec spec/ri_cal/bugreports_spec.rb <pre> Booga booga </pre> <pre> this should be another line </pre> ....... Finished in 0.525324 seconds 7 examples, 0 failures

I've been putting up with this minor annoyance for a while, but this morning it struck me that I could define rputs conditionally depending on whether the spec was being run in Textmate or not. The trick is that Textmate defines one or more environment variables whose name starts with 'TM_'. So, without further ado, here's the new fragment from my spec_helper.rb:

  module Kernel
    if ENV.keys.find {|env_var| env_var.match(/^TM_/)}
      def rputs(*args)
        puts( *["<pre>", args.collect {|a| CGI.escapeHTML(a.to_s)}, "</pre>"])
      end
    else
      alias_method :rputs, :puts
    end
  end

With this change, running in the shell now looks like this:

$ spec spec/ri_cal/bugreports_spec.rb Booga booga this should be another line ....... Finished in 0.348086 seconds 7 examples, 0 failures

A nice refinement to my bag of tricks!

Update 27 Sept 2009

One slight change. I notice that String#start_with? only exists in Ruby 1.8.7, 1.9 and is apparently also defined the activesupport gem for <1.8.7. I changed the spec helper to use a regular expression match instead which works in all cases.