In Ruby, Globals Aren't Always Global
Posted by Rick DeNatale Mon, 17 Nov 2008 20:00:00 GMT
Ola Bini just wrote about how he’s planning to have a special variable to expose the result of a regular expression match in his new language Ioke.
The idea is to introduce a special variable called it, which holds the result of the conditional expression in an if or unless statement (or method, in Ioke’s case). Ola says that this is different from the special Ruby globals like $~ because: “For one, it’s not a global variable. It’s only available inside the lexical context of an ‘if’ or ‘unless’ method. It’s also a lexical variable, meaning it can be captured by a closure. And finally, it’s a general solution to more things than the regular expression problem.”
As it turns out, I learned at RubyConf that the Ruby ‘globals’ which give access to the results of the last regular expression match aren’t really globals at all. This came up in a conversation after the MagLev presentation, when Charlie Nutter asked Allen Otis of Gemstone how they were handling those variables, since they are really frame locals. Matz was standing there and quickly confirmed Charlie’s observation.
So $~ and its friends, like $1, are really more like special names for local variables. Other than the lexical ‘look’, they behave more like locals than globals.
I just concocted a completely bogus Ruby program to demonstrate this.
def lambdas(str, re)
  re.match(str)  # sets $~ in the frame of this method call
  # First lambda reads that frame's $~; second one matches again in the same frame.
  [lambda { $~ }, lambda { |str| re.match(str) }]
end

m_lambda, match_lambda = *lambdas("this", /(this|that)/)
puts "(m_lambda.call)[1] is #{(m_lambda.call)[1].inspect}"
/(foo)/.match("foo")  # sets $~ in the outer (top-level) frame
puts "$~[1] is #{$~[1].inspect}"
puts "m_lambda still is #{(m_lambda.call)[1].inspect}"
match_lambda.call('that')
puts "now m_lambda is #{(m_lambda.call)[1].inspect}"
puts "$~[1] is still #{$~[1].inspect}"

The lambdas method returns two lambdas. The first simply returns the current value of $~ when evaluated; the second matches a new string against the regular expression passed into the lambdas method, on demand.
When run, this produces the following output:
(m_lambda.call)[1] is "this"
$~[1] is "foo"
m_lambda still is "this"
now m_lambda is "that"
$~[1] is still "foo"

Note that $~ in the outer context is a different variable than $~ in the context of the invocation of the lambdas method. Let’s call these the outer and inner $~ variables respectively. Doing a regular expression match in the outer context leaves the inner $~ unchanged, while calling the second lambda affects the value of the inner $~ but leaves the outer $~ unchanged.
As they say, you learn something new every day.

Of course, the one defining fact of a global variable is that it’s available everywhere – which the $-variables are. They shift value, and the frames where the value shifts can be captured, but you can refer to them anywhere (and get nil), just like all Ruby globals.
Note also that there are some thread-local globals too. The “last error” global, $!, is one such case, showing the last error raised in the current thread.
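A minimal sketch of that thread-local behaviour, assuming a Ruby with the standard Thread and Queue classes. The two helper methods are hypothetical names made up for illustration: one reads $! inside a thread's rescue clause, the other reads the caller's $! while a second thread is still busy rescuing.

```ruby
require 'thread' # for Queue; harmless on Rubies where Queue is built in

# Hypothetical helper: what $! looks like inside a thread's rescue clause.
def error_seen_inside_thread
  Thread.new do
    begin
      raise ArgumentError, "boom"
    rescue ArgumentError
      $!.message          # within this thread, $! is the error being handled
    end
  end.value
end

# Hypothetical helper: the caller's $! while another thread is rescuing.
def error_seen_by_caller
  in_rescue = Queue.new
  release   = Queue.new
  worker = Thread.new do
    begin
      raise ArgumentError, "boom"
    rescue ArgumentError
      in_rescue << true   # tell the caller we are inside the rescue clause
      release.pop         # stay there until the caller has read its own $!
    end
  end
  in_rescue.pop
  seen = $!               # the caller's own $!, not the worker's
  release << true
  worker.join
  seen
end

puts error_seen_inside_thread.inspect
puts error_seen_by_caller.inspect
```

As long as the calling thread is not itself handling an exception, the second helper should see nil even though the worker thread has a live $! at that moment.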
I’ve managed to get my brain to accept non-global globals by defining them as “globally accessible” rather than “globally the same”. They’re always present and always retrievable, but not necessarily globally scoped.
In Lisp, variables like Ioke’s “it” are produced by Anaphoric macros:
http://www.bookshelf.jp/texi/onlisp/onlisp_15.html
Ola, and Charles,
I don’t think that I can agree that a name which is universally referenceable is a global unless it denotes the same reference.
I think of these frame local and thread local ‘globals’ as being more pseudo-variables than globals.
Sounds like they’re dynamically scoped. Dynamically scoped symbols can get a new binding in a function and the new binding is what is visible in functions called by that function – but it does not change the binding for that symbol in functions not nested in the call chain.
Dynamic scoping contrasts with static or lexical scoping, which is what most languages use: in lexical scoping the binding of a symbol is determined entirely by its position in source code.
Old Lisps (prior to Scheme and Common Lisp) were dynamically scoped. In Common Lisp and Perl, variables are optionally dynamically scoped, but lexical scoping is usually preferred.
http://en.wikipedia.org/wiki/Scope_(programming)
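For readers who have not met dynamic scoping, here is a rough emulation in Ruby. As far as I know Ruby has no dynamically scoped variables of its own, so this sketch fakes them with an explicit stack of bindings. DYNAMIC_BINDINGS, with_dynamic, dynamic, inner and outer are all made-up names for illustration, not part of any library.

```ruby
# Hypothetical emulation of dynamic scoping with a stack of bindings per name.
DYNAMIC_BINDINGS = Hash.new { |hash, name| hash[name] = [] }

# Bind name to value for the duration of the block, then restore.
def with_dynamic(name, value)
  DYNAMIC_BINDINGS[name].push(value)
  yield
ensure
  DYNAMIC_BINDINGS[name].pop
end

# Look up the most recent binding made anywhere up the call chain.
def dynamic(name)
  DYNAMIC_BINDINGS[name].last
end

def inner
  dynamic(:greeting)        # no binding of its own; relies on its callers
end

def outer
  with_dynamic(:greeting, "hello") { inner }
end

p outer   # inner is called from inside outer's binding
p inner   # inner is called with no binding in the call chain
```

The point of the sketch is that what inner sees depends on who called it, not on where it sits in the source text.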
James,
No, this is different. Consider the following Ruby program:
Which when run produces:
If $~ were dynamically scoped, then when the reference in the bar method was encountered, the current activation frame would be searched for a binding and, when none was found, the calling activation frame (which is an activation of the foo method) would be searched next, finding that MatchData.
So I’d say that $~ and its friends are lexically scoped, and they aren’t global because, although the name of the variable is accessible ‘globally’, the binding of the name to a value is determined lexically.
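A minimal sketch of the kind of program this argument describes. Only the method names foo and bar come from the explanation above; the bodies are assumptions chosen to make the point, with foo doing a match and bar merely reading $~.

```ruby
def bar
  $~                       # bar has not matched anything in its own frame
end

def foo
  /(foo)/.match("foo")     # sets $~ in foo's frame
  [$~[1], bar]             # foo's own view of $~, then what bar sees
end

p foo
```

With dynamic scoping, bar would have found the MatchData set in foo. Here foo sees its own match while bar gets nil.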
They seem to behave much like normal Ruby local variables, except that they can be read without first being set.
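A small sketch of that difference, using hypothetical method names. The identifier never_assigned is deliberately neither a local variable nor a method.

```ruby
def read_unset_special
  $~                        # allowed even though nothing has matched here
end

def read_unset_local
  never_assigned            # hypothetical name, never assigned or defined
rescue NameError => error
  error.class               # report which error class Ruby raised
end

p read_unset_special
p read_unset_local
```

Reading the unset $~ quietly gives nil, while reading an unset ordinary name raises NameError.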
I stand corrected. It’s more like there’s an implicit definition “$~ = nil” at the top of every program AND at the top of every named function. The top-level binding of $~ is “global” in extent (lifetime), but it’s lexically masked by the implicit bindings in functions.
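A rough sketch of that mental model, with made-up method names. It also shows why “named function” matters: a block does not get a fresh $~ but shares the one belonging to the method around it, which is what the lambdas in the post rely on.

```ruby
def fresh_frame?
  $~.nil?                        # a method body starts with no match recorded
end

def block_shares_frame
  [1].each { /(x)/.match("x") }  # the match happens inside a block
  $~ && $~[1]                    # and is visible in the enclosing method
end

/(outer)/.match("outer")         # record a match in the calling scope first
p fresh_frame?
p block_shares_frame
```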