Robust Software

Tales of a code samurai

Ruby 2 - Module#prepend

As you may know, Ruby 2.0.0 has been released. Despite the major version number it is mostly an incremental release, although it does include a few breaking changes, so the major version bump is warranted. Whilst the usefulness of most of the new features was obvious to me, I couldn’t say the same of Module#prepend.

Then, whilst I was listening to the Ruby Rogues podcast on Ruby 2, one of the Rogues (I think it was Josh Susser) mentioned memoization as a case where it would make a difference. So I thought I would implement memoization in 1.9.3, and then in 2.0.0 using Module#prepend, and see what came out of it. The process led to a personal “ah-ha” moment and so I thought I’d share what I found.

Method dispatch

First, I think it’s worth a little diversion into how method dispatch works in Ruby to understand what Module#prepend allows you to do that you couldn’t do before.

Every class in Ruby has an ancestor chain. You can find out what it is at any point by calling .ancestors on the class:

class Example
end

Example.ancestors
# => [Example, Object, Kernel, BasicObject]

Whenever you call a method in Ruby, the ancestor chain is traversed looking for a matching method to invoke. Ruby walks the chain from the beginning and stops at the first match.
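The traversal can be imitated by hand. This is only a rough sketch: find_owner and Greeter are hypothetical names of mine, and the helper only considers public and protected methods, whereas Ruby's real lookup handles more cases than this.

```ruby
# Hypothetical helper: walk the ancestor chain by hand and return
# the first class or module that defines the method itself.
def find_owner(klass, name)
  klass.ancestors.find do |ancestor|
    ancestor.instance_methods(false).include?(name)
  end
end

# Hypothetical class, used only to illustrate the lookup.
class Greeter
  def hello
    "hello"
  end
end

# hello is defined on Greeter, the first entry in the chain.
puts find_owner(Greeter, :hello)   # Greeter

# Greeter does not define to_s, so the search carries on along the chain.
puts find_owner(Greeter, :to_s)

# Ruby's own lookup reports the same owner.
puts Greeter.instance_method(:to_s).owner
```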

Modules fit into this story because when you use Module#include, the module is added to the ancestor chain directly after the class it was included in:

module After
end

class Example
  include After
end

Example.ancestors
# => [Example, After, Object, Kernel, BasicObject]
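To see what that ordering means for dispatch, here is a small sketch that gives both the class and the module a method with the same name. The greet method is my own addition for illustration.

```ruby
module After
  def greet
    "After#greet"
  end
end

class Example
  include After

  def greet
    # Example comes first in the chain, so this method is found first.
    # super continues the lookup at the next entry, which is After.
    "Example#greet, then " + super
  end
end

puts Example.new.greet
# Example#greet, then After#greet
```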

What Module#prepend allows us to do is insert a module in front of the class it was prepended to:

module Before
end

class Example
  prepend Before
end

Example.ancestors
# => [Before, Example, Object, Kernel, BasicObject]
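The practical effect is that the module's method now runs first, and super reaches the class's own method. A small sketch, again with a greet method added purely for illustration:

```ruby
module Before
  def greet
    # Before is first in the chain, so this method is found first.
    # super continues the lookup at the next entry, which is Example.
    "Before#greet, then " + super
  end
end

class Example
  prepend Before

  def greet
    "Example#greet"
  end
end

puts Example.new.greet
# Before#greet, then Example#greet
```

This is the property the memoization example relies on: the prepended module can wrap a method on the class without the class's own definition changing.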

On its own that’s interesting, but what is it good for?

Memoization example

Memoization is a technique whereby you cache the result of a (usually expensive) function so that the result can be returned for subsequent calls without having to calculate it again. Sometimes memoization is built into the implementation of a method; sometimes it is implemented using the interceptor pattern, so that neither the caller nor the implementation is aware that memoization is happening. We’ll be using the latter type of implementation within our example.
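For contrast, this is roughly what the first, built-in style looks like. It is a minimal sketch using a hypothetical Circle class and a call counter of my own; note that ||= will recalculate if the cached value is nil or false.

```ruby
# Hypothetical class with memoization written into the method itself.
class Circle
  attr_reader :calculations

  def initialize(radius)
    @radius = radius
    @calculations = 0
  end

  def area
    # ||= only evaluates the right-hand side while @area is nil or
    # false, so the calculation happens on the first call only.
    @area ||= begin
      @calculations += 1
      Math::PI * @radius**2
    end
  end
end

circle = Circle.new(2)
3.times { circle.area }
puts circle.calculations   # 1
```

Here the method knows it is being memoized. With the interceptor style used in the rest of this post, the method body stays exactly as it was.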

There are already several implementations of this technique in Ruby, including the Memoist gem that was extracted from ActiveSupport::Memoizable, which used to be part of Rails. The problem with these packaged implementations, for our purposes, is that they tend to be quite hard to read because they are highly optimized and do things such as guarding against overwriting methods unexpectedly.

I’ll be creating a simple implementation to demonstrate the concept of memoization and the effect Module#prepend has on an implementation. This code is not intended for production use!

To justify memoization, we need an expensive call to make. We’ll use the same class and method for both versions of Ruby and their memoization implementations:

class Universe

  def meaning_of_life
    sleep 1
    42
  end

end

Calling Universe#meaning_of_life once takes just over 1 second, and calling it 5 times takes just over 5 seconds. This is a prime example of a case where you might want to memoize the result.

For both implementations we will augment the Universe class by memoizing the #meaning_of_life function through the use of a Memoize module and its memoize method which will be implemented differently in each version of Ruby:

class Universe
  extend Memoize

  def meaning_of_life
    sleep 1
    42
  end

  memoize :meaning_of_life

end

This will mean that calling Universe#meaning_of_life once takes just over 1 second, and calling it 5 times on the same instance also takes just over 1 second, because the result of the first call is memoized.

Ruby 1.9

The way Memoize is implemented in Ruby 1.9 is to rename the true implementation of the method and replace it with a method that calls the true implementation for the first call, and then returns the result of that first call for all subsequent calls.

This looks something like:

module Memoize

  def memoize(method)
    # Work out what to rename the true implementation to
    unmemoized_name = :"__unmemoized_#{method}"
    # Create an alias to the true implementation for the
    # new name
    alias_method unmemoized_name, method

    # Overwrite the true implementation with a memoizing
    # version
    define_method method do
      # Ensure we have a place to store the result in
      # case we memoize multiple methods
      @__memoized_results ||= {}

      if @__memoized_results.include? method
        # If we've already calculated the result of this
        # function, return it
        @__memoized_results[method]
      else
        # Otherwise calculate the result by calling the
        # original implementation and store it for
        # future calls
        @__memoized_results[method] = send(unmemoized_name)
      end
    end
  end

end
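To check that this behaves as described, here is a self-contained sketch that repeats the module without comments and times five calls. I've shortened the sleep to 0.1 seconds so it runs quickly; that change is mine and not part of the example above.

```ruby
require "benchmark"

# Uncommented copy of the alias-based Memoize module above.
module Memoize
  def memoize(method)
    unmemoized_name = :"__unmemoized_#{method}"
    alias_method unmemoized_name, method

    define_method method do
      @__memoized_results ||= {}

      if @__memoized_results.include? method
        @__memoized_results[method]
      else
        @__memoized_results[method] = send(unmemoized_name)
      end
    end
  end
end

class Universe
  extend Memoize

  def meaning_of_life
    sleep 0.1 # shortened from 1 second so the sketch runs quickly
    42
  end

  memoize :meaning_of_life
end

universe = Universe.new
elapsed = Benchmark.realtime { 5.times { universe.meaning_of_life } }

# Only the first call sleeps, so this is roughly 0.1 rather than 0.5.
puts format("5 calls took %.2f seconds", elapsed)

# The original body is still there, under its new name.
puts Universe.instance_methods(false).sort.inspect
# [:__unmemoized_meaning_of_life, :meaning_of_life]
```

The second line of output shows the cost of this approach: the class has gained an extra public method that exists only to support the memoization.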

If we wrote out by hand what the Universe class looks like after this bit of meta-programming, it would be:

class Universe

  def meaning_of_life
    @__memoized_results ||= {}

    if @__memoized_results.include? :meaning_of_life
      @__memoized_results[:meaning_of_life]
    else
      @__memoized_results[:meaning_of_life] = __unmemoized_meaning_of_life
    end
  end

  def __unmemoized_meaning_of_life
    sleep 1
    42
  end

end

Ruby 2.0

Whilst the same implementation used for 1.9 would work in 2.0, due to Module#prepend we have a second option involving anonymous modules.

Rather than renaming the method, we can create an anonymous module that has a method with the same name. This module can then be prepended to the class so that it comes before the real implementation in the ancestor chain. The method on the module can then call the true implementation of the method via super for the first call, and then return the result of that first call for all subsequent calls.

This looks something like this:

module Memoize

  def memoize(method)
    # Create an anonymous module
    memoizer = Module.new do

      # Define a method in the module with the same name
      define_method method do
        # Ensure we have a place to store the result in
        # case we memoize multiple methods
        @__memoized_results ||= {}

        if @__memoized_results.include? method
          # If we've already calculated the result of
          # this function, return it
          @__memoized_results[method]
        else
          # Otherwise calculate the result by calling
          # the original implementation and store it for
          # future calls
          @__memoized_results[method] = super()
        end
      end

    end

    # Prepend the anonymous module to the class so that
    # its method is called first
    prepend memoizer
  end

end
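Here is a self-contained sketch to try this out. It repeats the module without comments and, instead of sleeping, counts how often the real method runs. The calls counter and the initialize method are my own additions for illustration.

```ruby
# Uncommented copy of the prepend-based Memoize module above.
module Memoize
  def memoize(method)
    memoizer = Module.new do
      define_method method do
        @__memoized_results ||= {}

        if @__memoized_results.include? method
          @__memoized_results[method]
        else
          @__memoized_results[method] = super()
        end
      end
    end

    prepend memoizer
  end
end

class Universe
  extend Memoize

  attr_reader :calls # hypothetical counter, not in the original class

  def initialize
    @calls = 0
  end

  def meaning_of_life
    @calls += 1 # count how often the real implementation runs
    42
  end

  memoize :meaning_of_life
end

universe = Universe.new
5.times { universe.meaning_of_life }
puts universe.calls # 1

# The anonymous module sits in front of Universe in the chain.
puts Universe.ancestors.first(2).inspect

# No renamed copy of the method has been added to the class.
puts Universe.instance_methods.grep(/unmemoized/).inspect # []
```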

If we wrote out by hand what the Universe class looks like after this bit of meta-programming, with the anonymous module given a name for readability, it would be:

module AnonymousModule

  def meaning_of_life
    @__memoized_results ||= {}

    if @__memoized_results.include? :meaning_of_life
      @__memoized_results[:meaning_of_life]
    else
      @__memoized_results[:meaning_of_life] = super()
    end
  end

end

class Universe
  prepend AnonymousModule

  def meaning_of_life
    sleep 1
    42
  end

end

Comparison

In terms of lines of code, the two implementations are virtually the same. In fact, their structure is virtually identical (whitespace ignored and comments removed):

--- memo19.rb
+++ memo20.rb
@@ -1,8 +1,7 @@
 module Memoize

   def memoize(method)
-    unmemoized_name = :"__unmemoized_#{method}"
-    alias_method unmemoized_name, method
+    memoizer = Module.new do

     define_method method do
       @__memoized_results ||= {}
@@ -10,9 +9,13 @@
       if @__memoized_results.include? method
         @__memoized_results[method]
       else
-        @__memoized_results[method] = send(unmemoized_name)
+          @__memoized_results[method] = super()
       end
     end
+
+    end
+
+    prepend memoizer
   end

 end

I believe the choice between the two comes down to taste, and I prefer the Module#prepend approach. The reason for this is that it feels like the Universe class has been tampered with less.

In the 1.9 implementation it has the method renamed and replaced, whereas in the 2.0 example it has an extra module added. Leveraging the ancestor chain also feels like a cleaner way of achieving the goal of intercepting a call to a method.

Conclusion

I feel Module#prepend is a good addition to the Ruby language. I don’t imagine I’ll use it regularly, nor can I think of previously unsolvable problems that it solves.

However, from working through this memoization example I can see that it does have its uses, and that perhaps we can now solve some problems in cleaner ways.

If you’ve got further thoughts on Module#prepend, or anything else Ruby, you can find me on Twitter or get in touch via email.