The Quest for Simple Active Record Repositories

The goal of this post is to explore the various ways we might organize our Active Record model scoping behavior, before arriving at a simple pattern for defining and using repositories.

When Dinosaurs Roamed the Earth

One of the joys provided by Rails is Active Record and its fluid, chainable interface for retrieving model instances. For example, given a model called Dinosaur, we might narrow down the available records with a method chain like so:

Dinosaur.cretaceous.scary

To implement the .cretaceous and .scary methods, lots of folks will quickly reach for the scope macro method available on subclasses of ActiveRecord::Base:

class Dinosaur < ActiveRecord::Base
  scope :cretaceous, -> { where(age: 66..145) } # millions of years
  scope :scary, -> { where(teeth: "sharp", claws: "long") }
end

However, as query complexity grows, use of the scope macro quickly gets unwieldy. A better option is to consistently use plain ol’ class methods for scopes:

class Dinosaur < ActiveRecord::Base
  def self.cretaceous
    where(age: 66..145) # millions of years
  end

  def self.scary
    where(teeth: "sharp", claws: "long")
  end
end

This approach gives us a first-class method for each scope, and as such, a bit more breathing room. We’re not there yet, though. While more maintainable than the scope method, class methods within the Active Record base class suffer from a problem of bloated context. For upon opening such a class, we’re likely to find a whole lot going on:

class Dinosaur < ActiveRecord::Base
  # associations

  # validations

  # lifecycle callbacks

  # gem-specific macros

  # scoping class methods
  def self.cretaceous
    where(age: 66..145) # millions of years
  end

  def self.scary
    where(teeth: "sharp", claws: "long")
  end

  # instance behavior methods

  # kitchen sinks
end

Active Record models sure do a lot. One of the things they do is provide a repository interface to the underlying data.

What’s a Repository?

In a summary of the repository pattern detailed in Patterns of Enterprise Application Architecture, Edward Hieatt and Rob Mee note:

A system with a complex domain model often benefits from a layer, such as the one provided by Data Mapper (165), that isolates domain objects from details of the database access code. In such systems it can be worthwhile to build another layer of abstraction over the mapping layer where query construction code is concentrated. This becomes more important when there are a large number of domain classes or heavy querying. In these cases particularly, adding this layer helps minimize duplicate query logic.

The scoping API provided by Active Record, and in particular its Relation class, implements this pattern. The trouble many Rails developers run into, however, is that Active Record base classes implement this and many other patterns simultaneously. References to these classes throughout the system might embody a desire to invoke any one of the patterns. Developers are unable to go to a single place to view the model’s query interaction with the data storage layer, and only that interaction. The problem is frequently exacerbated when the use of Relation methods is not confined to the model class.

A Tree Falls in the Forest

Active Record is great for the ease with which it lets us define the many behaviors it supports, but it clearly has a single responsibility crisis. Many developers have noticed this crisis and are trying to do something about it. ROM (Ruby Object Mapper) is a promising project that aims to provide an alternative persistence and mapping layer with a more segmented API. Others have done work to separate out Active Record’s concerns without completely abandoning the framework, although these approaches are somewhat more complex in their goals and implementation than might be necessary to address our problem: finding a simple home for our scopes.

One approach that has gained popularity is the construction of objects to encapsulate a particular query:

class CretaceousDinosaurQuery
  def initialize(relation = Dinosaur.all)
    self.relation = relation
  end

  def results
    relation.where(age: 66..145) # millions of years
  end

  private

  attr_accessor :relation
end

Now we’re getting somewhere! This class has a single responsibility. The only thing it cares about is talking to Active Record in order to return a result set of Cretaceous dinosaurs. It might be used like so:

CretaceousDinosaurQuery.new.results

But wait, we also care about other types of dinosaurs. With this pattern, if we want to find dinosaurs that are both Cretaceous and scary, we’ll need to pass one query into the other:

CretaceousDinosaurQuery.new(
  ScaryDinosaurQuery.new.results
).results

This is starting to look not-so-great. Our application is likely to gain lots of scoping behavior as time goes on, and so the number of query objects that we’ll need is likely to explode. And what happened to our nice, chainable relation interface?

At this point, one might think it would be just as well to pull the scoping methods out of our Dinosaur class into a module, and extend the Active Record base class with that module. That would let us define our scopes in a file separate from the rest of our model, while keeping our chainable API. However, the problem with that approach is that all it’s really doing is moving code around. We will have relocated the scoping code, but at runtime it will still be callable directly on the Active Record base class, and it will still need to be tested along with all of that class’s behavior. Fortunately, it is possible to truly silo our application’s scoping behavior while keeping the familiar chainable Active Record API.
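To make the drawback of the module approach concrete, here is a minimal sketch in plain Ruby. The Dinosaur class below is a hypothetical stand-in for the Active Record model (its where method just echoes its arguments), and the module name DinosaurScopes is an assumption made for illustration.

```ruby
# Hypothetical stand-in for the Active Record model, so the sketch runs
# without Rails. A real model's `where` would return a Relation.
class Dinosaur
  def self.where(conditions)
    [:where, conditions]
  end
end

# The scopes, extracted into their own module (and, in a real app, their own file).
module DinosaurScopes
  def cretaceous
    where(age: 66..145) # millions of years
  end
end

Dinosaur.extend(DinosaurScopes)

# The code moved, but the scope is still part of the model's public interface:
Dinosaur.respond_to?(:cretaceous)                 # => true
Dinosaur.singleton_class.include?(DinosaurScopes) # => true
```

The module changes where the source lives, not which object answers the message, so any test of the scope still has to go through the model class.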

A Place to Call Home

What would be really nice is if our scoping behavior for a particular Active Record base class could live in its own honest-to-goodness object, while retaining chainability of scopes. We can accomplish this with a simple repository class. It takes the query object approach of instantiating an object with a relation, and then returns a new repository from each scoping method:

module Repositories
  class Dinosaur
    attr_reader :relation

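    # The leading :: is required: inside this class, a bare Dinosaur
    # resolves to Repositories::Dinosaur, which has no .all method.
    def initialize(relation = ::Dinosaur.all)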
      self.relation = relation
    end

    def cretaceous
      self.class.new(
        relation.where(age: 66..145) # millions of years
      )
    end

    def scary
      self.class.new(
        relation.where(teeth: "sharp", claws: "long")
      )
    end

    private

    attr_writer :relation
  end
end

This class cares only about interacting with the database to retrieve different kinds of dinosaurs. It has a single reason to change: to support requirements specific to dinosaur retrieval. The application now has a single place to absorb changes to those requirements. Scoping behavior can be tested separately from dinosaur instance behavior. And we have retained our fluid relation interface:

repository = Repositories::Dinosaur.new.cretaceous.scary

When we’ve built up our scope the way we’d like and it comes time to retrieve the records, or pass the scope along for further processing (perhaps by merge), we need only call the repository’s relation attribute reader:

@dinosaurs = repository.relation
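Because the repository accepts any relation-like object, its scoping behavior can be exercised without touching the database or the model class. The following is a rough sketch of that idea, not a prescribed testing setup: FakeRelation is a hypothetical test double, not part of Active Record, and the repository is repeated here only so the example stands alone.

```ruby
module Repositories
  class Dinosaur
    attr_reader :relation

    # The default is evaluated only when no relation is passed in,
    # so the ::Dinosaur model is never loaded in this sketch.
    def initialize(relation = ::Dinosaur.all)
      self.relation = relation
    end

    def cretaceous
      self.class.new(relation.where(age: 66..145)) # millions of years
    end

    def scary
      self.class.new(relation.where(teeth: "sharp", claws: "long"))
    end

    private

    attr_writer :relation
  end
end

# Hypothetical test double. It records each set of conditions and returns
# a new double, mimicking how `where` on a Relation returns a new relation
# rather than mutating the receiver.
class FakeRelation
  attr_reader :conditions

  def initialize(conditions = [])
    @conditions = conditions
  end

  def where(new_conditions)
    self.class.new(conditions + [new_conditions])
  end
end

base = FakeRelation.new
repository = Repositories::Dinosaur.new(base).cretaceous.scary

# The double now holds both sets of conditions, in the order they were chained.
recorded = repository.relation.conditions
```

A real test suite would likely assert on recorded with its own framework's matchers. The point is only that the repository can be verified through the where calls it makes, independently of dinosaur instance behavior.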

And with that, our quest for a simple Active Record repository comes to a close — for now. Let us know what your favorite dinosaur’s preferred repository implementation is by tweeting @forakerlabs!
