Fixie

Building websites better, faster, stronger

Concurrency's a Bitch

Published: January 3, 2009

On this Rails application I’m writing, I have a Person model. When People are displayed to the user, the sort order is more complicated than a simple "name ASC":

  1. First are featured People with a picture
  2. Then are featured People without a picture
  3. Then People with pictures and bio
  4. Then People with a picture
  5. Then People with a description
  6. Then People with an address and phone number
  7. And so on

What I ended up doing is adding a “priority” integer attribute to the Person model. A method that runs before the Person is saved works out which attributes the Person has and assigns a priority number accordingly. I gave a priority value of 10 to featured People with a picture, 9 to featured People without a picture, and so on. Index this column and whammo, the sorting works great: just add an "ORDER BY priority DESC" SQL clause to your search statements.
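For illustration, here is a minimal sketch of what such a priority module might look like (called PersonRules here). Only the values 10 and 9 come from the rules above; the remaining numbers, the attribute names, and the PersonStub struct (a stand-in so the snippet runs without Rails) are assumptions. In the real model, a before_save callback would call PersonRules.generate_priority(self) and store the result in priority.

```ruby
# Hypothetical sketch: attribute names and the values below 9 are assumptions.
module PersonRules
  # Rules are checked top to bottom; the first match wins.
  RULES = [
    [10, lambda { |p| p.featured && p.picture }],
    [9,  lambda { |p| p.featured }],
    [8,  lambda { |p| p.picture && p.bio }],
    [7,  lambda { |p| p.picture }],
    [6,  lambda { |p| p.description }],
    [5,  lambda { |p| p.address && p.phone }]
  ]

  # Returns the priority of the first matching rule, or 0 if none match.
  def self.generate_priority(person)
    RULES.each do |priority, rule|
      return priority if rule.call(person)
    end
    0
  end
end

# Stand-in for the ActiveRecord model so the sketch runs without Rails.
PersonStub = Struct.new(:featured, :picture, :bio, :description, :address, :phone)
```

With this sketch, PersonRules.generate_priority(PersonStub.new(true, "me.png")) gives 10, and a Person matching none of the rules gets 0 and sorts last. Keeping the rules in one ordered list means a rule change is a one-line edit.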

Then it occurred to me when I was in the shower: "What happens if the rules change and People should be ordered differently?" It’s happened twice so far, but it’s no problem since the application hasn’t launched yet. I make the code change to the module (PersonRules, need a better name) and then run:

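# Recompute and store the priority of every Person.
# The block variable must be lowercase: a capitalized name is a constant.
Person.find(:all).each do |person|
  person.priority = PersonRules.generate_priority(person)
  person.save!
end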

And that works. Almost.

What happens if, after you have loaded the Person objects in the find call, a user changes the name of a Person while you’re looping through the People?

Prior to Rails 2.2, which I haven’t really upgraded to yet for most of my applications, that change would be overwritten (see footnote 1). Say they update the name of the Person. The loop above has already loaded the original Person object into memory and keeps operating on that stale data: it updates the priority on the original object (with the old name) and saves it. So the user’s change to the Person name is overwritten with the old name and the new priority.
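To make the race concrete, here is a toy simulation in plain Ruby. It is not ActiveRecord: the hash stands in for a database row, and simulate_race is a made-up helper that compares a save that writes every column with one that writes only the changed column.

```ruby
# Toy model of the lost update; `row` plays the database, `batch_copy` the
# object loaded by the batch loop. All names here are hypothetical.
def simulate_race(partial_updates)
  row = { :name => "Old Name", :priority => 3 }
  batch_copy = row.dup              # the loop loads the Person
  row[:name] = "New Name"           # meanwhile, a user renames the Person
  batch_copy[:priority] = 10        # the loop computes a priority on its stale copy
  if partial_updates
    row[:priority] = batch_copy[:priority]  # like UPDATE ... SET priority = 10
  else
    row.replace(batch_copy)                 # like UPDATE ... SET name = ..., priority = ...
  end
  row
end
```

Writing every column brings back "Old Name" alongside the new priority, while writing only the changed column keeps the rename. Note that even in the second case the priority was still computed from stale data, which matters if the rules ever depend on the field the user changed.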

Ok, so you load the Person with "select for update" (SELECT ... FOR UPDATE in PostgreSQL; MySQL’s InnoDB supports the same syntax). You do this like:

Person.find(:all).each do |person|
  # transaction is a class method, so call it on the model
  Person.transaction do
    # lock! reloads the row with an exclusive row lock
    person.lock!
    # now this bit of code is guaranteed to have an exclusive lock on the Person
    # if another bit of code is updating the Person at the same time, the database
    # will block until it's done
    person.priority = PersonRules.generate_priority(person)
    person.save!
  end
end

This should work, afaik. I don’t like it though:

  • You have to remember to lock the Person object whenever you do something like this.
  • If generating the priority took a while, the Person object would be locked for a while.
  • It adds more code.
  • How do you test this?

There’s another way to do this: optimistic locking. You add a lock_version column to your model, and that version number is incremented every time the Person object is updated. So in the above example, the Person object would be loaded with a lock_version of 10. When the Person’s name is updated by another user, the lock_version increments to 11. Then, when the priority is generated and the original Person object is saved, ActiveRecord notices that the lock_version for the Person has changed and throws a nasty exception (ActiveRecord::StaleObjectError).

Now that works, except that you have to be aware of the possibility of the exception and rescue it if you don’t want the request to die a fiery death. In most circumstances, not rescuing it would work just fine, since this type of problem wouldn’t happen very often, and it’s better to crash and burn than to risk bad data. So there’s a case to be made for adding a lock_version column to every model that has even a remote possibility of concurrent updates.
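If you did want to rescue the exception, one common pattern is to reload and try again. Here is a rough sketch in plain Ruby that mimics the lock_version check with an in-memory table. PeopleTable, StaleRowError, and update_with_retry are invented for this illustration; in Rails the error to rescue would be ActiveRecord::StaleObjectError.

```ruby
# Hypothetical stand-in for ActiveRecord's stale-object exception.
class StaleRowError < StandardError; end

# In-memory table that enforces a lock_version check on update.
class PeopleTable
  def initialize
    @rows = {}
  end

  def insert(id, attrs)
    @rows[id] = attrs.merge(:lock_version => 0)
  end

  # Returns a copy, like loading a fresh object from the database.
  def find(id)
    @rows[id].dup
  end

  # Mimics: UPDATE ... SET lock_version = v + 1 WHERE id = ? AND lock_version = v
  def update(id, attrs)
    if @rows[id][:lock_version] != attrs[:lock_version]
      raise StaleRowError, "row #{id} changed since it was loaded"
    end
    @rows[id] = attrs.merge(:lock_version => attrs[:lock_version] + 1)
  end
end

# Reloads the row and reruns the block if someone else saved first.
def update_with_retry(table, id, attempts = 3)
  begin
    row = table.find(id)
    yield row
    table.update(id, row)
  rescue StaleRowError
    attempts -= 1
    retry if attempts > 0
    raise
  end
end
```

A call like update_with_retry(table, 1) { |row| row[:priority] = 10 } recomputes against fresh data on each attempt, so the other user's rename survives. The attempt limit is there so a hot row can't cause an endless loop.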

But say you wanted something that didn’t require locking the Person object, or rescuing lock_version exceptions all the time. I can think of a couple of options that might work—I have not tried them out yet.

  • Add a version column to the Person model. When the Person is updated, instead of updating the existing Person database row, insert a new one with the Person data and a higher version number. Then when you select the Person, add an "order by version desc". The problem with this approach is that any other objects that refer to the Person would need to be updated with the new Person id. But it has the advantage of keeping old versions of People around, so you could easily roll back or provide "undo" functionality.
  • Add a new model: PersonPriority. This would have two columns: person_id and priority. When you want to update the priority of the Person, you don’t have to modify the Person itself; you update this model instead. Problem: it adds an extra table/model, and sorting People would be more complex. (Can you do this indexing with sphinx?)
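As a rough picture of the second option, here is a plain-Ruby sketch with two hashes standing in for the people and person_priorities tables. sorted_people is a made-up helper; in SQL the same thing would be a join with an ORDER BY on the priority column.

```ruby
# Sort people by a priority kept in a separate lookup, highest first.
# `people` maps id => person attributes; `priorities` maps person_id => priority.
# People with no priority row are treated as priority 0 (an assumption).
def sorted_people(people, priorities)
  ranked = people.sort_by { |id, _person| -priorities.fetch(id, 0) }
  ranked.map { |_id, person| person }
end
```

Because the batch job only ever writes to the priorities lookup, it can't clobber a rename made to the Person row at the same time.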

How do you handle concurrency in your applications? Am I missing any approaches here?

1. I believe that in Rails 2.2, Rails only updates the attributes of a model that have changed (this partial-update behavior may have arrived as early as Rails 2.1). So if only the name field of a Person has been changed, only the name field is included in the generated SQL UPDATE statement when the object is saved.