Ruby Background Tasks with Starling

At Inquisix, we help sales professionals exchange trusted referrals. To do that requires several background tasks, some that could take 10-15 minutes to process. Obviously, I can't make a client wait for that, so I needed a system that could handle background tasks. At first, I started with backgroundrb, and it worked just fine. Backgroundrb was in production for two months while Inquisix grew. However, there were a few things about backgroundrb that bothered me:

  • It uses a lot of memory. Every worker creates at least one process. Plus, there is a master process to watch everything and deal with communication. It doesn't take much before you end up with 5-6 processes. I had to upgrade my test server just to deal with the extra memory requirements.
  • It's not easy to build a queue with control over threads without creating a ton of processes.
  • Too many times, I wanted to do something pretty straight forward, but I had to dig through the backgroundrb code to figure out the backgroundrb way. For example, don't ever call sleep in a backgroundb thread pool. You need to call next_turn instead.

After a while, I decided to look for a simpler way that would scale better without using so much memory. I decided a more traditional queue system would work better for me. At a former company, I built up an enterprise system based on queues that processes millions of transactions across dozens of servers. Something based on queues would work for me, but I did not want to take on the complexity of JMS, ActiveMQ (or some other queue), ActiveMessaging, etc. As usual with Ruby projects, I looked around on the web. Within a few minutes, I came across Starling and Sparrow. Both a Ruby queue systems using the memcached interface. That means I can use the memcache-client gem that I already use. Starling was developed at Twitter for background processing, so I figure it's got some testing behind it. Sparrow is newer, but basically the same. However, there isn't much experience with Sparrow, so I settled on Starling as my queue server.

To install Starling:

sudo gem install starling
sudo gem install memcache-client

Now, I needed a way to use my new queue server for background tasks. Again, a few minutes of looking, and I found Workling. It didn't have everything I wanted, but it was nice and simple, and it had almost everything I wanted. I use Piston for all my plugins, so here is how to install with that:

piston import http://svn.playtype.net/plugins/workling/ vendor/plugins/workling
svn commit -m "added workling"

Make sure you commit now because we will be making some changes to Workling later. Piston will get confused and toss your changes if you don't commit first.

Client Code

First, create a worker in app/workers/my_worker.rb

class MyWorker < Workling::Base
  def do_something_big(options = {})
    SomeModel.do_something_big(options[:some_arg])
  end
end

Anything in app/workers that inherits from Workling::Base will get picked up automatically as a worker. Workers are basically listeners on a Starling queue. By default, Workling defines queues based on class and method. There will be a queue for every method in every class that inherits from Workling::Base.

Now, you can call your worker asynchronously anywhere like so:

MyWorker.asynch_do_something_big(:some_arg => 5)

Starling Runner

To use the Workling's Starling runner, you need to setup your environment like so:

Workling::Remote.dispatcher = Workling::Remote::Runners::StarlingRunner.new

I add this line to all my environment files (development.rb, etc.). Workling is nice in that if you comment out the above line, all the MyWorker.asynch_* calls will become synchronous calls -- nice for debugging!

The Starling runner takes care of several things:

  1. Mapping of queue names to worker code. this is done with Workling::ClassAndMethodRouting, but you can change the queue routing pretty easily.
  2. There's a client daemon that waits for messages and dispatches these to the responsible workers. if you intend to run this on a remote machine, then just check out your rails project there and start up the Starling client.

Now, fire up Starling, your app, and the workling runner, and your are processing background tasks. Don't forget to edit config/starling.yml first to tell Workling where Starling is running.

sudo starling -d
script/server
script/workling_starling_client start

What I ended up with was much better for what I was doing. This combination processed my background tasks faster and more reliably. It is much easier to add new workers and call them. Finally, it uses a whole lot less memory, so my end user application performs better. Basically, it wins on all fronts for me.

Next time, I will share the changes I made to Workling to support threads and provide the necessary configuration to ensure that everything stays running in production.