Andrew Timberlake

Hi, I’m Andrew, a programmer and entrepreneur from South Africa, founder of Sitesure for monitoring websites, APIs, and background jobs.
Thanks for visiting and reading.


Install from source using Ansible

TL;DR: all the code can be found in the gist linked at the end of this post.

Sometimes, when you want complete control, you want to be able to install packages from source and still use an automated tool like Ansible to do that.

A simple set of tasks can check for the existence of files so that tasks which are already complete are not run again, but that doesn’t help us make sure the correct version is installed.

I’m going to walk through creating a play that will build ruby from source. It will not do any work if ruby is already installed at the correct version. If not, it will download the source archive, extract it, configure it, build it and install it.

A first pass can be found in this gist. If repeated, this build will re-download the archive, extract it, configure it and make it. It won’t install the binary again because it checks for the existence of the file /usr/local/bin/ruby, but other than that, all tasks will re-run.

The first step is to create a task that will determine the installed ruby version if present.

- name: Get installed ruby version
  command: ruby --version  # Run this command
  ignore_errors: true  # We don’t want an error in this command to cause the task to fail
  changed_when: false
  failed_when: false
  register: ruby_installed_version  # Register a variable with the result of the command

This task will run ruby --version but will silently fail if ruby is not installed. If ruby is installed, then it registers the version string in a variable named ruby_installed_version.

The next step is to create a variable we can use to test whether to build ruby or not. This is set in our global_vars to a default of false. Then add a task that will set that variable to true if the version string doesn’t match.

- name: Force install if the version numbers do not match
  set_fact:
    ruby_reinstall_from_source: true
  when: '(ruby_installed_version|success and (ruby_installed_version.stdout | regex_replace("^.*?([0-9\.]+).*$", "\\1") | version_compare(ruby_version, operator="!=")))'
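The filter chain in that when clause is dense, so here is the same idea as a rough sketch in plain Ruby rather than Jinja2. It assumes ruby --version prints something like "ruby 2.3.0p0 (...)". The method names are made up for illustration, and Gem::Version is only a stand-in for Ansible’s version_compare filter, which may behave differently in edge cases.

```ruby
require 'rubygems' # provides Gem::Version (normally loaded by default)

# Illustration only: mirrors the regex_replace + version_compare chain in Ruby.
# extract_version and reinstall_needed? are hypothetical names, not Ansible APIs.

# Pull the first run of digits and dots out of the `ruby --version` output.
def extract_version(version_output)
  version_output.sub(/^.*?([0-9.]+).*$/, '\1')
end

# True when the installed version differs from the one we want.
def reinstall_needed?(version_output, wanted_version)
  Gem::Version.new(extract_version(version_output)) != Gem::Version.new(wanted_version)
end

sample = "ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-linux]"
puts extract_version(sample)            # prints 2.3.0
puts reinstall_needed?(sample, "2.3.0") # prints false
puts reinstall_needed?(sample, "2.3.1") # prints true
```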

Now we can add a when clause to all our other tasks. This will skip the task if ruby is correctly installed. That can be seen in this gist.

The when clause checks for two things: (1) the task which checked the ruby version failed (i.e. there is no ruby installed) or (2) the ruby_reinstall_from_source variable is true (i.e. the versions don’t match).

An example task with the when clause:

- name: Download Ruby
  when: ruby_installed_version|failed or ruby_reinstall_from_source
  get_url:
    url: "https://cache.ruby-lang.org/pub/ruby/2.3/ruby-{{ruby_version}}.tar.gz"
    dest: "/tmp/ruby-{{ruby_version}}.tar.gz"
    sha256sum: "{{ruby_sha256sum}}"

  # …

We now have a conditional on every task. That seems a bit redundant. This can be improved by using the block syntax. By using a block we can check the condition once, and then run or skip the whole installation in one move.

- when: ruby_installed_version|failed or ruby_reinstall_from_source
  block:
    - name: Download Ruby
      get_url:
        url: "https://cache.ruby-lang.org/pub/ruby/2.3/ruby-{{ruby_version}}.tar.gz"
        dest: "/tmp/ruby-{{ruby_version}}.tar.gz"
        sha256sum: "{{ruby_sha256sum}}"

    # …

The final code can be found in this gist, https://gist.github.com/andrewtimberlake/802bd8d285b3e18c5ebe, where you can walk through the three revisions as outlined in the article.

21 Mar 2016

Using Dead Man's Snitch with Whenever

A quick tip to make it easier to use Dead Man's Snitch with the whenever gem

Whenever is a great gem for managing cron jobs. Dead Man’s Snitch is a fantastic and useful tool for making sure those cron jobs actually run when they should.

Whenever includes a number of predefined job types which can be overridden to include snitch support.

The job_type command allows you to register a job type. It takes a name and a string representing the command. Within the command string, anything that begins with : is replaced with the value from the job’s options hash. It sounds complicated but is in fact quite easy.

Include the whenever gem in your Gemfile and then run

$ bundle exec wheneverize

This will create a file, config/schedule.rb. Insert these lines at the top of your config file; I have mine just below set :output.

These lines add && curl https://nosnch.in/:snitch to each job type just before :output.

job_type :command,   "cd :path && :task && curl https://nosnch.in/:snitch :output"
job_type :rake,      "cd :path && :environment_variable=:environment bin/rake :task --silent && curl https://nosnch.in/:snitch :output"
job_type :runner,    "cd :path && bin/rails runner -e :environment ':task' && curl https://nosnch.in/:snitch :output"
job_type :script,    "cd :path && :environment_variable=:environment bundle exec script/:task && curl https://nosnch.in/:snitch :output"
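To make the placeholder substitution concrete, here is a minimal sketch of how a :name token in a job template could be swapped for a value from the options hash. It is a simplified stand-in written for this post, not whenever’s actual implementation, and expand_job_template, the path and the task name are all hypothetical.

```ruby
# Simplified stand-in for whenever's template expansion (not the gem's code).
# Every :name token is replaced with options[:name]; unknown tokens are left alone.
def expand_job_template(template, options)
  template.gsub(/:\w+/) do |token|
    key = token[1..-1].to_sym
    options.key?(key) ? options[key].to_s : token
  end
end

template = "cd :path && :task && curl https://nosnch.in/:snitch :output"
puts expand_job_template(template,
                         path: "/var/www/app",
                         task: "bin/backup",
                         snitch: "06ebef375f",
                         output: ">> log/cron.log 2>&1")
# prints: cd /var/www/app && bin/backup && curl https://nosnch.in/06ebef375f >> log/cron.log 2>&1
```

Note that the colon in https:// is not followed by a word character, so it is not treated as a placeholder.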

Now add your job to the schedule. A simple rake task would look like this:

every 1.day, roles: [:app] do
  rake "log:clear"
end

Now it’s time to create the snitch. You can grab a free account at deadmanssnitch.com and add a new snitch.

New Snitch

Then, once that’s saved, you’ll see a screen with your snitch URL. All you need to do is copy the hex code at the end.

Snitch URL

Use that hex code in your whenever job as follows:

every 1.day, roles: [:app] do
  rake "log:clear", snitch: "06ebef375f"
end

Now deploy and update your wheneverized cron job. DMS will let you know as soon as your job runs for the first time so you know it has begun to work. After that, they’ll only let you know if it fails to check in.

Tip: For best tracking, you want your DMS job to check in just before the end of the period you’re monitoring (in the above example 1 day). To do that, I revert to cron syntax in whenever and set my job up as:

# Assuming your server time zone is set to UTC
every "59 23 * * *", roles: [:app] do
  rake "log:clear", snitch: "06ebef375f"
end

See “Does it matter when I ping a snitch?” in the DMS FAQ. Remember to allow time for the job to run and complete. For more information, read through the full DMS FAQ.

6 Sep 2015

Cleaning up a Ruby hash

A number of times I have needed to iterate over a hash and modify its values. The most recent was stripping excess spaces from the values of a Rails params hash.

The only way I know of doing this is:

hash = {one: "  one  ", two: "two  "}
hash.each do |key, value|
  hash[key] = value.strip # strip, not strip!, which returns nil when nothing changes
end
#=> {:one=>"one", :two=>"two"}

This is a lot less elegant than using map on an Array:

["  one  ", "two  "].map(&:strip)
#=> ["one", "two"]

I wanted something like #map for a Hash.

So I came up with Hash#clean (this is a monkey patch, so use it with caution):

class Hash
  def clean(&block)
    each { |key, value|
      self[key] = yield(value)
    }
  end
end

Now it’s as easy as:

{one: "  one  ", two: "two  "}.clean(&:strip)
#=> {:one=>"one", :two=>"two"}
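One thing to watch with the bang variant: String#strip! returns nil when there is nothing to strip, so a value that is already clean would be replaced with nil. As a hedged alternative that avoids the monkey patch, the sketch below uses Hash#transform_values (available from Ruby 2.4). strip_values is a hypothetical helper name.

```ruby
# String#strip! returns nil when the string had no surrounding whitespace.
already_clean = "two".dup
p already_clean.strip!  # nil, because nothing was removed
p already_clean.strip   # "two"

# Hypothetical helper: returns a new hash with every String value stripped.
# Non-string values are passed through untouched.
def strip_values(hash)
  hash.transform_values { |value| value.is_a?(String) ? value.strip : value }
end

cleaned = strip_values(one: "  one  ", two: "two", count: 3)
puts cleaned[:one] # prints one
puts cleaned[:two] # prints two
```

Unlike Hash#clean, this returns a new hash and leaves the original untouched.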

Now I can easily sanitise Rails parameter hashes:

def model_params
  params.require(:model).permit(:name, :email, :phone).clean(&:strip)
end

30 Aug 2015

Skipping blank lines in ruby CSV parsing

I recently had an import job failing because it took too long. When I had a look at the file I saw that there were 74 useful lines but a total of 1,044,618 lines in the file (My guess is MS Excel having a little fun with us).

Most of the lines were simply rows of commas:

Row,Of,Headers
some,valid,data
,,
,,
,,
,,
,,

The CSV library has an option named skip_blanks but the documentation says “Note that this setting will not skip rows that contain column separators, even if the rows contain no actual data”, so that’s not actually helpful in this case.

What is needed is skip_lines with a regular expression that will match any lines with just column separators (/^(?:,\s*)+$/). The resulting code looks like this:

require 'csv'
CSV.foreach('/tmp/tmp.csv',
            headers: true,
            skip_blanks: true,
            skip_lines: /^(?:,\s*)+$/) do |row|
  puts row.inspect
end

#<CSV::Row "Row":"some" "Of":"valid" "Headers":"data">
#=> nil
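To try this without a file on disk, the same options can be passed to CSV.parse with an in-memory string. This is a small sketch under that assumption. EMPTY_ROW and parse_without_empty_rows are names invented here, and the pattern only covers comma-separated files.

```ruby
require 'csv'

# Matches lines made up only of commas and whitespace.
EMPTY_ROW = /^(?:,\s*)+$/

# Hypothetical helper: parse CSV text, dropping separator-only rows,
# and return each remaining row as a plain Hash.
def parse_without_empty_rows(text)
  CSV.parse(text, headers: true, skip_blanks: true, skip_lines: EMPTY_ROW).map(&:to_h)
end

data = "Row,Of,Headers\nsome,valid,data\n,,\n,,\n"
rows = parse_without_empty_rows(data)
puts rows.size           # prints 1
puts rows.first["Of"]    # prints valid
```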
12 Jul 2015

Looping with Fibers

An overview of how Fibers work in Ruby

Fibers are code blocks that can be paused and resumed. They are unlike threads because they never run concurrently. The programmer is in complete control of when a fiber is run. Because of this we can create two fibers and pass control between them.

Control is passed to a fiber when you call Fiber#resume. The fiber returns control by calling Fiber.yield.

fiber = Fiber.new do
  Fiber.yield 'one'
  Fiber.yield 'two'
end
puts fiber.resume
#=> one
puts fiber.resume
#=> two

The above example shows the most common use case, where Fiber.yield is passed an argument which is returned through Fiber#resume. What’s interesting is that you can pass an argument into the fiber via Fiber#resume as well. The first call to Fiber#resume starts the fiber, and its argument goes to the block that creates the fiber. All subsequent calls to Fiber#resume have their arguments returned from the pending Fiber.yield.

fiber = Fiber.new do |arg|
  puts arg                   # prints 'one'
  puts Fiber.yield('two')    # prints 'three'
  puts Fiber.yield('four')   # prints 'five'
end
puts fiber.resume('one')     # prints 'two'
#=> one
#=> two
puts fiber.resume('three')   # prints 'four'
#=> three
#=> four
puts fiber.resume('five')    # the fiber prints 'five' and exits; there's no further yield, so resume returns nil
#=> five
#=> (a blank line, because puts was given nil)

Armed with this information, we can set up two fibers and get them to communicate with each other.

require 'fiber'

fiber2 = nil
fiber1 = Fiber.new do
  puts fiber2.resume     # start fiber2 and print first result (1)
  puts fiber2.resume 2   # send second number and print second result (3)
  fiber2.resume 4        # send fourth number, print nothing and exit
end
fiber2 = Fiber.new do
  puts Fiber.yield 1     # send first number and print returned result (2)
  puts Fiber.yield 3     # send third number, print returned result (4) and exit
end
fiber1.resume            # start fiber1
#=> 1
#=> 2
#=> 3
#=> 4
puts "fiber1 done" unless fiber1.alive?
#=> fiber1 done
puts "fiber2 done" unless fiber2.alive?
#=> fiber2 done
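The two-way passing described above (values going in through Fiber#resume and coming back out through Fiber.yield) can also be seen in a smaller, single-fiber sketch. The running-total fiber below is a made-up example and not part of the EachGroup code.

```ruby
# Hypothetical example: a fiber that keeps a running total.
# Each resume sends a number in; each Fiber.yield sends the total back out.
def running_total_fiber
  Fiber.new do |first_number|
    total = first_number
    loop do
      # Hand the current total to the caller, then wait for the next number.
      total += Fiber.yield(total)
    end
  end
end

fiber = running_total_fiber
puts fiber.resume(5)   # prints 5
puts fiber.resume(10)  # prints 15
puts fiber.resume(20)  # prints 35
```

The first resume supplies the block argument; every later resume supplies the return value of the pending Fiber.yield.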

EachGroup module

Knowing we can send information between two fibers with alternating calls of Fiber#resume and Fiber.yield, we have the building blocks to tackle a streaming #each_group method.

Tip: The fiber you first call #resume on should always call #resume on the fiber it is communicating with. The other fiber then always calls Fiber.yield. This goes against the natural inclination to pass information with Fiber.yield as in the first example above. Because of how the two fibers are set up below, no information is passed with Fiber.yield; information is only passed using Fiber#resume, which can be confusing at first.

# -*- coding: utf-8 -*-
require 'fiber'

module EachGroup
  def each_group(*fields, &block)
    grouper = Grouper.new(*fields, &block)
    loop_fiber = Fiber.new do
      each do |result|
        grouper.process_result(result)
      end
    end
    loop_fiber.resume
  end

  class Grouper
    def initialize(*fields, &block)
      @current_group = nil
      @fields = fields
      @block = block
    end
    attr_reader :fields, :block
    attr_accessor :current_group

    def process_result(result)
      group_fiber = get_group_fiber(result)
      group_fiber.resume(result) if group_fiber.alive?
    end

    private
    def get_group_fiber(result)
      group_value = fields.map{|f| result.public_send(f) }
      unless current_group == group_value
        self.current_group = group_value
        create_group_fiber(result, group_value)
      end
      @group_fiber
    end

    def create_group_fiber(result, group_value)
      @group_fiber = Fiber.new do |first_result|
        group = Group.new(group_value)
        block.call(group)
      end
      @group_fiber.resume(nil) # Start the fiber and wait for its first yield
    end
  end

  class Group
    def initialize(value)
      @value = value
    end
    attr_reader :value

    def each(&block)
      while result = Fiber.yield
        block.call(result)
      end
    end
  end
end

Example Usage

#each_group requires input sorted for grouping.

require 'each_group'
require 'ostruct'

Array.send(:include, EachGroup)

array = [
  OpenStruct.new(year: 2014, month: 1, date: 1),
  OpenStruct.new(year: 2014, month: 1, date: 3),
  OpenStruct.new(year: 2014, month: 2, date: 5),
  OpenStruct.new(year: 2014, month: 2, date: 7),
]
array.each_group(:year, :month) do |group|
  puts group.value.inspect
  group.each do |obj|
    puts "  #{obj.date}"
  end
end
#=> [2014, 1]
#=>   1
#=>   3
#=> [2014, 2]
#=>   5
#=>   7
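For comparison, Ruby’s built-in Enumerable#chunk also groups consecutive items by a key. The sketch below is an alternative, not the fiber-based approach in this post: chunk collects each group into an array before handing it over, whereas #each_group streams items one at a time. Entry and group_consecutive are hypothetical names used only here.

```ruby
# Entry is a stand-in for the OpenStruct objects used above.
Entry = Struct.new(:year, :month, :date)

# Hypothetical helper: group consecutive items by the given fields.
# Returns an enumerator of [group_value, items] pairs.
def group_consecutive(items, *fields)
  items.chunk { |item| fields.map { |field| item.public_send(field) } }
end

entries = [
  Entry.new(2014, 1, 1),
  Entry.new(2014, 1, 3),
  Entry.new(2014, 2, 5),
  Entry.new(2014, 2, 7),
]

group_consecutive(entries, :year, :month).each do |value, group|
  puts value.inspect
  group.each { |entry| puts "  #{entry.date}" }
end
```

As with #each_group, the input must already be sorted by the grouping fields.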

This code can be used with ActiveRecord as follows:

ActiveRecord::Relation.send(:include, EachGroup)

Model.order('year, month').each_group(:year, :month) do |group|
  group.each do |record|
    # ...
  end
end

I have uploaded a Gist that shows a previous iteration of the EachGroup module using a nested loop which you may find easier to use to understand how the fibers are used to control the flow of the loop.

  1. The above code with a RSpec spec - https://gist.github.com/andrewtimberlake/9462561
  2. The original code with nested loops - https://gist.github.com/andrewtimberlake/9462561/f0e88cd310614a34693d57c3fc759f5c78e3a264

Thanks for taking the time to read through this. Explaining complicated concepts like Fibers is a challenge. Please leave a comment and let me know if this was helpful or if you still have any questions.

9 Mar 2014

How to Add Subscribers to a MailChimp List With Ruby

I’m working on an app that creates user accounts and (optionally) subscribes users to our mailing list. Because I’m handling user creation in my app, I need some way to add them to the mailing list which is hosted on MailChimp. To do this, I am using their API to send through subscriber information.

The documentation for the ruby gem is not great. You have a few choices:

Below is some sample code that will get you started.

Install the mailchimp-api gem

> gem install mailchimp-api
# or
> echo 'gem "mailchimp-api", require: false' >> Gemfile
> bundle install

Get your MailChimp API Key

In MailChimp, go to your account settings page, click Extras and API Keys. If you don’t have an API key yet, click Create A Key.

Get your MailChimp list ID

Every list has a unique ID which is needed to add subscribers to the correct list. Go to Lists, click on your list name, then click Settings and List name & defaults. On the right you’ll see your List ID (a 10 character hex code).

The code

require 'mailchimp' # The gem name is mailchimp-api but you require mailchimp

module MailChimpSubscription
  # These should probably be environment variables or configuration variables
  MAIL_CHIMP_API_KEY = "0000000001234567890_us1"
  MAIL_CHIMP_LIST_ID = "abcdef1234"
  extend self

  def subscribe(user)
    mail_chimp.lists.subscribe(MAIL_CHIMP_LIST_ID,
                               # The email field is a struct that can use an
                               #    email address or two MailChimp specific list ids (see API docs)
                               {email: user.email},
                               # Set your merge vars here
                               {'FNAME' => user.first_name, 'LNAME' => user.last_name})
    rescue Mailchimp::ListAlreadySubscribedError
      # Decide what to do if the user is already subscribed
    rescue Mailchimp::ListDoesNotExistError => e
      # This is definitely a problem I want to know about
      raise e
    rescue Mailchimp::Error => e
      # Unforeseen errors that need to be dealt with
  end

  private
  def mail_chimp
    @mail_chimp ||= Mailchimp::API.new(MAIL_CHIMP_API_KEY)
  end
end

To use this module, pass in a user object that responds to #email, #first_name and #last_name:

require 'ostruct'

user = OpenStruct.new(email: 'test@example.com', first_name: 'John', last_name: 'Doe')
MailChimpSubscription.subscribe(user)

Final thoughts

It’s probably a good idea to put mailing list subscription into a background job so that you don’t slow down your user creation response time. You can also handle transient errors, retry failed attempts etc.
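As a rough idea of what retrying could look like inside such a job, here is a small, framework-free sketch. with_retries is a hypothetical helper, not part of the mailchimp-api gem or any job library, and the attempt count and delays are arbitrary assumptions. Most background job systems have their own retry support that you would probably prefer.

```ruby
# Hypothetical helper: run the block, retrying on the given errors
# with a simple exponential backoff between attempts.
def with_retries(attempts: 3, base_delay: 0.5, retry_on: [StandardError])
  tries = 0
  begin
    tries += 1
    yield
  rescue *retry_on
    raise if tries >= attempts              # give up and re-raise the last error
    sleep(base_delay * (2**(tries - 1)))    # 0.5s, 1s, 2s, ...
    retry
  end
end

# Demo with a fake call that fails twice before succeeding.
calls = 0
result = with_retries(attempts: 3, base_delay: 0) do
  calls += 1
  raise "temporary failure" if calls < 3
  :subscribed
end
puts result # prints subscribed
puts calls  # prints 3
```

In a real job the block might wrap MailChimpSubscription.subscribe(user), with retry_on narrowed to the errors you consider transient.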

11 Feb 2014

Building my blog in Middleman

Installing Middleman

Adding extensions

middleman-blog middleman-syntax redcarpet

Github source code coloring

wget https://github.com/richleland/pygments-css/raw/master/github.css

def some_code
end

9 Dec 2013

Potential security hole authorising modules in CanCan

I got a message from a client this morning telling me that all users could see all reports on our product. Not good. I use CanCan to manage permissions and until now it has served me well. What went wrong? Whether a bug or not, I discovered that a very recent change I made had opened up the hole.

I wanted to have a permission setting that could prevent anyone from seeing any reports as well as more fine grained control over each individual report. My permissions looked a bit like this:

class Ability
  def initialize(user)
    can :read, Reports
    can :read, Reports::ReportA
  end
end

When checking permissions for another report within the module, I didn’t expect this:

module Reports
  class ReportBController
    def show
      authorize! :read, Reports::ReportB #=> I assumed it would not be authorized but it is
      ...
    end
  end
end

What I didn’t expect is that when you authorise a module, all classes in that namespace are authorised as well. As I mentioned above, I don’t know if this is by design or not. Some quick googling didn’t help me so I changed my code for a quick solution.

I post this to warn others who may have made the same assumption. If you’re reading this and know the project better and can point out if it is a bug or feature, please let me know in the comments.

6 Nov 2013

How to protect downloads but still have nginx serve the files

I’ve just been working on a project where a number of downloads needed to be restricted to specific users. I needed to authenticate the user and then allow them access to the file. This is not too difficult in rails:

def download
  if authenticated?
    send_file "#{RAILS_ROOT}/downloads/images/myfile.zip"
  end
end

The problem with this is that if the file is large, Rails will spend a lot of time sending it to the browser. The solution is to hand the file off to the web server (in my case, nginx) once authentication has succeeded. nginx supports a header named X-Accel-Redirect. Using this header, you send nginx the path of the file to be downloaded, expressed as a URI that matches an internal location:

def download
  if authenticated?
    #Set the X-Accel-Redirect header with the path relative to the /downloads location in nginx
    response.headers['X-Accel-Redirect'] = '/downloads/myfile.zip'
    #Set the Content-Type header as nginx won't change it and Rails will send text/html
    response.headers['Content-Type'] = 'application/octet-stream'
    #If you want to force download, set the Content-Disposition header (which nginx won't change)
    response.headers['Content-Disposition'] = 'attachment; filename=myfile.zip'
    #Make sure we don't render anything
    render :nothing => true
  end
end
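Stripped of the Rails specifics, the response only needs a status and the right headers. The sketch below shows that shape as a Rack-style [status, headers, body] triple. accel_redirect_headers and download_response are hypothetical helpers for illustration, and the 403 response for unauthenticated users is an assumption rather than something the controller above does.

```ruby
# Hypothetical helper: headers that ask nginx to serve internal_uri as a download.
def accel_redirect_headers(internal_uri, filename)
  {
    'X-Accel-Redirect'    => internal_uri,               # URI of the internal nginx location
    'Content-Type'        => 'application/octet-stream', # nginx will not change this
    'Content-Disposition' => %(attachment; filename="#{filename}")
  }
end

# Rack-style response triple: [status, headers, body].
# The body is empty because nginx supplies the file contents.
def download_response(authenticated)
  if authenticated
    [200, accel_redirect_headers('/downloads/myfile.zip', 'myfile.zip'), []]
  else
    [403, { 'Content-Type' => 'text/plain' }, ['Forbidden']]
  end
end

status, headers, _body = download_response(true)
puts status                      # prints 200
puts headers['X-Accel-Redirect'] # prints /downloads/myfile.zip
```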

You will need to add a location directive in nginx marked as internal which nginx will use along with your path to get to the physical file.

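location /downloads {
  #With root, nginx appends the full request URI to this path, so /downloads/myfile.zip
  #is looked up at /rails_deploy/current/downloads/downloads/myfile.zip. Adjust root to match your layout.
  root /rails_deploy/current/downloads;
  #Marked internal so that this location cannot be accessed directly.
  internal;
}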

Notes:

You can also set additional control using the following headers:

X-Accel-Limit-Rate: 1024
X-Accel-Buffering: yes|no
X-Accel-Charset: utf-8

See the nginx documentation on X-Accel-Redirect for more information.

1 Dec 2010