TL;DR: All the code can be found in this gist: https://gist.github.com/andrewtimberlake/802bd8d285b3e18c5ebe
Sometimes, when you want complete control, you want to be able to install packages from source and still use an automated tool like Ansible to do that.
A simple set of tasks can check for the existence of files so that tasks that are already complete don’t run again, but that doesn’t help us make sure we have the correct version installed.
I’m going to walk through creating a play that will build ruby from source. It will not do any work if ruby is already installed and is already the correct version. If not correct, it will:
- download the source tarball
- extract the source
- configure the install
- make the build
- install ruby
- cleanup the build directory
A first pass can be found in this gist.
If repeated, this build will re-download the archive, extract it, configure it and make it. It won’t install the binary again because it checks for the existence of the file /usr/local/bin/ruby, but other than that, all tasks will re-run.
The first step is to create a task that will determine the installed ruby version if present.
- name: Get installed ruby version
  command: ruby --version # Run this command
  ignore_errors: true # We don’t want an error in this command to cause the task to fail
  changed_when: false
  failed_when: false
  register: ruby_installed_version # Register a variable with the result of the command
This task will run ruby --version but will silently fail if ruby is not installed. If ruby is installed, then it registers the version string in a variable named ruby_installed_version.
The next step is to create a variable we can use to test whether to build ruby or not. This is set in our global_vars to a default of false. Then add a task that will set that variable to true if the version string doesn’t match.
- name: Force install if the version numbers do not match
  set_fact:
    ruby_reinstall_from_source: true
  when: '(ruby_installed_version|success and (ruby_installed_version.stdout | regex_replace("^.*?([0-9\.]+).*$", "\\1") | version_compare(ruby_version, operator="!=")))'
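To make the condition above easier to follow, here is the same idea sketched in plain Ruby: pull the version number out of the `ruby --version` output and compare it with the wanted version. This is only an illustration, not part of the play. The helper name `needs_rebuild?` and the sample output string are assumptions, and the pattern is slightly stricter than the one in the task.

```ruby
require 'rubygems' # Gem::Version; normally loaded already

# Hypothetical helper: true when the reported version differs from the wanted one.
def needs_rebuild?(version_output, wanted_version)
  # First dotted number in the output, e.g. "2.3.0" from "ruby 2.3.0p0 (...)"
  installed = version_output[/\d+(?:\.\d+)*/]
  return true if installed.nil? # nothing that looks like a version was found

  Gem::Version.new(installed) != Gem::Version.new(wanted_version)
end

# Assumed sample of what `ruby --version` prints
sample = "ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-linux]"

puts needs_rebuild?(sample, "2.3.0") # same version, no rebuild needed
puts needs_rebuild?(sample, "2.3.1") # versions differ, rebuild
```

Comparing parsed versions rather than raw strings means "2.3" and "2.3.0" are treated as equal.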
Now we can add a when clause to all our other tasks. This will skip the task if ruby is correctly installed. That can be seen in this gist.
The when clause checks for two things: (1) the task which checked the ruby version failed (i.e. there is no ruby installed) or (2) the ruby_reinstall_from_source variable is true (i.e. the versions don’t match).
An example task with the when clause:
- name: Download Ruby
  when: ruby_installed_version|failed or ruby_reinstall_from_source
  get_url:
    url: "https://cache.ruby-lang.org/pub/ruby/2.3/ruby-{{ruby_version}}.tar.gz"
    dest: "/tmp/ruby-{{ruby_version}}.tar.gz"
    sha256sum: "{{ruby_sha256sum}}"
# …
We now have a conditional on every task. That seems a bit redundant. This can be improved by using the block syntax. By using a block we can check the condition once, and then run or skip the whole installation in one move.
- when: ruby_installed_version|failed or ruby_reinstall_from_source
  block:
    # The tasks inside the block no longer need their own when clause
    - name: Download Ruby
      get_url:
        url: "https://cache.ruby-lang.org/pub/ruby/2.3/ruby-{{ruby_version}}.tar.gz"
        dest: "/tmp/ruby-{{ruby_version}}.tar.gz"
        sha256sum: "{{ruby_sha256sum}}"
    # …
The final code can be found in this gist, https://gist.github.com/andrewtimberlake/802bd8d285b3e18c5ebe, where you can walk through the three revisions as outlined in the article.
A quick tip to make it easier to use Dead Man's Snitch with the whenever gem
Whenever is a great gem for managing cron jobs. Dead Man’s Snitch is a fantastic and useful tool for making sure those cron jobs actually run when they should.
Whenever includes a number of predefined job types which can be overridden to include snitch support.
The job_type command allows you to register a job type. It takes a name and a string representing the command. Within the command string, anything that begins with : is replaced with the value from the job’s options hash. It sounds complicated but is in fact quite easy.
Include the whenever gem in your Gemfile and then run
$ bundle exec wheneverize
This will create a file, config/schedule.rb. Insert these lines at the top of your config file; I have mine just below set :output.
These lines add && curl https://nosnch.in/:snitch to each job type just before :output.
job_type :command, "cd :path && :task && curl https://nosnch.in/:snitch :output"
job_type :rake, "cd :path && :environment_variable=:environment bin/rake :task --silent && curl https://nosnch.in/:snitch :output"
job_type :runner, "cd :path && bin/rails runner -e :environment ':task' && curl https://nosnch.in/:snitch :output"
job_type :script, "cd :path && :environment_variable=:environment bundle exec script/:task && curl https://nosnch.in/:snitch :output"
Now add your job to the schedule. A simple rake task would look like this:
every 1.day, roles: [:app] do
rake "log:clear"
end
Now it’s time to create the snitch. You can grab a free account at deadmanssnitch.com and add a new snitch.
Then, once that’s saved, you’ll see a screen with your snitch URL. All you need to do is copy the hex code at the end.
Use that hex code in your whenever job as follows:
every 1.day, roles: [:app] do
rake "log:clear", snitch: "06ebef375f"
end
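To see what a job type expands to once the snitch option is supplied, the placeholder substitution can be approximated in a few lines of Ruby. This is a simplified stand-in and not whenever’s actual implementation; `expand_job_template` and the path, task and output values are made up for illustration.

```ruby
# Hypothetical helper: replaces every :name placeholder with the matching option.
# Placeholders without a matching option are left untouched.
def expand_job_template(template, options)
  template.gsub(/:\w+/) do |placeholder|
    key = placeholder[1..-1].to_sym # ":snitch" -> :snitch
    options.fetch(key, placeholder).to_s
  end
end

template = "cd :path && :task && curl https://nosnch.in/:snitch :output"

command = expand_job_template(template,
                              path: "/var/www/app",
                              task: "echo hello",
                              snitch: "06ebef375f",
                              output: ">> log/cron.log 2>&1")
puts command
```

Because the curl call is chained with &&, the snitch is only pinged when the task itself exits successfully.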
Now deploy and update your wheneverized cron job. DMS will let you know as soon as your job runs for the first time so you know it has begun to work. After that, they’ll only let you know if it fails to check in.
Tip:
For best tracking, you want your DMS job to check in just before the end of the period you’re monitoring (in the above example, 1 day). To do that, I revert to cron syntax in whenever and set my job up as:
# Assuming your server time zone is set to UTC
every "59 23 * * *", roles: [:app] do
rake "log:clear", snitch: "06ebef375f"
end
See “Does it matter when I ping a snitch?” in the DMS FAQ. Remember to allow time for the job to run and complete.
For more information, read through the full DMS FAQ.
I’ve found a number of times where I have needed to iterate over a hash and modify the values. The most recent was stripping excess spaces from the values of a Rails params hash.
The only way I know of doing this is:
hash = {one: " one ", two: "two "}
hash.each do |key, value|
  # Use strip rather than strip!: strip! returns nil when there is nothing to remove
  hash[key] = value.strip
end
#=> {:one=>"one", :two=>"two"}
This is a lot less elegant than using map on an Array:
[" one ", "two "].map(&:strip)
#=> ["one", "two"]
I wanted something like #map for a Hash, so I came up with Hash#clean (this is a monkey patch, so use with caution):
class Hash
  def clean(&block)
    each { |key, value|
      self[key] = yield(value)
    }
  end
end
Now it’s as easy as:
{one: " one ", two: "two "}.clean(&:strip)
#=> {:one=>"one", :two=>"two"}
Now I can easily sanitise Rails parameter hashes:
def model_params
  params.require(:model).permit(:name, :email, :phone).clean(&:strip)
end
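On Ruby 2.4 or newer, the built-in Hash#transform_values covers the same ground without a monkey patch. The sketch below uses made-up sample data and also shows why the non-bang strip is the safer choice: String#strip! returns nil when there is nothing to strip.

```ruby
params = { one: " one ", two: "two", three: "three " }

# strip! returns nil when the string has no surrounding whitespace
bang_result = "two".dup.strip!

# transform_values returns a new hash and leaves the original untouched
cleaned = params.transform_values(&:strip)

puts bang_result.inspect
puts cleaned.inspect
```

Hash#transform_values! is the in-place variant if mutating the original hash is what you want.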
I recently had an import job failing because it took too long. When I had a look at the file I saw that there were 74 useful lines but a total of 1,044,618 lines in the file (my guess is MS Excel having a little fun with us).
Most of the lines were simply rows of commas:
Row,Of,Headers
some,valid,data
,,
,,
,,
,,
,,
The CSV library has an option named skip_blanks, but the documentation says “Note that this setting will not skip rows that contain column separators, even if the rows contain no actual data”, so that’s not actually helpful in this case.
What is needed is skip_lines with a regular expression that will match any lines with just column separators (/^(?:,\s*)+$/).
The resulting code looks like this:
require 'csv'
CSV.foreach('/tmp/tmp.csv',
            headers: true,
            skip_blanks: true,
            skip_lines: /^(?:,\s*)+$/) do |row|
  puts row.inspect
end
#<CSV::Row "Row":"some" "Of":"valid" "Headers":"data">
#=> nil
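If you want to try the option without a file on disk, the same skip_lines pattern works with CSV.parse on a string. The sample data below is a shortened, made-up version of the file described above.

```ruby
require 'csv'

data = <<~CSV_DATA
  Row,Of,Headers
  some,valid,data
  ,,
  ,,
  ,,
CSV_DATA

# Lines consisting only of column separators are treated as comments and skipped
rows = CSV.parse(data, headers: true, skip_lines: /^(?:,\s*)+$/).map(&:to_h)

puts rows.inspect
```

Only the one row with real data is left after parsing.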
An overview of how Fibers work in Ruby
Fibers are code blocks that can be paused and resumed. They are unlike threads because they never run concurrently. The programmer is in complete control of when a fiber is run. Because of this we can create two fibers and pass control between them.
Control is passed to a fiber when you call Fiber#resume; the fiber returns control by calling Fiber.yield.
fiber = Fiber.new do
  Fiber.yield 'one'
  Fiber.yield 'two'
end
puts fiber.resume
#=> one
puts fiber.resume
#=> two
The above example shows the most common use case, where Fiber.yield is passed an argument which is returned through Fiber#resume.
What’s interesting is that you can pass an argument into the fiber via Fiber#resume as well. The first call to Fiber#resume starts the fiber and that argument goes to the block that creates the fiber; all subsequent calls to Fiber#resume have their arguments passed to Fiber.yield.
fiber = Fiber.new do |arg|
  puts arg # prints 'one'
  puts Fiber.yield('two') # prints 'three'
  puts Fiber.yield('four') # prints 'five'
end
puts fiber.resume('one') # prints 'two'
#=> one
#=> two
puts fiber.resume('three') # prints 'four'
#=> three
#=> four
puts fiber.resume('five') # prints nil because there's no corresponding yield and the fiber exits
#=> nil
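As a small worked example of values flowing in both directions, here is a sketch of a fiber that keeps a running total: each Fiber#resume sends a number in, and each Fiber.yield sends the total so far back out. The name `running_total` is only for illustration.

```ruby
running_total = Fiber.new do |first|
  total = first # the argument of the first resume arrives as the block argument
  loop do
    # Hand the current total back to the caller, then wait for the next number
    total += Fiber.yield(total)
  end
end

first  = running_total.resume(1) # starts the fiber
second = running_total.resume(2) # becomes the return value of Fiber.yield
third  = running_total.resume(3)

puts [first, second, third].inspect
```

The fiber never finishes on its own; it simply stays paused at Fiber.yield until the next resume.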
Armed with this information, we can set up two fibers and get them to communicate with each other.
require 'fiber'
fiber2 = nil
fiber1 = Fiber.new do
  puts fiber2.resume # start fiber2 and print first result (1)
  puts fiber2.resume 2 # send second number and print second result (3)
  fiber2.resume 4 # send fourth number, print nothing and exit
end
fiber2 = Fiber.new do
  puts Fiber.yield 1 # send first number and print returned result (2)
  puts Fiber.yield 3 # send third number, print returned result (4) and exit
end
fiber1.resume # start fiber1
#=> 1
#=> 2
#=> 3
#=> 4
puts "fiber1 done" unless fiber1.alive?
#=> fiber1 done
puts "fiber2 done" unless fiber2.alive?
#=> fiber2 done
EachGroup module
Knowing we can send information between two fibers with alternating calls of Fiber#resume and Fiber.yield, we have the building blocks to tackle a streaming #each_group method.
Tip:
The fiber you first call #resume on should always call #resume on the fiber it is communicating with. The other fiber then always calls Fiber.yield. This goes against the natural inclination to pass information with Fiber.yield as in the first example above. Because of how the two fibers are set up below, you’ll see that no information is passed with Fiber.yield; information is only passed using Fiber#resume, which is confusing, I know.
# -*- coding: utf-8 -*-
require 'fiber'

module EachGroup
  def each_group(*fields, &block)
    grouper = Grouper.new(*fields, &block)
    loop_fiber = Fiber.new do
      each do |result|
        grouper.process_result(result)
      end
    end
    loop_fiber.resume
  end

  class Grouper
    def initialize(*fields, &block)
      @current_group = nil
      @fields = fields
      @block = block
    end
    attr_reader :fields, :block
    attr_accessor :current_group

    def process_result(result)
      group_fiber = get_group_fiber(result)
      group_fiber.resume(result) if group_fiber.alive?
    end

    private

    def get_group_fiber(result)
      group_value = fields.map{|f| result.public_send(f) }
      unless current_group == group_value
        self.current_group = group_value
        create_group_fiber(result, group_value)
      end
      @group_fiber
    end

    def create_group_fiber(result, group_value)
      @group_fiber = Fiber.new do |first_result|
        group = Group.new(group_value)
        block.call(group)
      end
      @group_fiber.resume(nil) # Start the fiber and wait for its first yield
    end
  end

  class Group
    def initialize(value)
      @value = value
    end
    attr_reader :value

    def each(&block)
      while result = Fiber.yield
        block.call(result)
      end
    end
  end
end
Example Usage
#each_group requires input sorted for grouping.
require 'each_group'
require 'ostruct'
Array.send(:include, EachGroup)
array = [
  OpenStruct.new(year: 2014, month: 1, date: 1),
  OpenStruct.new(year: 2014, month: 1, date: 3),
  OpenStruct.new(year: 2014, month: 2, date: 5),
  OpenStruct.new(year: 2014, month: 2, date: 7),
]
array.each_group(:year, :month) do |group|
  puts group.value.inspect
  group.each do |obj|
    puts " #{obj.date}"
  end
end
#=> [2014, 1]
#=> 1
#=> 3
#=> [2014, 2]
#=> 5
#=> 7
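For comparison, when it is acceptable to hold one whole group in memory at a time, Ruby’s built-in Enumerable#chunk gives a similar grouping over sorted input without any fibers. It differs from #each_group in that each group’s members are collected into an array before being handed over, so it does not stream within a group. The `Event` struct below is a stand-in for the OpenStruct objects above.

```ruby
Event = Struct.new(:year, :month, :date)

events = [
  Event.new(2014, 1, 1),
  Event.new(2014, 1, 3),
  Event.new(2014, 2, 5),
  Event.new(2014, 2, 7),
]

# chunk groups consecutive elements that share the same block value
grouped = events.chunk { |event| [event.year, event.month] }

grouped.each do |key, members|
  puts key.inspect
  members.each { |event| puts "  #{event.date}" }
end
```

As with #each_group, the input has to be sorted by the grouping fields, because only consecutive elements are grouped together.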
This code can be used with ActiveRecord as follows:
ActiveRecord::Relation.send(:include, EachGroup)
Model.order('year, month').each_group(:year, :month) do |group|
  group.each do |record|
    # ...
  end
end
I have uploaded a Gist that shows a previous iteration of the EachGroup module using a nested loop, which you may find easier to follow when working out how the fibers control the flow of the loop.
- The above code with an RSpec spec - https://gist.github.com/andrewtimberlake/9462561
- The original code with nested loops - https://gist.github.com/andrewtimberlake/9462561/f0e88cd310614a34693d57c3fc759f5c78e3a264
Thanks for taking the time to read through this. Explaining complicated concepts like Fibers is a challenge, so please leave a comment and let me know if this was helpful or if you still have any questions.
I’m working on an app that creates user accounts and (optionally) subscribes users to our mailing list. Because I’m handling user creation in my app, I need some way to add them to the mailing list, which is hosted on MailChimp. To do this, I am using their API to send through subscriber information.
The documentation for the ruby gem is not great. You have a few choices:
Below is some sample code that will get you started.
Install the mailchimp-api gem
> gem install mailchimp-api
# or
> echo 'gem "mailchimp-api", require: false' >> Gemfile
> bundle install
Get your MailChimp API Key
In MailChimp, go to your account settings page, click Extras and API Keys. If you don’t have an API key yet, click Create A Key.
Get your MailChimp list ID
Every list has a unique ID which is needed to add subscribers to the correct list. Go to Lists, click on your list name, then click Settings and List name & defaults. On the right you’ll see your List ID (a 10 character hex code).
The code
require 'mailchimp' # The gem name is mailchimp-api but you require mailchimp
module MailChimpSubscription
  # These should probably be environment variables or configuration variables
  MAIL_CHIMP_API_KEY = "0000000001234567890_us1"
  MAIL_CHIMP_LIST_ID = "abcdef1234"
  extend self

  def subscribe(user)
    mail_chimp.lists.subscribe(MAIL_CHIMP_LIST_ID,
                               # The email field is a struct that can use an
                               # email address or two MailChimp specific list ids (see API docs)
                               {email: user.email},
                               # Set your merge vars here
                               {'FNAME' => user.first_name, 'LNAME' => user.last_name})
  rescue Mailchimp::ListAlreadySubscribedError
    # Decide what to do if the user is already subscribed
  rescue Mailchimp::ListDoesNotExistError => e
    # This is definitely a problem I want to know about
    raise e
  rescue Mailchimp::Error => e
    # Unforeseen errors that need to be dealt with
  end

  private

  def mail_chimp
    @mail_chimp ||= Mailchimp::API.new(MAIL_CHIMP_API_KEY)
  end
end
To use this module, you pass in a user object that responds to #email, #first_name and #last_name:
require 'ostruct'
user = OpenStruct.new(email: 'test@example.com', first_name: 'John', last_name: 'Doe')
MailChimpSubscription.subscribe(user)
Final thoughts
It’s probably a good idea to put mailing list subscription into a background job so that you don’t slow down your user creation response time. You can also handle transient errors, retry failed attempts etc.
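As a rough idea of what retrying failed attempts could look like, here is a minimal sketch of a retry helper in plain Ruby. `with_retries` is a hypothetical helper, not part of the mailchimp-api gem or of any job framework, and the failing block only simulates a transient error. Inside a real background job the block would wrap a call such as MailChimpSubscription.subscribe(user).

```ruby
# Hypothetical helper: runs the block, retrying when the given error is raised.
# Gives up and re-raises once `attempts` tries have been used.
def with_retries(attempts: 3, base_delay: 0.5, on: StandardError)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue on
    raise if tries >= attempts
    sleep(base_delay * tries) # wait a little longer after each failure
    retry
  end
end

calls = 0
result = with_retries(attempts: 3, base_delay: 0, on: IOError) do |attempt|
  calls += 1
  # Simulated transient failure on the first two attempts
  raise IOError, "simulated network failure" if attempt < 3
  :subscribed
end

puts "#{result} after #{calls} attempts"
```

Most background job libraries provide their own retry mechanism, which is usually a better fit than a hand-rolled loop.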
Installing Middleman
Adding extensions
middleman-blog
middleman-syntax
redcarpet
GitHub source code coloring
wget https://github.com/richleland/pygments-css/raw/master/github.css
def some_code
end
I got a message from a client this morning telling me that all users could see all reports on our product. Not good. I use CanCan to manage permissions and until now it has served me well. What went wrong? Whether it is a bug or not, I discovered that a very recent change I made had opened up the hole.
I wanted to have a permission setting that could prevent anyone from seeing any reports as well as more fine grained control over each individual report. My permissions looked a bit like this:
class Ability
  def initialize(user)
    can :read, Reports
    can :read, Reports::ReportA
  end
end
When checking permissions for another report within the module, I didn’t expect this:
module Reports
  class ReportBController
    def show
      authorize! :read, Reports::ReportB #=> I assumed it would not be authorized but it is
      ...
    end
  end
end
What I didn’t expect is that when you authorise a module, all classes in that namespace are authorised as well.
As I mentioned above, I don’t know if this is by design or not. Some quick googling didn’t help me so I changed my code for a quick solution.
I post this to warn others who may have made the same assumption.
If you’re reading this and know the project better and can point out if it is a bug or feature, please let me know in the comments.