Archive for the ‘Rails’ Category

Rails Unified Application Logging with Log Weaver

February 23rd, 2010

The Problem

After adding our third app server a couple days ago, the appeal of digging through three separate production.log files when things go awry on Bonanzle was officially over.

Like many Rails developers in this situation, I Googled numerous terms in search of a solution, and most of these terms sent me to my good friend Jesse Proudman’s blog on using Syslog-ng to unify Rails logfiles.  So we installed it (well, to be specific, we had Blue Box set it up for us, because it looked complicated), and determined that it was not what we were looking for.  Installation issues aside (of which there were a few), the real killer when using Syslogger with Rails is that you lose the buffered production log output you have come to know and love, leaving your production logfiles a stew of mishmashed lines from numerous Passenger processes in numerous states of processing.  In short, if you get an appreciable amount of traffic (and I’d imagine you do if you’re reading this in the first place), and you are a human being, you will not be able to read an unbuffered Rails log without considerable time and frustration.

The Solution

Exists on Github here.

Since it looked like Syslog was the only game in town currently for merging application server logs, I decided to spend the afternoon writing a plugin that would allow us to take an arbitrary number of production logfiles, from an arbitrary number of hosts, and merge them together into one file without changing the formatting of the production logs or affecting performance on the app servers.

The basic mechanism of this plugin is that it uses rsync to grab your production logs, then it boils those production logs down into hashes of { :time => action_time, :text => action text }.  It then outputs the actions from all of your app servers into a single file in chronological order.
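To make the weaving step concrete, here is a minimal sketch of the idea in plain Ruby. It is not the plugin's actual code: the method names are made up for illustration, it assumes each action begins with a Rails 2.x style "Processing ... (for IP at TIMESTAMP)" line, and it keeps everything in memory, which the real plugin may well avoid.

```ruby
require 'time'

# Assumption: every action starts with a Rails 2.x style "Processing ..." line
# that carries the request timestamp.
ACTION_START = /^Processing .* at (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\)/

# Boil one log's text down into hashes of { :time => ..., :text => ... }.
def parse_actions(log_text)
  actions = []
  log_text.each_line do |line|
    if (match = ACTION_START.match(line))
      actions << { :time => Time.parse(match[1]), :text => line.dup }
    elsif actions.any?
      # Any other line belongs to the action that is currently open.
      actions.last[:text] << line
    end
  end
  actions
end

# Merge the actions from several logs into one chronological string.
# The index is used as a tie-breaker so equal timestamps keep their input order.
def weave(log_texts)
  actions = log_texts.flat_map { |text| parse_actions(text) }
  actions.each_with_index
         .sort_by { |action, index| [action[:time], index] }
         .map { |action, _index| action[:text] }
         .join
end
```

Calling weave with the contents of each app server's production.log returns the unified text, with every action's lines kept together.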

As a bonus, it also lets you specify the maximum size of your unified log file, and handles keeping the logfiles broken into bite-sized chunks, so that you can actually read the output afterwards (rather than ending up with a 5GB log file).  This functionality is built in, and can be configured via an included YML file.

The remainder of this post will just quote from the Github project, which does a pretty fine job of explaining what’s going on:

Log Weaver Functionality
========================
Log Weaver v0.2 sports the following featureset:

* Sync as many production logfiles, from as many hosts as you want, into a single logfile on a single server
* Use your existing Rails production.log, no need to modify how it’s formatted
* Break up your unified log file into bite sized chunks that you specify from a YML file, so you don’t end up with a 10 GB unified logfile that you can’t open or use
* Does not run on or adversely affect performance of app servers. Uses rsync to grab production log files from app servers, then does the “hard work” of combining them on what is presumably a separate server.

Installation
============
Clone the log-weaver github project into your vendor/plugins directory. No models, database tables, or installation is needed for this plugin. Simply edit the /log-weaver/config/system_logs.yml file to specify the settings of your hosts.

Usage
=====
Run “rake log_weaver:weave_logs” to initiate the process of log merging.

When you run this task, Log Weaver will rsync your logfiles from the locations you specified in the YML file. The first time you run Log Weaver, this might take a minute or two, depending on how big your production logs are. On subsequent runs, it should be pretty instantaneous, since rsync is smart about merging files.

After rsyncing your production logs from your app servers, Log Weaver will build the production logs into bite-sized hashes and merge them in chronological order into the file that you specified in your YML file. Limited testing has shown that combining three logfiles that were each 1+ GB used less than 300MB of memory and completed in about 10 minutes.

Run “rake log_weaver:weave_logs” periodically to add to your unified log. When the size of your unified log exceeds the size you specified in the settings YML file, the unified log will be renamed “[log_name]_X.log” where X is the lowest integer of a file that doesn’t exist in your log directory. That is, if you named your file “unified.log,” Log Weaver would move the original log to “unified_2.log” and then open a new “unified.log” file to continue merging your logs.
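The rename rule described above can be sketched roughly as follows. This is an illustration rather than the plugin's code: the method names and the starting number of 1 are assumptions.

```ruby
require 'fileutils'

# Return the first "<base>_<n>.log" path in dir that is not taken yet.
# Assumption: numbering starts at 1; the plugin may start elsewhere.
def next_archive_path(dir, base, start = 1)
  n = start
  n += 1 while File.exist?(File.join(dir, "#{base}_#{n}.log"))
  File.join(dir, "#{base}_#{n}.log")
end

# Move the live log aside once it exceeds max_bytes and start a fresh one.
# Returns the archive path, or nil if no rotation was needed.
def rotate_if_needed(dir, base, max_bytes)
  live = File.join(dir, "#{base}.log")
  return nil unless File.exist?(live) && File.size(live) > max_bytes
  archive = next_archive_path(dir, base)
  FileUtils.mv(live, archive)
  FileUtils.touch(live)
  archive
end
```

With a "unified_1.log" already in the directory, an oversized "unified.log" would be moved to "unified_2.log" and an empty "unified.log" created in its place.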

Future Improvements
===================
Log Weaver was written over the course of a few hours to fit the baseline needs of Bonanzle, so there is surely plenty of room to improve! For starters, this would probably make more sense as a gem than a Rails plugin.

Feel free to fork and add whatever you think it needs and ping me to pull in your improvements and we can make this plugin a worthwhile thing.

Bill Rails

Savage Beast 2.3, a Rails Message Forum Plugin

November 5th, 2009

Savage Beast 2.0 has been the de facto solution for those looking to add a message forum to their existing Rails site, but it was created more than a year ago, and had many aspects that tied it to Rails 2.0. Also, it relied on the Engines plugin, which is not the most lightweight plugin. Although Engines doesn’t seem to affect performance, it did rub some people the wrong way.

After a year’s worth of promises that an update was “coming soon,” an update has finally arrived and is now available at Github.

Detailed instructions on getting it rolling with Rails 2.3 follow.

Installation

Currently, the following is necessary to use the Savage Beast plugin:

  1. The Savage Beast 2.3 plugin. Go to your application’s root directory and:
    script/plugin install git://github.com/wbharding/savage-beast.git
  2. Most of the stuff you need to run Beast…
    • RedCloth: gem install RedCloth. Make sure you add “config.gem 'RedCloth'” inside your environment.rb, so that it gets included.
    • A bunch of plugins (white_list, white_list_formatted_content, acts_as_list, gibberish, will_paginate). If you’re using Rails 2.2 or earlier, you’ll need the Engines plugin, if you’re on Rails 2.3, you don’t need Engines. The easiest way to install these en masse is just to copy the contents of savage_beast/tested_plugins to your standard Rails plugin directory (/vendor/plugins). If you already have versions of these plugins, you can just choose not to overwrite those versions
  3. Go to your application’s root directory and run “rake savage_beast:bootstrap_db” to create the database tables used by Savage Beast.  If it happens you already have tables in your project with the names Savage Beast wants to use, your tables won’t be overwritten (though obviously SB won’t work without its tables).  To see the tables Savage Beast uses, look in lib/tasks/savage_beast.rake in your Savage Beast plugin folder.
  4. Next run “rake savage_beast:bootstrap_assets” to copy Savage Beast stylesheets and images to savage_beast asset subdirectories within your public directory.
  5. Implement in your User model the four methods in plugins/savage_beast/lib/savage_beast/user_init that are marked as "#implement in your user model".
  6. Add the line “include SavageBeast::UserInit” to your User model. Location shouldn’t matter unless you intend to override it.
  7. Add the line “include SavageBeast::ApplicationHelper” to ApplicationHelper within your application_helper.rb file.
  8. Implement versions of the methods in SavageBeast::AuthenticationSystem (located in /plugins/savage_beast/lib) in your application controller if they aren’t already there (note: technically, I believe only “login_required” and “current_user” are necessary; the others give you more functionality). Helpful commenter Adam says that if you have the “helper :all” line in your application controller, be sure to add the “include SavageBeast::AuthenticationSystem” line after that.

If you’re using Rails 2.0-2.2, and thus using the Engines plugin, you’ll need a couple extra steps:

  1. Add this line to the top of your environment.rb, right after the require of boot: require File.join(File.dirname(__FILE__), '../vendor/plugins/engines/boot')
  2. Move the routes.rb file from the “savage-beast/config” directory to the root (“savage-beast”) directory of the plugin. Then add the line “map.from_plugin :savage_beast” to your routes.rb. Location shouldn’t matter unless you intend to override it.

And off you go! When you visit your_site/forums something should happen. I’ve been creating new forums by visiting /forums/new. There’s probably a hidden admin view somewhere.

Implementing Your Own Views and Controllers

Just create a new file in your /controllers or /views directories with the same name as the file you want to override in Savage Beast. If you just want to override a particular method in a controller, you can do that piecemeal if you just leave your XController empty except for the method you wanted to override (Note:  I know this piecemeal method adding works with the Engines plugin installed, but haven’t tested it without).

If you’re integrating this into an existing site, I’d recommend you start by creating a forums layout page (/app/views/layouts/forums.html.erb). This will give you a taste of how easy it is to selectively override files from the plugin.

Demo

You can check out a (slightly-but-not-too-modified) version of Savage Beast online at Bonanzle. The differences between our version and the version checked into Subversion are 1) addition of topic tagging (users can tag topics to get them removed, etc) 2) recent post list shows posts in unique topics, rather than showing posts from the same topic repeatedly (there’s another blog on here about the SQL I used to do that) and 3) skinning. None of those changes feel intrinsic to what SB is “supposed to do,” which is why they aren’t checked in.

Conclusion

Comments are most welcome. I’ll be checking in changes to the project as I find bugs and improvements in using it, but this is admittedly something I don’t have a lot of spare time to closely follow (see my other entries on the wonders of entrepreneurship). Hopefully others can contribute patches as they find time. If you like the plugin, feel free to stop by Agile Development and give it a rating so that others can find it in the future.

Bill Rails

Rails Slave Database Plugin Comparison & Review

October 12th, 2009

Introduction

Based on the skimpy amount of Google results I get when I look for queries relating to Rails slave database (and/or the best rails slave database plugin), I surmise that not many Rails apps grow to the point of needing slave databases.  But we have.  So I’ve been evaluating the various choices intermittently over the last week, and have arrived at the following understanding of the current slave DB ecosystem:

Masochism

Credibility: Was the first viable Rails DB plugin, used to rule the roost for Google search results. The first result for “rails slave database” still points to a Masochism-based approach.

Pros: Once-high usage means that it is the best documented of the Rails slave plugins.  Seems pretty straightforward to initially setup.

Cons: The author himself has admitted (in comments) that the project has fallen into a bit of a state of disrepair, and apparently it doesn’t play nice with Rails 2.2 and higher.  The github lists multiple monkey patches necessary to get it working.  It only appears to work with one slave DB.

master_slave_adapter

Credibility: It’s currently the most watched slave plugin-related project I can find on github (with about 90 followers).  Also got mentioned in Ruby Inside a couple months ago.  Has been updated in last six months.

Pros: Doesn’t use as much monkey patching to reach its goals, therefore theoretically more stable than other solutions as time passes.

Cons: Appears to only handle a connection to one slave DB.  I’m not sure how many sites grow to the point of needing a slave DB, but then expect to stop growing such that they won’t need multiple slave DBs in the future?  Not us.  There’s also less support here than the other choices for limited use of the slave DB.  This one assumes that you’ll want to use the slave for all SELECTs in the entire app, unless you’ve specifically wrapped it in a block that tells it to use the master.

Db Charmer

Credibility:  Used in production by Scribd.com, which has about 4m uniques.  Development is ongoing.  Builds on acts_as_readonlyable, which has been around quite awhile.

Pros:  Seems to strike a nice balance between the multiple database capabilities of SDP and the lightweight implementation of MSA.  Allows one or more slaves to be declared in a given model, or for a model to use a different database entirely (aka db sharding).  Doesn’t require any proprietary database.yml changes.  Didn’t immediately break anything when I installed it.

Cons:  In first hour of usage, it doesn’t work.  It seems to route most of its functionality through a method called #switch_connection_to, and that method doesn’t do anything (including raise an error) when I try to call it.  It just uses our existing production database rather than a slave.  The documentation for this plugin is currently bordering on “non-existent,” although that is not surprising given that the plugin was only released a couple months ago.  Emailed the plugin’s author a week ago to try to get some more details about it and never heard back.

Seamless Database Pool

Credibility:  Highest rated DB plugin on Agile Web Development plugin directory.  Has been updated in last six months.

Pros:  More advertised functionality than any other slave plugin, including failover (if one of your slaves stops working, this plugin will try to use other slaves or your master).  Documentation is comparatively pretty good amongst the slave DB choices, with rdoc available.  Supports multiple slave databases, even allowing weighting of the DBs.  And with the exception of Thinking Sphinx, it has “just worked” since dropping it in.

Cons:  Tried to index Thinking Sphinx and ran into difficulty since this plugin redefines the connection adapter used in database.yml*.  The changes needed to database.yml (which are quite proprietary) make me suspicious that this may also conflict with New Relic (which detects DB plugin in a similar manner to TS).   Would be nice if it provided a way to specify database on a per-model basis, like Db Charmer.  Also, would inspire more confidence if this had a Github project to gauge number of people using this.

Conclusion

Unfortunately, working with multiple slave databases in Rails seems to be one of the “wild west” areas of development.  It’s not uninhabited, but there is no go-to solution that seems ready to drop in and work with Rails 2.2 and above.  For those running Rails 2.2+ and looking to use multiple slaves, Db Charmer and Seamless Database Pool are the two clear frontrunners.  I like the simpler, model-driven style plus lack of database.yml weirdness of Db Charmer.  But I really like the extra functionality of SDP.  At this point, our choice will probably boil down to which one gives us the least hassle to get working, and that appears to be SDP, which worked immediately except for Thinking Sphinx.

I’ll be sure to post updates as I get more familiar with these plugins.  Especially if it looks like there is any intelligent life out there besides me that is attempting to get this working.

Update 10/13:  The more I use SDP, the more I’m getting to like it.  Though I was initially drawn to the Db Charmer model-based approach to databases, I now think that the SDP action-based approach might make more sense.  Rationale:  Most of the time when we’re rendering a page, we’ll be using data from models that are deeply connected, i.e., a user has user_settings and extended_user_info models associated with it.  We could end up in hot water if the user model used a slave, while the user_settings used the master and extended_user_info used a different slave, as would be possible with a model-based slave approach.  SDP abstracts away this by ensuring that every SELECT statement in the action will automatically use the same slave database from within your slave pool.

Also, though I didn’t notice it documented at first, SDP is smart enough to know that even if you marked an action to read from the slave pool, if you happen to call an INSERT/UPDATE/DELETE within the action, it will still use the master.
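To illustrate the action-based idea in isolation, here is a toy model in plain Ruby. It is emphatically not Seamless Database Pool's API or implementation: the class, its methods, and the use of symbols in place of real connections are all invented for the example. It only shows the two behaviors described above, namely one slave pinned per action and writes always going to the master.

```ruby
# Toy model of action-scoped routing; NOT Seamless Database Pool's actual API.
class ActionScopedRouter
  WRITE = /\A\s*(INSERT|UPDATE|DELETE)\b/i

  def initialize(master, slaves)
    @master = master
    @slaves = slaves
    @current_slave = nil
  end

  # Pin one slave for the duration of the block (think: one controller action).
  def with_slave_pool
    @current_slave = @slaves.sample
    yield self
  ensure
    @current_slave = nil
  end

  # Which connection a statement would run on.
  def connection_for(sql)
    return @master if sql =~ WRITE || @current_slave.nil?
    @current_slave
  end
end
```

Inside a with_slave_pool block, every SELECT resolves to the same slave, so user, user_settings and extended_user_info are all read from one database, while an UPDATE in the same block still resolves to the master.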

* Thinking Sphinx will still start/stop with SDP, it just won’t index.  Luckily for us, we are already indexing our TS files on a separate machine, so I’ll just setup the database.yml on the TS building machine to not use SDP, which ought to solve the problem for us.  If you know of a way to get TS to index with SDP installed, please do post to the comments below.

Bill Rails, Rants

New Relic Apdex: The Best Reason So Far to Use New Relic

May 7th, 2009

Since we first signed up with New Relic about six months ago, they’ve impressed me with the constant stream of features that they have added to their software on a monthly basis. When we first signed up, they were a pretty vanilla monitoring solution, and impressed me little more than FiveRuns had previously. They basically let you see longest running actions sorted by average time consumed, and they let you see throughput, but beyond that, there was little reason to get excited at the time.

Since then, they’ve been heaping on great additions. First, they added a new view (requested by yours truly, amongst others) that let actions be sorted not just by the average time taken, but by the arguably more important “Time taken * times called,” which tends to give a better bang-per-buck idea of where optimization time should be spent.

They’ve also been rearranging which features are available at which levels, which has made “Silver” level a much more tempting proposition, with both the “Controller Summary” (described last paragraph) and “Transaction Traces,” which allows you to see which specific database calls are taking longest to complete.

But by far my favorite New Relic feature added is their brand new “Apdex” feature. If you’re a busy web programmer or operator, the last thing you want to do is spend time creating subjective criteria to prioritize which parts of your application should be optimized first. You also don’t want to spend time determining when, exactly, an action has become slow enough that it warrants optimization time. Apdex provides a terrific way to answer both of these prickly, subjective questions, and it does it in typical New Relic fashion — with a very coherent and readable graphical interface.

I’ve included some screenshots of the Apdex for one of our slower actions at right.  These show (from top to bottom) the actions in our application, ordered from most to least “dissatisfying”; the performance breakdown of one of our more dissatisfying actions; and the degree to which this action has been dissatisfying today, broken down by hour, and put onto a color coded scale that ranges from “Excellent” (not dissatisfying) down to “Poor.” Apdex measures “dissatisfaction” as a combination of the number of times that a controller action has been “tolerable” (takes 2-8 seconds to complete) and “frustrating” (takes more than 8 seconds to complete).
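For the curious, the standard Apdex score is simple enough to compute by hand. The sketch below uses the usual formula (satisfied requests count fully, tolerable ones count half, frustrating ones count zero) with the 2 and 8 second thresholds mentioned above; the method name is made up, and how New Relic computes its numbers internally is not something I can vouch for.

```ruby
# Apdex score for a list of response times in seconds.
# t is the "satisfied" threshold; anything over 4 * t counts as frustrating.
def apdex(response_times, t = 2.0)
  return nil if response_times.empty?
  satisfied  = response_times.count { |seconds| seconds <= t }
  tolerating = response_times.count { |seconds| seconds > t && seconds <= 4 * t }
  (satisfied + tolerating / 2.0) / response_times.size
end
```

For example, apdex([1, 1, 3, 9]) has two satisfied, one tolerable and one frustrating request, giving (2 + 0.5) / 4 = 0.625. A score of 1.0 means nobody waited longer than the threshold.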

New Relic is to be commended for tackling an extremely subjective problem (when and where to optimize) and creating a very sensible, objective framework through which to filter that decision.  Bravo, guys.  Now, hopefully after Railsconf they can spend a couple hours running Apdex on their Apdex window, since the rendering time for the window generally falls into their “dissatisfaction” range (greater than 8 seconds) :)  

But I’m more than willing to cut them some slack for an addition this useful (and this new).

Bill Rails

Rails Ajax Image Uploading Made Simple with jQuery

April 15th, 2009

Last week, as part of getting Bloggity rolling with the key features of Wordpress, I realized that we needed to allow the user to upload images without doing a page reload.  Expecting a task as ordinary as this would be well covered by Google, I dutifully set out in search of “rails ajax uploading” and found a bunch of pages that either provided code that simply didn’t work, or claims that it couldn’t be done without a Rails plugin.

Not so.  If you use jQuery and the jQuery-form plugin.

The main challenge in getting AJAX uploading working is that the standard remote_form_for doesn’t understand multipart form submission, so it’s not going to send the file data Rails seeks back with the AJAX request.   That’s where the jQuery form plugin comes into play.  Here’s the Rails code for it:

<% remote_form_for(:image_form, :url => { :controller => "blogs", :action => :create_asset }, :html => { :method => :post, :id => 'uploadForm', :multipart => true }) do |f| %>
 Upload a file: <%= f.file_field :uploaded_data %>
<% end %>

Here’s the associated Javascript:

$('#uploadForm input').change(function(){
 $(this).parent().ajaxSubmit({
  beforeSubmit: function(a,f,o) {
   o.dataType = 'json';
  },
  complete: function(XMLHttpRequest, textStatus) {
   // XMLHttpRequest.responseText will contain the URL of the uploaded image.
   // Put it in an image element you create, or do with it what you will.
   // For example, if you have an image element with id "my_image", then
   //  $('#my_image').attr('src', XMLHttpRequest.responseText);
   // Will set that image tag to display the uploaded image.
  }
 });
});

And here’s the Rails controller action, pretty vanilla:

 @image = Image.new(params[:image_form])
 @image.save
 render :text => @image.public_filename

As you can see, all quite straightforward with the help of jQuery. I’ve been using this for the past few weeks with Bloggity, and it’s worked like a champ.

Bill Rails

Me No Blog Hella Ugly!

April 10th, 2009

Welcome to the 2000’s, self!

I’m ever so excited to be blogging at a blog that not only understands code highlighting, but doesn’t look like it was crafted by a mad scientist with cataracts in 1992. Now it looks more like it was crafted by a mad scientist without cataracts circa 2008 — which is an entirely more accurate representation of the truth.

That’s the good news.

The bad news?  That I don’t have anything meaningful to report in this post.

Maybe I’ll just write some highlighted code instead.

# ---------------------------------------------------------------------------
# options[:except_list]: list of symbols that we will exclude from this copy
# options[:dont_overwrite]: if true, all attributes in to_model that aren't #blank? will be preserved
# options[:save]: if true, to_model is saved after copying
def self.copy_attributes_between_models(from_model, to_model, options = {})
	return unless from_model && to_model
	# Build a new array so the caller's :except_list isn't mutated; :id is never copied
	except_list = (options[:except_list] || []) + [:id]
	to_model.attributes.each do |attr, val|
		to_model[attr] = from_model[attr] unless except_list.index(attr.to_sym) || (options[:dont_overwrite] && !to_model[attr].blank?)
	end
	to_model.save if options[:save]
	to_model
end

Hey hey hey code, you’re looking quite sexy this evening — you come around here often?

Bill Rails

Rails Blog Plugin Bloggity v. 0.5 - Now Available for Consumption

April 8th, 2009

Made another pass at incorporating my newer changes to bloggity this evening.  Now in the trunk:

  • FCKEditor used to write blog posts (=WYSIWYG, Wordpress-like text area)
  • Images can be uploaded (via AJAX) while creating a blog post.  You can then link to them via the aforementioned FCKEditor
  • Added scaffolding for blog categories, and allowing categories to have a “group_id” specified, so you could maintain different sets of blogs on your site (i.e., main blog, CEO blog, user blogs, etc.  Each would draw from categories that had a different group_id)
  • Blog comments can be edited by commenter
  • Blog commenting can be locked
  • Blog comments can be deleted by blog writer

With new features come new dependencies, but most of these are hopefully common enough that you’ll already have them:

  • attachment_fu (if you want to save images)
  • jquery and jquery-form plugin (if you want to upload images via AJAX.  The jquery-form plugin is bundled in the bloggity source code)
  • FCKEditor (if you want to use a WYSIWYG editor)

If you’re already running bloggity, you can update your DB tables by running the migration under /vendor/plugins/bloggity/db/migrations.  If not, you can just follow the instructions in the previous bloggity post and you should be good to go.

I’m hoping in the next week to do some more testing of these new features and add a README to the repository, but it’s too late for such niceties this evening.

P.S.  Allow me to pre-emptively answer why it’s in Google’s SVN instead of Github.

Bill Rails

Rails Fix Slow Loads in Development when Images Missing

February 26th, 2009

I have found it useful to populate my local development database with data from our production server in order to be able to get good test coverage.  However, a perpetual problem I’ve had with this approach is that it introduces an environment where sometimes images are available and sometimes they aren’t (the database knows about all the images, but some were uploaded locally, some reside on our main servers, and some are on S3).

What I’ve found is that even though Rails doesn’t give exceptions when it finds missing images, it does start to get painfully slow.  Each missing image it has to process usually takes about 2 seconds.  On pages with 5-10 missing images, the wait could be quite painful.

So I finally got fed up yesterday and wrote a hacky patch to get around this problem.  Here it is:

def self.force_image_exists(image_location)
 default_image = "/images/dumpster.gif"
 if(image_location && (image_location.index("http") || File.exist?(RAILS_ROOT + "/public" + image_location.gsub(/\?.*/, ''))))
  image_location
 else
  default_image
 end
end

This function is part of a utility class (named “UtilityGeneral”) that we use for various miscellaneous tasks.  I call this method from a simple mixin:

if RAILS_ENV == 'development'
 module ActionView
  module Helpers #:nodoc:
   module AssetTagHelper
    # replace image tag
    def path_to_image(source)
     original_tag = ImageTag.new(self, @controller, source).public_path
     UtilityGeneral.force_image_exists(original_tag)
    end
   end
  end
 end
end

If anyone else works locally with images that may or may not exist, this wee patch should come in handy to save you from load times of doom on pages that are missing images.  It just subs in an alternate image when the real image doesn’t exist locally.

P.S. When I grow up, I want a blog about coding that lets me paste code.
P.P.S. 4/10: I grew up!

Bill Rails

Monitor Phusion Passenger Memory Usage

February 10th, 2009

We are on the cusp of having Passenger running, but I am paranoid, based on our Mongrel experiences, of Passenger instances leaking memory up the wazoo and eventually exhausting our system resources.  With Mongrel, we’ve used monit to ensure that memory usage remains intact with each Mongrel, but I hadn’t found a straightforward way to do the same with Phusion yet.  So I’m improvising:

kill $(passenger-memory-stats | grep '[56789]..\.. MB.*Rails' | awk '{ print $1 }')

This single line (run via crontab) ought to do what our thousand line monit config file used to do:  kill off Rails processes that exceed 500 MB.  From my testing so far, it seems to do the trick.

I have verified that it does indeed kill one or multiple Rails processes started by Passenger if their memory usage is reported as being a three digit number that starts with 5-9.  Obviously if a Rails instance were able to jump past the 500-999 MB range in less time than the frequency of our cron task, that would be a problem.
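One way to close that gap is to compare the reported sizes numerically instead of by digit pattern. The sketch below is only an illustration of that idea, under the same assumptions the one-liner makes about the output of passenger-memory-stats (the PID is the first field, sizes are printed as "NNN.N MB", and the process name contains "Rails"); the method name is made up.

```ruby
# Return PIDs of Rails processes whose reported memory is at or above limit_mb.
# Assumption: each process line starts with the PID, reports sizes as "<number> MB",
# and contains "Rails" in the process name, as the grep/awk one-liner implies.
def pids_over_limit(stats_output, limit_mb = 500.0)
  stats_output.each_line.select { |line| line =~ /Rails/ }.map do |line|
    sizes = line.scan(/(\d+(?:\.\d+)?) MB/).flatten.map(&:to_f)
    pid = line[/\A\s*(\d+)/, 1]
    # Like the grep, flag the line if any size figure on it crosses the limit.
    pid.to_i if pid && sizes.any? { |mb| mb >= limit_mb }
  end.compact
end
```

Feeding it the output of passenger-memory-stats and passing each returned PID to Process.kill would then also catch a process that has already ballooned past 999 MB.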

Will report back once I’ve witnessed it at work in the wild.

Update from the wild: Yes, it works.

Bill Rails

Nginx “24: Too many open files” error with Rails? Here’s why.

February 6th, 2009

We had been racking our brains on this one for a couple weeks.  We have monit looking over our Mongrels, which usually keeps everything on the up and up.  But every so often, our server would go bananas and the nginx error log would flood with the message:

939#0: accept() failed (24: Too many open files) while accepting new connection

Usually the problem automatically resolved itself, but last night it didn’t.  Taking the error at face value, our server guy started looking at the number of open files on our system and the maximum files that could be opened (it’s confusing…  “ulimit -a” reports one limit while “cat /proc/sys/fs/file-max” reports another.  The former is the per-process limit for the current shell, while the latter is the system-wide cap on file handles; both count open sockets as well as regular files).  But even after upping the limit and rebooting repeatedly, the problem persisted.

After server guy (literally) fell asleep on the keyboard around 2 AM, I figured out what had really been happening: any time a new visitor came to our site, we were geocoding their IP with a service that had gone AWOL.  About a week earlier I’d noticed a similar slowdown of about 1-2 seconds with actions that created sessions, but I assumed it was the session creation itself that was causing the slowdown, when in fact it was the geocoding that happened alongside the session creation that was responsible for the lag.

Long story short, when nginx gives this error, what it really seems to mean is that it is holding too many open connections, and usually that is happening because you are using round robin dispatching (bad, I know, but we have our reasons) and one or more of the Mongrels is stuck and forcing the Mongrel queue to skyrocket.

The other lesson here is an obvious one that I’ve read many times before but have been slow to actually act on:  making remote API calls without timeouts is asking for trouble.   Here is a fine article if you’re interested in solving that problem in your own site before it is your ruin.
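As a minimal sketch of the idea (not the approach from the linked article, and with a made-up helper name), Ruby's standard Timeout module can put a hard ceiling on any slow call and hand back a fallback value instead:

```ruby
require 'timeout'

# Run the block, but give up after `seconds` and return `fallback` instead.
def with_timeout(seconds, fallback = nil)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  fallback
end
```

Wrapping the geocoding lookup in something like with_timeout(2, nil) would have let sessions be created without a location rather than tying up a Mongrel. When the call goes through Net::HTTP, setting its open_timeout and read_timeout is usually the more targeted fix.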

Bill Rails