05/11/13

Ruby slice to end of an array

It’s popular enough to be a Google-suggested search, but not popular enough to have a good result yet.

If you want to slice to the end of a Ruby array, and/or get the end of a Ruby array, what you want is

arr[1..-1] # range from index 1 through -1 (the last element), i.e. all the rest of the array
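
A quick sketch of the usual ways to take the tail of an array, and of why a start/length pair with a negative length does not do the job. The array contents here are made up for illustration:

```ruby
arr = [10, 20, 30, 40]

rest      = arr[1..-1]  # range form: index 1 through the last element
also_rest = arr.drop(1) # same result, skipping the first element
last_two  = arr.last(2) # just the final two elements

# The two-argument form is (start, length); a negative length yields nil.
not_a_slice = arr[1, -1]

p rest        # [20, 30, 40]
p last_two    # [30, 40]
p not_a_slice # nil
```
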
02/21/13

Simplest means for AJAX upload with Rails Carrierwave and jQuery

The time has finally come for a follow-up to my post from a couple years ago on using jQuery, attachment_fu, and Rails 2.3 to upload an asset to my blog. I wanted to share the updated version of my attempt to determine the absolute minimal code necessary to implement AJAX uploads on Rails 3 with Carrierwave.  As was the case a few years ago, the Google results still tend to suck when searching for a simple means to accomplish an AJAX upload with Rails — the most popular result I could find this evening was a Stackoverflow post that detailed 9 (ick) steps, including adding a gem to the project and creating new middleware.  No thanks!

The Javascript from my previous example is essentially unchanged.  It uses jQuery and the jQuery-form plugin. The main challenge in getting an AJAX upload working is that form_for :remote doesn’t understand multipart form submission, so it’s not going to send the file data Rails is looking for along with the AJAX request. That’s where the jQuery form plugin comes into play. Following is the Rails code that goes in your html.erb. Remember that in my case I am creating an image that will be associated with a model “BlogPost” that is handled by the BlogPostsController. Adapt for your models/controllers accordingly:

<%= form_for(:image_form, :url => {:controller => :blog_posts, :action => :create_asset}, :remote => true, :html => {:method => :post, :id => 'upload_form', :multipart => true}) do |f| %>
 Upload a file: <%= f.file_field :uploaded_data %>
<% end %>

Here’s the associated Javascript:

$('#upload_form input').change(function(){
 $(this).closest('form').ajaxSubmit({
  beforeSubmit: function(a,f,o) {
   o.dataType = 'json';
  },
  complete: function(XMLHttpRequest, textStatus) {
   // XMLHttpRequest.responseText will contain the URL of the uploaded image.
   // Put it in an image element you create, or do with it what you will.
   // For example, if you have an image element with id "my_image", then
   //  $('#my_image').attr('src', XMLHttpRequest.responseText);
   // will set that image tag to display the uploaded image.
  }
 });
});

Now, chances are you’re uploading this asset from a #new action, which means that the resource (here, the BlogPost) that will be associated with the image has yet to be created. That means we’re going to need a model that we can stick the AJAX-created image in until such time that the main resource has been created. We can do this if we create a migration for a new BlogImage model like so:

def self.up
  create_table :blog_images do |t|
    t.string :image
  end
  add_column :blog_posts, :blog_image_id, :integer # once created, we'll want to reference the BlogImage we created beforehand via AJAX
end

The corresponding BlogImage model would then be:

class BlogImage < ActiveRecord::Base
  mount_uploader :image, BlogImageUploader
end

Of course, if your resource already exists at the time the AJAX upload will happen, then you’re on easy street. In that case, you don’t have to create a separate model like BlogImage, you can just add a column to your main resource (BlogPost) and mount the uploader directly to BlogPost. In either case, the BlogImageUploader class would be setup with whatever options you want, per the Carrierwave documentation.

Continuing under the assumption that you will separate your model from the main resource (in this case, the BlogImage, which is separate from the BlogPost), you can create this image before the BlogPost exists and stash the BlogImage id however you see fit. Thus, your controller’s #create_asset method will look like:

def create_asset
  blog_image = BlogImage.new
  blog_image.image = params[:image_form][:uploaded_data] # key matches form_for(:image_form) in the view
  blog_image.save!
 
  # TODO: store blog_image.id in session OR pass ID back to form for storage in a hidden field
  # OR if your main resource already exists, mount the uploader to it directly and go sip on a 
  # pina colada instead of worrying about this
 
  render :text => blog_image.image.url
end

And that’s it. No new gems, plus low-fat Javascript and controller additions.

Bonus section: How to embed this AJAX upload form in the form for its parent resource

One of the more common questions from my last post was how to display this AJAX image upload amongst the form for another resource. There are many ways to accomplish this (see comments from last post if you’ve got time to kill), but in keeping with the spirit of simplicity in this post, one fast hack I’ve used:

  1. After all the form fields for the main resource, close the form without a submit button
  2. Insert the AJAX form
  3. Add a link below the AJAX form that is styled to look like a button. Have this link call Javascript to submit your main form

Not going to win any beauty contests, but easy to setup and gets the job done.

10/17/12

Rails Exception Handling and Notification with Errbit

Bonanza has travelled a long road when it comes to trying out exception handling solutions. In the dark & early days we went simple with the Exception Notification plugin. The drawbacks of it were many, starting with the spam that it would spew forth when our site went into an error state and we’d end up with thousands of emails. There was also no tracking of exceptions, which made it very difficult to get a sense for which exception was happening how often.

Eventually we moved to HopToad (now Airbrake). It was better, but lacked key functionality like being able to close exceptions en masse or leave comments on an exception.

From there we moved to Exceptional, which we ended up using for the past year.  It was alright, when it worked.  The problem was, for us, it frequently didn’t work.  Most recently, we spent the last week having received two exceptions reported by Exceptional, when New Relic clearly showed that hundreds of exceptions had happened over that time period.  Also damning was the presentation of backtraces, which were hard to read (when present), as well as an error index page that made it difficult to discern what the errors were until they were clicked through.

Enter Errbit.  Jordan found this yesterday as we evaluated what to do about the lack of exceptions we were receiving from Exceptional.  Within a couple hours, he had gotten Errbit setup for us, and suddenly we were treated to hundreds of new exceptions that Exceptional had silently swallowed from our app over the past year.

But it’s not just that Errbit does what it is supposed to — it’s the bells and whistles it does it with.

Specifically, a handful of the features that make Errbit such a great solution:

  • Can set it up to email at intervals (e.g., 1st exception, 100th exception) so you hear about an exception when it first happens, and get reminded about it again later if it continues to be a repeat offender
  • Allows exceptions to be merged (or batch merged) when similar
  • Allows comments by developers on exceptions, and shows those comments from the main index page so you can quickly see if an exception is being worked on without needing to click through to it
  • Easy to read backtrace, plus automagic tie-in to Github, where you can actually click on the backtrace and see the offending code from within Github (holy jeez!)
  • Liberal use of spacing and HTML/CSS to make it much easier to read session, backtrace, etc relative to Exceptional and other solutions we’ve used
  • Open source, so you can add whatever functionality you desire rather than waiting for a third party to get around to it (a fact we’ve already made use of repeatedly in our first two days)
  • Open source, so the price is right (free)

Simply put, if you’re running a medium-to-large Rails app and you’re not using Errbit, you’re probably using the wrong solution.  Detailed installation instructions exist on the project’s Github home.

05/29/12

Fix: IRB pasting is super slow, typing in ruby debugger has lag

After numerous hours spent trying unsuccessfully to fix this problem by following the instructions outlined on a few StackOverflow posts, Jordan presented a recipe for fixing this problem (that actually fixed the problem) today.

The essence of the issue is that the readline package that gets installed with REE is by default some bastard version that lags, at least on our Ubuntu and Mint installations. Installing the rvm readline package did not fix it for either of us, nor did an assortment of experiments on compiling REE with different options. Here’s what did:

$> sudo apt-get remove libreadline6-dev
$> sudo apt-get install libreadline-gplv2-dev
$> rvm remove ree
$> rvm install ree

One problem you may encounter is that if you’re avoiding newer versions of Ubuntu until they admit defeat about Unity, the “libreadline-gplv2-dev” package is by default only present in Oneiric and above. Here’s where I found the package versions that worked with Maverick: https://launchpad.net/~dns/+archive/test0/+sourcepub/2252776/+listing-archive-extra. After downloading the packages from this link, the install sequence became

$> sudo apt-get remove libreadline6-dev libreadline5
$> sudo dpkg -i libreadline5_5.2-9~maverick0_amd64.deb
$> sudo dpkg -i libreadline-gplv2-dev_5.2-9~maverick0_amd64.deb
$> rvm remove ree
$> rvm install ree

05/9/12

Create a Custom Thumbnail Processor in Carrierwave

The latest in my “shouldn’t this be better covered in Google and docs if people really use CarrierWave?” series. Creating a custom thumbnail processor in Carrierwave is pretty straightforward, but not from any search query I could construct. The gist:

class MyUploader < CarrierWave::Uploader::Base
  version :custom_thumbnail do
    process :some_fancy_processing
  end
 
  def some_fancy_processing
    # Here our context is the CarrierWave::Uploader object, so we have its full
    # assortment of methods at our disposal.  Let's say we want to open an image with
    # openCV and smooth it out.  That would look something like:
    cv_image = OpenCV::CvMat.load *([ self.full_cache_path ]) # See past blog post on the origins of full_cache_path
    cv_image.smooth(5,5)
 
    # The key to telling CW what data this thumb should use is to save our output to
    # the current_path of the Uploader, a la
    cv_image.save_image(self.current_path)
  end
end

Just call #custom_thumbnail.url on your Uploader instance, and you should get the path to the custom result you created.

Using this framework you should be able to get CW to perform whatever sort of custom image processing you want. Thanks to these fellows for helping me decode the #current_path magic here.

05/9/12

Howto: Store a CarrierWave File Locally After Uploading to S3

Have now needed to do this twice, and both times have required about an hour of sifting through Google morass to figure out how to pull this off. The situation we’re assuming here is that you have some CarrierWave file that you’ve stored to a remote location, and you want to get a copy of that file locally to manipulate it. With promising method names like “retrieve_from_cache!” and “cache!” and “move_to_cache” you too may become entangled in the maze of what the hell you’re supposed to be calling to accomplish this. Here’s what.

Step 1: Retrieve from your store (the external place your file exists) to cache (your local machine).

my_cw_uploader.cache_stored_file!

After running that, if you call

my_cw_uploader.cached?

You should get a long string that is the filename for the file that got cached. Note that the cache status of a file is only saved on a per-object basis, so if you try something like

Image.find(1).my_cw_uploader.cache_stored_file!

… you will never again see the object that did the caching, and you will never be able to use that cache. Only call #cache_stored_file! on an object you are keeping around.

Step 2: Go access the file that you cached. Not quite as easy as it sounds. For this, I mixed in the following into my uploader class

module CarrierWave::Uploader::Cache
	def full_cache_path
		"#{::Rails.root}/public/#{cache_dir}/#{cache_name}"
	end
end

So now when you call

my_cw_uploader.full_cache_path

You’ll get the full pathname to the file you downloaded.

Hope this helps save someone else from an hour of Google hell.

07/10/11

Ruby debugger list local variables

Short and sweet. You want to list local variables in ruby-debug? Try

info locals

You can also get ruby-debug to list your stack, instance variables, and much more. Type plain old “info” into the debugger to see a full list of what ruby-debug is willing to reveal to you.
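
If you are not sitting in a debugger session, plain Ruby can produce a similar listing through Binding. A minimal sketch, where the method name and its arguments are made up for illustration:

```ruby
# Returns a hash of local variable names and values visible inside the method.
def inspect_locals(price, quantity)
  total = price * quantity

  b = binding
  # local_variables lists every local in scope, including the helpers
  # declared here, so leave those out of the report.
  names = b.local_variables - [:b, :names]
  names.map { |name| [name, b.local_variable_get(name)] }.to_h
end

result = inspect_locals(3, 4)
p result
```

Binding#local_variable_get needs Ruby 2.1 or newer, which is later than the Ruby versions this post was written against.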

04/22/11

EC2 vs. Heroku vs. Blue Box Group for Rails Hosting

Not to kick a company when it’s (literally) down, but today’s 12+ hour EC2 outage has finally driven me to write a blog post I’ve been holding in my head for several months now: comparing a few of the major players within today’s Rails hosting ecosystem. In this post, I’ll compare and contrast EC2, Heroku, and Blue Box Group. I’ve chosen these three not only because of their popularity, but also because I believe each has a value proposition distinct from the other two, which makes each ideally suited for different types of customers. In the interests of full disclosure, we are a current Blue Box Group customer, but we have spent a great deal of time looking at our choices, and I think that all have their advantages in specific situations. The facts and opinions below are the result of weighing data gleaned from Googling, popular opinion, and three years running a Rails site that has grown to serve more than 2m uniques per month.

Heroku vs. EC2 vs. Blue Box Group, Round 1: Price

Let’s start by establishing a pricing context. We’ll use a dedicated (not-shared resource/virtual) 8 GB, 8 core server with 1TB bandwidth as our baseline, since it is available across all three services (with some caveats, mentioned below).

                                     Heroku     EC2 (High-CPU XL)    BBG
Dedicated 8-Core Server with 8 GB    $800.00    $490.00              $799.00*
1 TB Bandwidth                       $0.00      $125.00              $0.00
Total                                $800.00    $615.00-695.00**     $799.00

* In the case of BBG, their most current prices aren’t on their pricing page. They should fix that.

** This doesn’t include I/O costs for Amazon EBS. While these are fairly impossible to predict (varying greatly from app to app), it sounds from Amazon like you’d be talking about something more than $40 for this. Given that we’re comparing a “high end machine” here, perhaps $80 might be a more accurate estimate, which would make the price more like $700.
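
For anyone who wants to plug in their own numbers, here is the arithmetic behind the EC2 total as a small script. The server and bandwidth prices are the ones from the table; the EBS I/O figures are only the two guesses from the footnote, not measured costs:

```ruby
# Monthly prices from the comparison table (USD).
HEROKU_TOTAL  = 800.00
EC2_SERVER    = 490.00
EC2_BANDWIDTH = 125.00

# Assumption: EBS I/O lands between the two guesses in the footnote.
EBS_IO_LOW  = 40.00
EBS_IO_HIGH = 80.00

ec2_base = EC2_SERVER + EC2_BANDWIDTH # before any EBS I/O charges
ec2_low  = ec2_base + EBS_IO_LOW
ec2_high = ec2_base + EBS_IO_HIGH

# Percentage saved relative to a pricier monthly total.
def savings_percent(cheaper, pricier)
  ((1 - cheaper / pricier) * 100).round(1)
end

puts format("EC2 before EBS I/O: $%.2f", ec2_base)
puts format("EC2 with EBS I/O:   $%.2f to $%.2f", ec2_low, ec2_high)
puts format("Saved vs. Heroku at the high estimate: %.1f%%",
            savings_percent(ec2_high, HEROKU_TOTAL))
```
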

Various minor approximations were made to try to get this as close to apples-to-apples as possible, but the biggest caveat is that the Heroku instance (Ika) only has about 25% of the CPU of the EC2 and BBG instances, though it has the same amount of memory (they don’t configure their DBs with comparable CPU muscle). The next highest Heroku instance (Zilla) is $1600 per month, and more comparable to the other two in terms of CPU, but has twice as much memory as they do. Note that EC2 and BBG may offer discounts when committing to a year of service. I couldn’t find a comparable offer from Heroku, which is not to say that it doesn’t exist (readers?). These discounts typically range from 10-25% off the no-commitment price.

Heroku vs. EC2 vs. Blue Box Group, Round 2: Setup

Heroku is ridiculously easy to get started with, the runaway winner of the bunch when it comes to hitting the ground running with zero red tape. Per their homepage, all you do is run a couple rake commands and you’re in business. Even cooler, they offer a vast and useful collection of add-ons to make it easy to get started on whatever the specific thing is that your app is supposed to do.

Setting up Rails with EC2 is not quite the same walk in the park, but it’s not necessarily bad. Amazon handles configuring the OS for you, so in terms of getting your app server setup, you are essentially just getting Ruby and Rubygems installed, and letting Bundler take care of the rest. If you managed to set up your development environment in Linux or on a Mac, chances are you won’t have too much trouble using packages to fill in the gaps for other non-Ruby elements to your application (like Sphinx). When EC2 gets trickier is when you start figuring out how to integrate EBS (Amazon Elastic Block storage, necessary for data that you don’t want to disappear) and the other 20 Amazon web services that you may or may not want/need to use to run your app. It can ultimately amount to quite a research project to figure out which tools you want, which ones you need, and how to tie them all together. That said, you may end up using these tools (S3 in particular) even if you use BBG or Heroku, so that cost is not entirely unique to using EC2.

Ease of getting started on Blue Box is somewhere in between EC2 and Heroku. There is no high-tech set of tools that automatically build stuff like Heroku, but unlike EC2, you have a friendly and qualified team willing to help you get your server setup in the best possible way. In my experience, when they have setup new servers, they will ask in advance how we plan to use the server, and then automatically handle getting all of the baseline stuff we’ll need installed such that we can just focus on deploying our app. Which brings me to Round 3…

Heroku vs. EC2 vs. Blue Box Group, Round 3: What’s Best Suited for What?

For pet projects, small sites, or newly started sites, I think that hosting with Heroku is a no-brainer. You can be up and running immediately, you get a huge variety of conveniences with their add-ons, and there is a wealth of Google help behind you if you should happen to encounter any trouble, given the immense user base Heroku has managed to establish. All three of these services can scale up available hardware within hours/minutes-not-days (yay for clouds!), but Heroku is probably the most straightforward to rapidly grow an application with their “Dynos” approach. However, given that they are the most expensive of the choices, and that their application-specific support is lower than BBG’s, the significance of those advantages will erode as your application grows into the tens of thousands of monthly visitors.

I believe that EC2’s greatest selling point is its price, with its scalability and ubiquity (= generally good Googlability) being close seconds. As detailed above, on balance, EC2 tends to run 20-30% cheaper than other choices by leveraging Amazon’s immense scale. Nifty features like auto-scale have the promise of making instant growth possible if you get flooded after your Oprah appearance. The trade-off for those advantages is that you will get zero application-specific support, and even getting generic system-wide support can be hit-and-miss, as folks who suffered through today’s EC2 outage learned firsthand. Transparency is not Amazon’s strong suit at this point in their evolution, which can be a real problem if you have real customers who depend on your product and want to know when they can expect to see your website again during an outage. Also, as mentioned in the setup section, figuring out your way around the Amazon product ecosystem can be daunting at first.

I would consider EC2 the best choice for intermediate-sized businesses, particularly if 100% uptime is not imperative to their existence. EC2 is a great option for bootstrapped startups who want to get online as cheaply as possible, and are willing to put in the extra work setting up their servers in exchange for those cost savings. Also, since you will probably be unclear about what kind of resources your app is going to consume as it scales, EC2 is a great proving ground to get a sense for what kind of resources you might need if you decide to venture beyond Amazon for improved reliability and service. I would also take a long look at EC2 for huge businesses that can afford their own IT department, which diminishes the significance of EC2’s lack of application-specific support or monitoring.

While their prices are competitive with EC2, I would assert that the real differentiator with Blue Box is their focus on service. Included by default for business-sized BBG customers is 24/7 pro-active human monitoring of all services, including the ability to bring servers back online if they should happen to crash and you’re not around. Having gone through a fair number of web hosts in our day, we have come to realize that, once you are signed up at a given host, it can be a huge pain to change. Most hosts use this knowledge to their advantage, and after a very romantic honeymoon period, become inattentive to their customers’ needs once it becomes clear the customer would be hard-pressed to move.

At Blue Box, their customer-focused attitude has not diminished a bit over time. We still regularly find them answering our questions…on the weekend…within minutes of the question being sent. Equally important, Jesse Proudman (BBG CEO) has built a team around him that gives the customer the benefit of the doubt. In more than a year of being hosted at BBG, I can not ever remember them “blaming” us for server changes we’ve made that have caused havoc (not the case at some of our past hosts). Instead, BBG has a solution-focused team that is consistently personable, reasonable, and most importantly, effective when it comes to solving tricky application and server problems.

While BBG offers small VPS instances, as well as cloud servers that can quickly scale, I consider their sweet spot to be businesses that have grown beyond the point of being able to easily maintain their server cluster themselves, but they don’t want to hire an on-staff IT guy. Or maybe they do have an IT guy, but they really need two. Over the past couple years, BBG has been our “IT guy,” working to implement systems for us ranging from a Mediawiki server, to load balancers, to Mysql master-master failover clusters. And compared to having a real IT guy on staff, the price is a huge bargain (not to mention the savings on health insurance, taxes, etc.)

Another nice benefit for those that have been stung by EC2/Heroku uptime hiccups: in 20 months with BBG, our total system downtime has been something between 1-2 hours (excluding downtime caused by our mistakes).

Conclusion

The best host for a particular Rails app depends on a number of factors, including phase of development (pre-launch? newly launched? rapidly growing? already huge?), need for 100% uptime, makeup of team, and cash available. Hopefully this post will be helpful to someone trying to figure out which host makes most sense for their unique situation. Please do post any questions or anecdotes from your hosting experience for future Google visitors.

Update: Response from BBG

Blue Box emailed me after this post, with a few extra details that I believe are pertinent:

  • 4 x 300GB 15k SAS outperforms EBS (which both Heroku and Amazon rely on) by almost 30% based on our client benchmarks.
  • Neither Heroku nor Amazon provide 24 x 7 monitoring with proactive human response. This can be a *key* differentiator, particularly when comparing costs.
  • All of our dedicated database instances are running bare metal, meaning you gain consistent and reliable performance, and aren’t subject to similar massive outages caused by the multi-tenancy of a SAN.

If anyone from Amazon or Heroku would like to provide extra details of what makes them a strong choice, I’d be only too happy to post those as well.

04/10/11

Nokogiri Recursively Find Children/Child from a Parent

Contrary to the pages of complex hand-written recursive methods I found on StackOverflow when Googling this, it is actually as simple as

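  # Nokogiri::XML expects XML content (a string or IO), not a filename, so read the file first
  noko = Nokogiri::XML(File.read("my_noko_file.xml"))
  parent_node = noko.root.xpath("//MyNodeName")
  children_named_floyd = parent_node.xpath(".//Floyd")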

If you want to search on more complex criteria, you can also add in extra sauce to your xpath.

  noko = Nokogiri::XML(File.read("my_noko_file.xml"))
  # Searches your entire XML tree for an XML node type "MyNodeName" that has an attribute "id" set to a value of '1234'
  # Then grabs the XML node of type "Something" from within the found NodeSet
  parent_node = noko.root.xpath("//MyNodeName[@id='1234']").at("Something")
  # Grab all descendants of the "Something" node that are of type "Floyd"
  children_named_floyd = parent_node.xpath(".//Floyd")

Nokogiri is a great gem. But I do often wish its docs had more examples and fewer byzantine explanations for common operations like these. In the meantime, let’s hope Google will continue to fill in the gaps.

03/19/11

Rails 3 Slave Databases: Compare Octopus to SDP

In the crazy wild days of Rails 2.x…

In the pre-Rails 3 ecosystem, there were a number of confusingly similar choices for getting master/slave database functionality established. These options included Masochism, DB Charmer, master_slave_adapter, and seamless_database_pool, amongst others. When it came time for Bonanza to make its choice on which slave plugin to use, I made my best effort to assess the velocity and functionality of each of the prominent slave database solutions, and wrote what went on to become a fairly popular post comparing the relative strengths of each choice.

Octopus

Fast forward to Rails 3, and the field has narrowed considerably. Most all of the top Google results for Rails slave database options these days point to Octopus, and with good reason. Its documentation is sound, and its github project has maintained good velocity for the better part of the past year. Reading between the lines of the Octopus documentation, it would seem that it was built first and foremost as a tool to make it stupidly easy to shard databases; secondarily, it also supports using slave databases in a non-sharding format, but the implementation here gets a little more sketchy, as the examples show users needing to explicitly declare a given slave database for a particular query. In the documentation, this is done at query time, e.g.,

User.where(:name => "Thiago").limit(3).using(:slave_one)

or

Octopus.using(:slave_two) do
  User.create(:name => "Mike")
end

Seamless Database Pool

Upon learning about Octopus, my natural inclination was to compare it to our current solution, seamless_database_pool. Admittedly, when we got to the Rails 3 party, SDP was running a bit behind. The author had been kind enough to do much of the legwork to get it compliant with AR3, but we still encountered errors when actually trying to use the plugin within controllers and views the way we had been able to with the previous version.

So I fixed it.

What Seamless Database Pool now represents is a slave database plugin that is specifically built with the purpose of making it as easy as possible to A) connect to one or more weighted slave databases B) declare whether a particular Rails action should attempt to use slaves, masters or both (automatically defaulting to the master when write operations occur) and C) gracefully handle failover if one or more of the slave databases declared should become unavailable for whatever reason.

SDP does not have any built in support for sharding, so if that is what your DB needs, Octopus is your best bet. But if what you need is specifically a Rails 3 supported solution that will allow you to mix and match your main database and N number of slaves, in a weighted way and with failover automatically baked in, this is where seamless_database_pool really shines.

Bonanza has been using SDP in production for more than a year now, and in the meantime we have experienced failures of our slave database every few months, which at one point would have brought down the entire site. Now, within seconds, Rails figures out that it needs to re-route requests and finds a database it can use that is still available. The still-good SDP documentation describes how to make it happen.
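
To make the “weighted slaves with failover” idea concrete, here is a toy sketch in plain Ruby. It is not SDP’s actual implementation: the Replica struct, the ReadPool class, and the third host are all hypothetical, and real connection handling is left out entirely.

```ruby
# Hypothetical stand-in for a pooled connection; not an SDP class.
Replica = Struct.new(:host, :weight, :available)

class ReadPool
  def initialize(replicas, fallback, rng: Random.new)
    @replicas = replicas
    @fallback = fallback # the master, used when no slave can serve reads
    @rng = rng
  end

  # Pick a live replica with probability proportional to its weight,
  # falling back to the master when none are usable.
  def pick
    live = @replicas.select { |r| r.available && r.weight > 0 }
    return @fallback if live.empty?

    target = @rng.rand * live.sum(&:weight)
    live.each do |replica|
      target -= replica.weight
      return replica if target < 0
    end
    live.last # guards against floating-point edge cases
  end
end

master = Replica.new("1.2.3.4", 0, true)
slaves = [Replica.new("2.3.4.5", 1, true), Replica.new("2.3.4.6", 2, true)]
pool   = ReadPool.new(slaves, master)

slaves[0].available = false # simulate one slave going down
puts pool.pick.host         # reads now go to the remaining slave
```

With both slaves healthy, the second one would receive roughly two thirds of the reads; with every slave down, pick hands back the master so reads keep working.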

Bottom line

Prior to writing this post, if you Googled master/slave database options for Rails 3 you would probably come away thinking there was only one solution, and that solution was only secondarily focused on allowing N slaves to be configured. I may be wrong about the level of support that Octopus already has for setting up multiple weighted failover slaves (and being able to declare usage of these on a per-action vs. per-query basis), but the documentation makes me think that this is at best a future roadmap feature. In the meantime, if it’s specifically slave database support you need, try the drag-and-droppable SDP gem. I will continue linking my fork of the project until the original author decides what he wants to do with my pull request (which fixes fundamental issues with Rails 3 controller integration, plus adds more robust slave failover).

Installation

Is as easy as possible. In your bundler Gemfile:

gem "seamless_database_pool", :git => "git://github.com/wbharding/seamless_database_pool.git"

Your database.yml file will then look something like:

production:
  adapter: seamless_database_pool
  port: 3306
  username: app_user
  password: app_pass
  pool_adapter: mysql
  master:
    host: 1.2.3.4
    pool_weight: 0 # 0 means we only use master for writes if the controller action has been setup to use slaves
  read_pool:
    - host: 2.3.4.5
      username: slave_login
      password: slave_pass

Do drop a line in the comments with any questions or feedback if you have experience with either SDP or Octopus as solutions for Rails slave database support!