10/20/08

Bloggity – Rails Blog Plugin Made Stupid Simple

Last updated June 8th, 2009: Bloggity version 1.0 has landed! This post has been completely revised for the occasion.

With a spec (and funding) courtesy of Goldstar, I finally made time to incorporate many of the features I’d been hoping to add for the last year. Here is the current featureset:

  • Rails 2.3 compatible (also should be compatible with Rails 2.0-2.2, if you use the Engines plugin and copy the routes into your routes file)
  • Drop-in installation, including rake tasks that automate creating the necessary database tables and moving the necessary assets into your existing project
  • WYSIWYG editor for writing blogs
  • Support for AJAX uploading of images and/or assets of any kind with your blog post
  • URLs formatted using blog titles for good SEO (e.g., http://mysite.com/blog/founder/welcome_to_my_blog)
  • Create and maintain one or more blogs in a Rails app (e.g., a main blog, CEO blog, and user blogs)
  • User commenting, with built-in support for Gravatars
  • Separate, configurable permissions for writing, commenting on, and moderating comments for blogs
  • RSS 2.0 feeds (with built-in customizability for Feedburner)
  • Ability to use custom stylesheets on a per-blog basis (i.e., the Founder’s blog can use different styling than the user blogs)
  • Ability to tag blog posts, and filter by tag
  • Ability to categorize blog posts
  • Blogs can be saved in draft mode before publishing
  • Model & controller test suite
  • Docs and an active git project!

Requirements, Dependencies

Bloggity 1.0 has been tested with Rails 2.3. In Rails 2.3, it shouldn’t need any plugins besides will_paginate. It should also work in Rails 2.0-2.2, but you’ll need to copy the routes into your routes file, since the ability for plugins to provide their own routes was only added in Rails 2.3. Other monkeying around might also be needed; if you have issues installing on pre-2.3 Rails, please drop a line in the comments.
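
To illustrate what “copy the routes into your routes file” means on pre-2.3 Rails, the entries would go in config/routes.rb in roughly the shape below. These resource names are hypothetical, not Bloggity’s real routes; copy the actual entries from the plugin’s own routes file.

ActionController::Routing::Routes.draw do |map|
  # Illustrative only; the real entries live in the plugin's routes file.
  map.resources :blogs do |blog|
    blog.resources :blog_posts
  end
end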

FCKEditor is used for the WYSIWYG editor. This is included with the plugin and shouldn’t affect your existing Rails install (the plugin puts its assets in a separate directory). Image uploading uses jQuery (since it works via AJAX); jQuery is likewise included with the plugin and doesn’t need to be used elsewhere on your site.

Installation Instructions

  1. Run “script/plugin install git@github.com:wbharding/bloggity.git” to grab Bloggity and put it in vendor/plugins/bloggity
  2. Run “rake bloggity:bootstrap_db” from your application’s base directory
  3. Run “rake bloggity:bootstrap_bloggity_assets” to copy the images and stylesheets used by Bloggity into your app’s /public/stylesheets/bloggity and /public/images/bloggity directories.
  4. Run “rake bloggity:bootstrap_third_party_assets” to copy FCKEditor and jQuery into your “/public/javascripts/bloggity” directory. This is not required, but the WYSIWYG editor and asset uploading won’t work without it.
  5. Take a gander at bloggity/lib/bloggity/user_init.rb. This shows the methods that Bloggity will use to interface with your user model (for blog permissions and such). Implement these in a sensible way in your user model, and don’t forget to copy the blog associations into your user. (If you prefer, you could also copy the UserInit into a lib directory of yours, fill in the methods, and include it in your User model directly.) A sketch follows this list.
  6. There are two methods in bloggity/lib/bloggity/bloggity_application.rb: current_user and login_required. Implement these in your application_controller.rb if you haven’t already (also sketched below).
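
To make steps 5 and 6 concrete, here is a minimal sketch. The hook points (user_init.rb, current_user, login_required) come from the plugin, but the method bodies and the permission method names below are illustrative assumptions; check user_init.rb for the real interface.

class User < ActiveRecord::Base
  # Hypothetical permission checks; Bloggity's actual method names are listed in
  # bloggity/lib/bloggity/user_init.rb. Wire these into your own roles system.
  def can_blog?(blog)
    admin? || blog.owner_id == id
  end

  def can_moderate_blog_comments?(blog)
    admin?
  end
end

class ApplicationController < ActionController::Base
  # Bloggity calls these two methods; a session-based lookup is one common approach.
  def current_user
    @current_user ||= User.find_by_id(session[:user_id])
  end

  def login_required
    # Assumes a "login" named route exists in your app.
    redirect_to login_path unless current_user
  end
end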

And you’re done!

In the Wild

An earlier version of Bloggity can be seen in the wild at the Bonanzle blog. Bloggity 1.0, with its support for tagging, categorization, and gravatars, actually exceeds the features you’ll see at Bonanzle.

Screenshots

Here are a couple shots of Bloggity in action with its default styles.

Looking at the main blog.

Writing a blog.

Uploading assets for a blog.

Miscellaneous

I haven’t built in support yet for upgrading from pre-1.0 versions of Bloggity because, well, just about everything changed with the release of 1.0, not least the names of several models. However, when Bonanzle moves from Bloggity 0.5 to 1.0, I’ll post how we went about it here.

Please add your bug reports to the comments section and I’ll commit changes as you find them.

10/12/08

Rails Fast(er) Clone/Copy of attachment_fu Images

We regularly copy images around on our site: for cropping, for duplicating items, and for many other purposes. For the last six months, we’ve been using code that I found on a Ruby forum. It has been passable, but as I’ve been stamping out our performance bottlenecks, I’ve found that image duplication has been one of our slower movers. Copying a single image with its two thumbnails was taking about 5-10 seconds per image. Given that all it should be doing, functionally, is making a file copy and a new database entry, this didn’t seem right. I did some research into it today and figured out why.

The reason is that the forum code still relies on the main image creating its thumbnails from scratch. That is, the main loop of thumbnail creation in their method isn’t actually saving the thumbnails; the thumbnails get saved on the c.save line, via the standard attachment_fu after_save callback.

I just finished an updated monkey patch that I’m sharing here that should let you copy images without the costly thumbnail re-processing. You can grab attachment_fu mixin.rb or copy the ugly WordPress version below.


module Technoweenie # :nodoc:
  module AttachmentFu # :nodoc:
    module InstanceMethods
      attr_writer :skip_thumbnail_processing

      # Added by WBH from http://groups.google.com/group/rubyonrails-talk/browse_thread/thread/e55260596398bdb6/4f75166df026672b?lnk=gst&q=attachment_fu+copy&rnum=3#4f75166df026672b
      # This is intended to let us make copies of attachment_fu objects.
      # Note: this makes copies of the main image AND each of its prescribed thumbnails.
      def create_clone
        c = self.clone

        self.thumbnails.each { |t|
          n = t.clone
          img = t.create_temp_file
          n.temp_path = img # not img.path, so that img won't get garbage collected before c is saved; see the thread above
          n.save_to_storage
          c.thumbnails << n
        }

        img = self.create_temp_file

        # Skip processing (resizing, etc.) of the thumbnails, unless the thumbnails array was
        # empty. If the array is empty, the thumbnails for the main image might have been
        # deleted or otherwise messed up, so we want to regenerate them from the main image.
        c.skip_thumbnail_processing = !self.thumbnails.empty?
        c.temp_path = img # not img.path, for the same garbage collection reason as above
        c.save_to_storage

        c.save
        return c
      end

      protected

      # Cleans up after processing. Thumbnails are created (unless explicitly skipped above),
      # the attachment is stored to the backend, and the temp_paths are cleared.
      def after_process_attachment
        if @saved_attachment
          if respond_to?(:process_attachment_with_processing) && thumbnailable? && !@skip_thumbnail_processing && !attachment_options[:thumbnails].blank? && parent_id.nil?
            temp_file = temp_path || create_temp_file
            attachment_options[:thumbnails].each { |suffix, size| create_or_update_thumbnail(temp_file, suffix, *size) }
          end
          save_to_storage
          @temp_paths.clear
          @saved_attachment = nil
          callback :after_attachment_saved
        end
      end
    end
  end
end

Include it in your environment.rb and you should be golden.

Notice that it is nearly identical to the Ruby forum code, except that it adds a new member variable to the object being copied so it will just save, not re-process, the thumbnails.
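
For reference, here’s a hypothetical usage sketch; the patch filename and the UserImage model name are illustrative assumptions, not part of attachment_fu:

# config/environment.rb -- assuming the patch was saved as lib/attachment_fu_patch.rb
require 'attachment_fu_patch'

# Copying an image (and its thumbnails) is then a single call:
original = UserImage.find(42)   # any attachment_fu-backed model
copy = original.create_clone    # file copies + new DB rows, no thumbnail re-processing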

I’ve tested it on my dev machine for a few hours this afternoon and it all seems to be working; I’ll post back later if I encounter any problems with it. You can do the same.

10/8/08

How Much Do We Want A Decent Version of Git for Windows? T-H-I-S M-U-C-H.

We want it bad. Reeeeeal bad.

Bonanzle is still running on Subversion (via Tortoise), because comparing its UI to that of the de facto git standard (msysgit) is like comparing Rails to ASP on Visual Basic. Yes, the difference is that big. Msysgit is ugly as Pauly Shore, most of its windows can’t be resized, it crashes regularly, and trying to decipher the intent of its UI is like reading the Dead Sea Scrolls.

Yes, yes, I know: if you don’t like a piece of open source software, you should shut up and fix it. Unfortunately, I’m sort of single-handedly maintaining this web site that has grown to about 150k uniques this month, up from about 10k two months ago. I can’t end world hunger and rescue my cat stuck in a tree all at once.

But if there is any intelligent life out there with spare programming cycles and the desire to make a huge difference in the world, this is my personal plea to consider giving some love to the woebegone msysgit… or maybe starting a new Windows git client. I can’t imagine a real hacker would take more than a week or two to match the featureset of the existing msysgit.

I’d really like to move Savage Beast to GitHub, and I’d really like to collaborate on the other projects happening there, but it just doesn’t make sense to go from a slick, error-free, decipherable UI like Tortoise to the meager helpings of msysgit.

I’d happily donate to a project for a better Windows git client.

Preemptive note to smart alecks: No, I’m not moving to Mac (or Linux) now. There are plenty of reasons why; I’ll tell you all about it some other time. Incidentally, what is the preeminent GUI for git on Mac these days? From what I understand, many of the real hackers are perfectly content using git from the command line…? I shudder to think of reading diffs and histories that way.

10/3/08

Rails Thinking Sphinx Plugin: Full Text Searching that’s Cooler than a Polar Bear’s Toenails

Continuing my series of reviews of the plugins and products that have made Bonanzle great, today I’ll talk about Thinking Sphinx: How we’ve used it, what it’s done, and why it’s a dandy.

What It Is Bro, What It Is

What it is is a full text search Rails plugin that uses the Sphinx search engine to let you search big tables for data that would take a long-assed time (and a lot of custom application code) to find if you used MySQL full text searching.

What Are Your Other Options

In the space of legitimate Rails full text plugins, the commonly mentioned choices are Sphinx (via Thinking Sphinx or Ultra Sphinx), Xapian (via acts_as_xapian), solr (via acts_as_solr) and (shudder) Ferret (via acts_as_ferret).

Jim Mulholland does a great job of covering the various choices at a glance, so if you’d like a good overview, start with his blog post about the choices.

To his commentary, I would add that Solr looks complicated to get running, appears to have been abandoned by its creator, and hasn’t been updated in quite a while. It should also be mentioned that if you were to choose Solr, every time you wished to talk about it online, you’d have the burdensome task of backspacing the “a” out of the name your fingers were intent on typing.

Xapian seems alright, but the documentation on it seemed lacking and not a little arcane. Despite Jim’s post on how to use it, the Xapian Rails community seemed pretty sparse. My impression was that if it didn’t “just work,” it would be I alone who would have to figure out why. Also, from what I could tell in Jim’s post, it sounded like one has to stop Xapian from serving search requests in order to run the indexer. Update: the FUD patrol informs me that you can concurrently index and serve. Oh, what joy!

Ferret sucks. We tried it in our early days. It caused mysterious indexing exceptions left and right, whenever we changed our models or migrated. The day we expunged it from our system was the day I got back to programming our site and stopped worrying about what had broken Ferret that day.

Ultra Sphinx looks OK, but as you can read here, its ease of use leaves something to be desired compared to the star of our blog post, who has now entered the building. Ladies and gentlemen, may I present to you, hailing from Australia and weighing in at many thousand lines of code:

Thinking Sphinx!

There’s a lot to like about Thinking Sphinx: it has easy-to-read docs with examples, it has an extremely active Google Group behind it, and it supports useful features like location-based searches and delta indexing (i.e., the index gets updated in near real time as records change).

But if there is one reason that I would recommend Thinking Sphinx above your other choices, it’s that you probably don’t care a hell of a lot about full text searching. Because I didn’t. I care about writing my website. This is where Thinking Sphinx really shines. With the tutorials and Railscasts that exist for Thinking Sphinx, you can write an index for your model and actually be serving results from Thinking Sphinx within a couple hours’ time. That doesn’t mean it’s an oversimplified app, though. Its feature list is long (most of the features we don’t yet use), but smart defaults are assumed, and it’s super easy to get rolling with a basic setup, allowing you to hone the parameters of the search as your situation dictates. A minimal sketch of what that looks like is below.
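
To give a flavor of how little code is involved, here is a minimal sketch using the define_index DSL from the Thinking Sphinx plugin of this era; the model and column names are made up for illustration:

class Item < ActiveRecord::Base
  define_index do
    indexes title, :sortable => true   # full text fields
    indexes description
    has price, created_at              # attributes available for filtering and sorting
  end
end

# After running "rake thinking_sphinx:index" and "rake thinking_sphinx:start":
Item.search "vintage camera", :order => :created_at, :page => 1, :per_page => 20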

Also extremely important in choosing a full text search system is reliability. Programming a full text engine (and its interface into your application) is rocket science, as far as I’m concerned. I don’t want to spend my time interpreting esoteric error messages from my full text search engine. It must work. Consistently. All the time. Without me knowing anything about it. Thinking Sphinx has done just that for us. In more than a month since we started using it, it’s been a solid, reliable champ.

A final, if somewhat lesser, consideration in my recommendation of TS is who you’ll be dealing with if something goes wrong. Being open source, my usual expectation is that if Google and I can’t solve a particular problem, it will be a long wait for a response from a random, ever-so-slightly-more-experienced-than-me user of the system in question, who will hopefully, eventually answer my question in a forum. Thinking Sphinx’s creator, Pat Allen, blows away this expectation by tirelessly answering almost all questions about Thinking Sphinx in its Google Group. From what I can tell, he does this practically every night. This is a man possessed. I don’t claim to know or understand what’s in the punch he’s drinking (probably not beginner’s enthusiasm, since TS has been around for quite some time now), but whatever’s driving him, I would recommend you take advantage of his expertise soon, before he becomes jaded and sour like the rest of us.

What About the Performance and Results?

Performance: great. Our usual TS query is returned in a fraction of a second from a table of more than 200,000 rows indexed on numerous attributes. Indexing the table currently takes about 1-2 minutes and doesn’t lock the database. Nevertheless, we recently moved our indexing to a remote server, since it did bog down the system somewhat to have it constantly running. I plan to describe in the next couple days how we got the remote indexing working, but suffice to say, it wasn’t very hard (especially with Pat’s guidance on the Google Group).

Results: fine. I don’t know what the pertinent metrics are here, but you can use weighting for your results and search on any number of criteria. Our users are happy enough with the search results they’re getting with TS out of the box, and when we do get more customized with our search weighting, I have little doubt that TS will be up to the task, and it’ll probably be easy to set up.
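
For instance, field weighting is a one-liner. The weights here are hypothetical, but :field_weights is a documented Thinking Sphinx search option:

# Make matches in the title count ten times as much as matches in the description:
Item.search "vintage camera", :field_weights => { :title => 10, :description => 1 }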

Final Analysis

If you want to do full text searching on a Rails model, do yourself a favor and join the bandwagon enjoying Thinking Sphinx. It’s one of the best written and supported plugin/systems I’ve stumbled across so far in the creation of Bonanzle.

I’m Bill Harding, and I approved of this message.

10/1/08

MySQL: Use “Distinct” and “Order By” with Multiple Columns, AKA Apply “Order By” before “Group By”

I’ve had a devil of a time trying to get Google to tell me how to write a MySQL query that 1) filters rows on a distinct column, 2) returns other columns in the query besides the distinct column, and 3) allows us to order by a column. In our case, we (and you, if you’re running Savage Beast!) have a list of the most recent forum posts on the site. Currently, if you list all recent posts, the search will just find all posts and order by date of creation, but this makes for some dumb-looking output, since you often end up with a list where 10 of the 20 posts are all from the same forum topic. All the user really wants to know is which topics have a new post in them, and to get a brief glimpse of what that new post might be.

Thus, we want to create a query that returns the new posts, ordered by date of creation, that have a distinct topic_id.

Here’s the SQL that can make it happen:

Post.find_by_sql("SELECT posts.* FROM posts LEFT JOIN posts t2 ON posts.topic_id = t2.topic_id AND posts.created_at < t2.created_at WHERE t2.topic_id IS NULL ORDER BY posts.created_at DESC")

The trick is the self-join: each post is compared against every later post in the same topic, so the only rows that survive the “t2.topic_id IS NULL” filter are the newest post of each topic. Hope that Google sees fit to lead other people here instead of letting them struggle to get GROUP BY to order results beforehand (GROUP BY posts.topic_id works, but it returns the first post in each distinct topic, rather than the last post, as we desire), or to get SELECT DISTINCT to return more than one column, as many forum posters unhelpfully suggested in all the results I was getting.

Update 11/26/08 – A Word of Caution

I finally got around to setting up some profiling for our site yesterday and was surprised to discover that the above query was taking longer per execution than almost anything else on our entire site. The SQL EXPLAIN was not too helpful in explaining why, but it showed three joins, with the self-join involving every row of the posts table (which is presently almost 10,000 rows).

Takeaway: for this query to work, MySQL seems to consider every distinct topic in the table, rather than being smart and stopping when it hits the per-page paginated limit. Since I had already determined that GROUP BY and DISTINCT were non-starters for picking the newest post in each topic, I ended up revising the logic in a way that is easier to manage and far more DB-efficient:

We now track in each topic the newest post_id within that topic. While this adds a bit of overhead to keeping the topic updated when new posts are made to it, it allows us to do a far simpler query: select the most recent topics, join each topic to its newest post, and order by the age of those posts. A sketch of the idea follows.
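
Here’s a minimal sketch of the denormalized approach, assuming a last_post_id column on topics (added via migration if your schema doesn’t already have one); model and column names are illustrative:

class Post < ActiveRecord::Base
  belongs_to :topic
  # Keep the topic pointing at its newest post (the small write-time overhead mentioned above)
  after_create { |post| post.topic.update_attribute(:last_post_id, post.id) }
end

# The recent-posts listing then becomes a simple indexed join, newest first:
Topic.find(:all,
  :joins => "INNER JOIN posts ON posts.id = topics.last_post_id",
  :order => "posts.created_at DESC",
  :limit => 25)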

If you have the ability to create an analogous situation to solve your problem, your database will thank you for it. The above query starts getting extremely slow with more than a few thousand rows. Yet I defy you to find an alternative that works at all using GROUP BY or DISTINCT.