Adventures into CouchDB and Rails

September 4, 2009

Getting ready for CouchDB 0.10

Filed under: CouchDB — zdzolton @ 2:59 am

I’ve set up a local copy of CouchDB, from the 0.10 branch, just to see if my application code could handle its awesome powers. Here are my two big takeaways:

Turn on delayed commits

CouchDB 0.10 now defaults to disabling delayed commits, which in your production environment is a sensible trade of write speed for data durability. However, ensuring all those bits get flushed out to disk can be quite slow. For example, before turning on delayed commits my slowest unit test took over 12 seconds (ouch!); afterwards it got down to 3.4 seconds. Moreover, the whole test suite went from over 400 seconds (eek!) down to 90 seconds after turning that option on.

So make sure to put this into your local.ini for your workstations:

[couchdb]
delayed_commits = true
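
If you want a quick sanity check that a workstation's local.ini really carries the setting, something like the following might do. This is only a sketch: delayed_commits_setting is a made-up helper name, and the parsing assumes the simple INI layout shown above (sections in square brackets, comments starting with a semicolon, later values overriding earlier ones).

```ruby
# Hypothetical helper: returns the delayed_commits value found in the
# [couchdb] section of an INI text, or nil when the option is absent.
# If the option appears more than once, the last value is returned.
def delayed_commits_setting(ini_text)
  section = nil
  value = nil
  ini_text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?(';') # skip blanks and comments
    if line =~ /\A\[(.+)\]\z/
      section = $1
    elsif section == 'couchdb' && line =~ /\Adelayed_commits\s*=\s*(.*)\z/
      value = $1
    end
  end
  value
end

sample = "[couchdb]\ndelayed_commits = true\n"
puts delayed_commits_setting(sample).inspect
```

To check a real file, pass it File.read with the path to your local.ini; the path depends on how CouchDB was installed.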

Review your usage of reduce=false

In CouchDB 0.9 one could always specify reduce=false in view queries, even for views that don’t actually define a reduce function. This is now considered an invalid query, and CouchDB will respond with a 400 HTTP status code. I’m not sure it’s necessary to be so strict here, but all you need to do is review your database queries to make sure you’re not passing reduce=false in erroneously.
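
One way to guard against this is to build view URLs in a single place and strip the parameter there. The sketch below is an illustration, not code from my application: view_url and its has_reduce argument are hypothetical, and the database URL and view names are made up.

```ruby
require 'uri'

# Hypothetical helper: builds a view URL and drops the reduce parameter
# for views that have no reduce function, since CouchDB 0.10 answers
# such queries with a 400.
def view_url(db_url, design, view, params = {}, has_reduce = true)
  params = params.reject { |key, _| key.to_s == 'reduce' } unless has_reduce
  url = "#{db_url}/_design/#{design}/_view/#{view}"
  query = URI.encode_www_form(params)
  query.empty? ? url : "#{url}?#{query}"
end

# A map-only view: reduce=false is removed, limit is kept.
puts view_url('http://127.0.0.1:5984/blog', 'posts', 'by_date',
              { :reduce => false, :limit => 10 }, false)
```

The caller has to know whether the view defines a reduce function; the helper does not look that up in the design document.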

I think we’re ready!

Other than this, it took very little to prepare my application for CouchDB 0.10, and I’m pretty excited for the _changes API and no-downtime deployment of views!

June 9, 2009

How to Fill up a CouchDB with Twitter Statuses

Filed under: Uncategorized — zdzolton @ 2:08 pm

Last night I hacked together a bit of Ruby code to fill a CouchDB database with Twitter statuses. I’ll need to make some adjustments to deal with Twitter’s timeout and throttling, however…

Enjoy!

May 4, 2009

Quick Tip: CouchDB with Lucene Search

Filed under: CouchDB — zdzolton @ 4:13 pm

So, I’m still loving Bob Newson’s CouchDB-Lucene external integration.

Here’s a quick way to monkey-patch on a search method:

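# Requires ActiveSupport for Hash#to_query.
CouchRest::Database.class_eval do
  # Queries the couchdb-lucene _fti endpoint of this database.
  def search(query, options = {})
    CouchRest.get "#{@uri}/_fti?#{options.merge(:q => query).to_query}"
  end
end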
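
The patch above leans on Hash#to_query, which comes from ActiveSupport. Outside of Rails, the same query string can be built with the standard library. A minimal sketch, where fti_url is a hypothetical helper and the database URL is made up:

```ruby
require 'uri'

# Hypothetical helper: builds the couchdb-lucene query URL using only the
# standard library, instead of ActiveSupport's Hash#to_query.
def fti_url(db_uri, query, options = {})
  "#{db_uri}/_fti?#{URI.encode_www_form(options.merge(:q => query))}"
end

puts fti_url('http://127.0.0.1:5984/blog', 'couchdb rails', :limit => 5)
```

Which options the search endpoint accepts depends on the CouchDB-Lucene version in use; :limit here is only an example.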

When Upgrading CouchDB at Some Point…

Filed under: CouchDB — zdzolton @ 3:15 pm

Versioning with MacPorts can kinda suck, and new CouchDB versions are often incompatible with the old version’s data files.

Idea: Use CouchDBX, during incompatible upgrades, as a holding pen for your local data!

  1. Download CouchDBX
  2. Replicate from your MacPorts-installed databases to CouchDBX
  3. Delete “old” databases from your MacPorts-installed CouchDB
  4. Upgrade your MacPorts-installed CouchDB to some new, binary-incompatible version
  5. Re-create the databases in the newly-upgraded CouchDB installation
  6. Replicate from your CouchDBX databases to the MacPorts-installed CouchDB
  7. Enjoy! Or, Profit! (You choose…)

See?! That wasn’t so bad, was it?
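
The two replication steps in the list above can be scripted against CouchDB’s /_replicate resource. The following is a rough sketch under some assumptions: both servers run on the same machine, the CouchDBX port (5985) is hypothetical, and the helper names replication_request and replicate are made up. The target database needs to exist before replicating into it.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Assumed addresses: adjust to wherever your two CouchDB instances listen.
MACPORTS_COUCH = 'http://127.0.0.1:5984'
COUCHDBX       = 'http://127.0.0.1:5985' # hypothetical port

# Builds the POST /_replicate request with a JSON source/target body.
def replication_request(source, target)
  request = Net::HTTP::Post.new('/_replicate', 'Content-Type' => 'application/json')
  request.body = JSON.generate('source' => source, 'target' => target)
  request
end

# Sends the replication request to the given server and returns the response.
def replicate(server, source, target)
  uri = URI(server)
  Net::HTTP.start(uri.host, uri.port) do |http|
    http.request(replication_request(source, target))
  end
end

# Step 2: pull a database from the MacPorts install into CouchDBX.
# replicate(COUCHDBX, "#{MACPORTS_COUCH}/blog", 'blog')

# Step 6: pull it back into the upgraded MacPorts install.
# replicate(MACPORTS_COUCH, "#{COUCHDBX}/blog", 'blog')
```

The two calls are commented out so that nothing touches a live server by accident; uncomment them once the addresses and database names match your setup.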

May 1, 2009

Computational Evangelism

Filed under: CouchDB — zdzolton @ 4:17 pm

Using CouchDB, or any relatively new open source software, requires much work and dedication.
Here is my list of favorite reasons to use CouchDB:

It’s Made of the Web

  • All database operations go through REST verbs: a browser or cURL is all you need to get started!
  • ETags mean documents and views are ready for caching
  • JSON representation of all data means no-brainer mapping to programming language constructs
  • JavaScript for all scripting duties means no context swaps

Map-Reduce Indexing

  • Map-reduce functions are side-effect free, and easy to reason about using imperative OR functional techniques
  • View indexes execute explicitly, whereas SQL gurus use voodoo to change queries, hoping that the query planner decides to use the correct index
  • Functional JavaScript programming techniques can succinctly express map-reduce logic
  • It’s the algorithm that powers Google!

The Meek Shall Inherit the Earth(‘s Data)

  • Schema-less database removes impediments to change
  • Flat key-value storage is easy to reason about
  • UUIDs, instead of sequence IDs, mean any two databases can replicate documents
  • Seriously easy replication: Push/Pull == POST/GET
  • AJAX-only CouchApps + Easy replication == open source apps + viral databases

Of course, swapping your SQL brain out for a Map-Reduce one takes a bit of time. In my opinion, however, CouchDB’s feature set maps better to the requirements of many up-and-coming websites.

March 13, 2009

Behavioral Patterns

Filed under: CouchDB — zdzolton @ 12:11 am

I’ve just read Sebastian Bergmann’s explanation of Object-Relational Behavioral patterns, where he questions whether they are still useful for CouchDB. Hmm… Maybe I’m not getting something, but I feel these Object-Relational patterns are still a good match for CouchDB.

In particular, Unit of Work would be a great candidate, since CouchDB doesn’t provide any transactional guarantees, not to mention that a CouchDB application is responsible for dealing with conflicts using domain-specific logic.

Lazy-loading “has_many” child objects seems to me necessary for performance, given that retrieving an entire object graph often takes more than one query for CouchDB anyway. I’d like to see CouchSurfer take its cue from how DataMapper does it.

Finally, implementing an Identity Map should be dead-simple given CouchDB’s flat ID space, with minor complexity of keeping the document GETs, by ID or from a view query, in sync. Moreover, I think it could probably be done at the level of CouchRest, so that all persistence libraries built atop it (and yes, there are already many) can reap the benefits.
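
To make the Identity Map idea concrete, here is a minimal sketch in plain Ruby. It is not CouchRest code: the IdentityMap class is hypothetical, and the block passed to it stands in for whatever actually performs the document GET.

```ruby
# Hypothetical identity map: caches documents by _id so that repeated
# lookups return the same Ruby object. The loader block stands in for a
# real database GET.
class IdentityMap
  def initialize(&loader)
    @loader = loader
    @docs = {}
  end

  # Returns the cached document, loading it on the first request only.
  def get(id)
    @docs.fetch(id) { @docs[id] = @loader.call(id) }
  end

  # Registers a document that arrived another way (e.g. from view rows),
  # keeping the object already in the map if there is one.
  def register(doc)
    @docs[doc['_id']] ||= doc
  end
end

loads = 0
map = IdentityMap.new { |id| loads += 1; { '_id' => id } }
first = map.get('post-1')
second = map.get('post-1')
puts first.equal?(second) # true: both lookups return the same object
puts loads                # 1: the loader ran only once
```

The register method is where the “keeping things in sync” complexity lives: documents coming back from a view query would be passed through it, so the caller always ends up with the instance already in the map.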

I think the real problem will be getting people to stop using the word “Relational”, since we’re not talking about RDBMSs here!

February 6, 2009

Rails Time Zone Bug — An Edge Case

Filed under: Uncategorized — zdzolton @ 1:02 am

Premise

I have found a Mac OS X-specific edge case in the Rails time zone support.

Evidence

irb(main):001:0> require 'rubygems'; require 'activesupport'
=> true
irb(main):002:0> ENV['TZ'] = 'US/Central'
=> "US/Central"
irb(main):003:0> t = Time.now
=> Thu Feb 05 16:03:56 -0600 2009
irb(main):004:0> t.to_s :json
=> "Thu Feb 05 16:03:56 -0600 2009"
irb(main):005:0> t.to_s :rfc822
=> "Thu, 05 Feb 2009 16:03:56 -0500"

Notice that when invoking the Time#to_s method with the parameter :rfc822, the UTC offset comes out as -0500 instead of the correct -0600.

Solution

A small modification to the :rfc822 format specifier in DATE_FORMATS hash, defined within the ActiveSupport::CoreExtensions::Time::Conversions module, does the trick.

Specifically, line #13 of activesupport/lib/active_support/core_ext/time/conversions.rb:

          :rfc822       => "%a, %d %b %Y %H:%M:%S %z"

Should be changed to this:

          :rfc822       => lambda { |time| time.strftime("%a, %d %b %Y %H:%M:%S #{time.formatted_offset(false)}") } 

Background

This is related to my previous blog post, in which I found a similar bug in the serialization of CouchSurfer::Model timestamps.

I am currently testing this on Mac OS X 10.5.6.

February 4, 2009

CouchSurfer – Timezone Problems

Filed under: Uncategorized — zdzolton @ 3:37 pm

Problem Definition

There seems to be a timezone-related bug for the CouchSurfer::Model::ClassMethods.timestamps! method.

Symptoms

Here is the failing RSpec example:

1)
'CouchSurfer::Model a model with timestamps should set the time on create' FAILED
expected: < 2,
     got:   3600.004118
./spec/lib/model_spec.rb:532:

Finished in 13.386698 seconds

107 examples, 1 failure

Root Cause

The problem seems to lie in line #158:

self['updated_at'] = time.strftime("%Y/%m/%d %H:%M:%S.#{time.usec} %z")

In particular, the %z used in the format specifier performs unreliably across operating systems.

Evidence

When executing this on an Ubuntu server to which I have access:

$ irb
irb(main):001:0> ENV['TZ'] = 'US/Central'
=> "US/Central"
irb(main):002:0> Time.now
=> Wed Feb 04 09:12:34 -0600 2009
irb(main):003:0> Time.now.strftime("%z")
=> "-0600"

Now, when I execute this code on my local Mac OS X 10.5.6 machine:

$ irb
irb(main):001:0> ENV['TZ'] = 'US/Central'
=> "US/Central"
irb(main):002:0> Time.now
=> Wed Feb 04 09:15:19 -0600 2009
irb(main):003:0> Time.now.strftime("%z")
=> "-0500"

Notice how the %z resulted in “-0500” on Mac OS X, instead of “-0600” as it should have.

Solution

I added a helper method to CouchSurfer::Model, named format_utc_offset, that returns the correct UTC offset string. This was basically adapted from the ActiveSupport Time#formatted_offset code in Rails.
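
For illustration, a helper along these lines could look like the sketch below. This is not the code from the fork, just a guess at the general shape: it derives the offset from Time#utc_offset (seconds east of UTC) rather than trusting strftime’s %z, and its signature is an assumption.

```ruby
# Hypothetical sketch of such a helper: formats the UTC offset as "+HHMM"
# or "-HHMM" from Time#utc_offset instead of relying on strftime's %z.
def format_utc_offset(time)
  seconds = time.utc_offset
  sign = seconds < 0 ? '-' : '+'
  hours, remainder = seconds.abs.divmod(3600)
  format('%s%02d%02d', sign, hours, remainder / 60)
end

# A fixed-offset time, so the result does not depend on the local TZ.
central = Time.new(2009, 2, 4, 9, 15, 19, '-06:00')
puts format_utc_offset(central) # prints -0600
```

The string it returns can then be interpolated into the strftime format in place of %z.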

You can see the changes made in my fork.

January 31, 2009

The Couch Underground, Part I

Filed under: Uncategorized — zdzolton @ 6:02 am

I aim to use CouchDB, compatibly with Rails; I’ll take any suggestion…

Should you have any, just tweet me.

January 29, 2009

Hello Again, World!

Filed under: Uncategorized — zdzolton @ 4:56 am

I’m here to discuss my adventures into the realm of CouchDB and Rails…

Let’s have fun!
