Chris Wanstrath (a.k.a. defunkt) just wrote an article on the github blog about how he cut the time to deploy github itself from fifteen minutes to fourteen seconds.
The starting point was the observation that since the standard Capistrano deployment tasks treat the code repository as a black box plugin, they aren't optimized. They treat git repositories pretty much the same as they do subversion repos.
In a search for a better way, Chris takes us on a tour of the various Capistrano alternatives (Vlad the Deployer, Heroku's Rush, and finally Fabric, a deployment framework written in Python) before coming full circle back to Capistrano, refactoring the deploy recipes, and then rewriting the tasks that set up, update, and roll back the code on the server using more "gitty" techniques.
Another thing which slowed the old deploy down was having a separate cap task to make each symlink needed on the server. Each cap task has some overhead, which Chris eliminated by writing a single task that makes all of the symlinks.
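One way to picture the single-task approach is a helper that joins every ln command into one shell line, so the server is asked to do the work once instead of once per link. This is only a minimal sketch, not the recipe from Chris's article: the helper name symlink_command, the LINKS table, and the paths are all made up for illustration.

```ruby
# Hypothetical helper: builds ONE shell command that creates every symlink,
# so a single remote invocation can replace many separate cap tasks.
def symlink_command(shared_path, release_path, links)
  links.map { |shared, target|
    "ln -nfs #{shared_path}/#{shared} #{release_path}/#{target}"
  }.join(" && ")
end

# Illustrative link table (assumption); a real one depends on the app layout.
LINKS = {
  "config/database.yml" => "config/database.yml",
  "log"                 => "log",
  "public/system"       => "public/system"
}

puts symlink_command("/var/www/rails/app/shared",
                     "/var/www/rails/app/releases/1",
                     LINKS)

# Inside a Capistrano 2 recipe this could be wired up roughly like so
# (untested assumption, shown only to place the helper in context):
#   task :symlink_shared, :roles => :app do
#     run symlink_command(shared_path, release_path, LINKS)
#   end
```

Keeping the link list in one table also means adding a new shared file is a one-line change rather than a new task.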
The last change was moving the task of minimizing JavaScript and CSS from the machine running cap, where it was repeated for each server, to the servers themselves.
This is a great article, with lots of food for thought on how to use Cap and Git.
Yesterday, I wrote about the crash project to resurrect this blog after a hardware failure. I'm now running it on an upgraded version of Typo (5.2 instead of 4.1), using Passenger on Ubuntu 8.10 (instead of a cobbled up stack of Apache, Pen, and Mongrel::Cluster on Ubuntu Dapper), and deployed from my MacBook using Capistrano with git as the source repository. I've been wanting to do a lot of this for a while, but never seemed to have the time. The crash gave me the motivation, and the necessity.
This article is to let me capture what I can remember while it's still fairly fresh in my memory, and hopefully provide help to others with similar goals.
Installing Passenger on Ubuntu 8.10
A bit of help from my friend google led me to this how-to, which I followed. I'm not entirely happy with the approach, since it installs Ruby using the debian package, and I generally prefer to install it from source. I also thought that it installed Rubygems from a package, but looking at it now it seems to have installed it from source, and put it in /usr/bin, which may not be the wisest place on a debian system. It really should be in /usr/local/bin or somewhere else that dpkg/apt-get doesn't muck with. I might need to straighten that out.
I had to hack his scripts a bit as well. They assume that the current directory is on the path, which is something I don't like to do, so I went through and added './' where necessary.
Finally, his script for upgrading to enterprise Ruby for improved performance didn't work, so that's a job for later.
I posted a few comments on the article about this, but as of now they are waiting for moderation.
Typo Upgrade
For those who haven't used it, typo comes as a gem which provides a typo command. Running
typo install typo_dir
either installs the latest version of typo, which is really just a rails app, in typo_dir, or upgrades an existing version that is already there.
One of the first things the upgrade does is to backup the database to a yml file. For a blog like mine with a lot of content, this took forever. As I sat there waiting, it occurred to me that I already had a mysql backup, which I'd already used to recreate the database, so I killed the upgrade and hacked the rails-app-installer gem (which typo uses) to skip the backup.
The next hurdle was a gem version problem. I got to the point where the upgrade script was trying to migrate the database, but Rake was failing:
Migrating Typo's database to newest release
rake aborted!
RubyGem version error: actionpack(1.13.3 not = 1.13.6)
This was particularly strange since Typo 5.2 runs on Rails 2.2. Somebody was loading the wrong version of rails, and then complaining about the wrong version of actionpack. And the 1.13.6 version of the actionpack gem doesn't seem to be around anymore, probably left behind when Rails moved to github.
So I asked on the typo mailing list, and my former co-worker Ben Burdick gave me the solution:
sudo gem install datanoise-actionwebservice --source http://gems.github.com
This got me up to the point where I could run the blog on the MacBook. I used the very nice Passenger pane: you just drag the folder containing your rails project into the pane and restart, and your app is running! Since I wanted the urls to match up, I told Passenger pane to use a url of talklikeaduck.denhaven2.com instead of something.local, and put an entry into /etc/hosts to point that name to localhost.
Once I was there, it was a matter of recreating the articles which had been lost since the last backup. Mark Imbriacco emailed me what he had cached for each article in his RSS reader. It turned out to be a pretty simple job of cutting and pasting, and telling Typo to publish the articles on their original dates which could be determined by the permalink urls. I didn't worry about the exact time.
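Reading the date out of a permalink can be sketched in a few lines of Ruby. This assumes the permalinks carry the date as /YYYY/MM/DD/ somewhere in the path; the helper name and the example URL are made up for illustration, and the result is a date only, since I didn't worry about the exact time.

```ruby
require "date"

# Hypothetical helper. Assumes the permalink embeds the publication date
# as /YYYY/MM/DD/ in its path; returns nil when no such segment is found.
def publish_date_from_permalink(url)
  match = url.match(%r{/(\d{4})/(\d{1,2})/(\d{1,2})/})
  return nil unless match
  Date.new(match[1].to_i, match[2].to_i, match[3].to_i)
end

# Example URL is illustrative only.
puts publish_date_from_permalink("http://example.com/articles/2008/11/03/some-post")
```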
I then did a mysqldump of the database, shipped it over to the server with scp, and recreated the database on the server.
Capistrano Deployment
Now the task was setting up Capistrano and having it get the code from git. For the time being I set up a git repository in my home directory on the server, and pushed the code to it from the MacBook over ssh.
The next step was to allow the apache user (which is www-data on debian based systems) to access the repo. This was a matter of generating a public/private key pair in the www-data user's home directory, adding the public key to the .ssh/authorized_keys file in MY home directory, and then connecting once as www-data so that the host key gets accepted:
sudo su - www-data
ssh rick@localhost   # answer yes to the prompt to add the host being connected to
exit
I then upgraded my capistrano gem and capified my typo project. Then I edited the config/deploy.rb based on a couple of blog posts and some trial and error.
set :application, "talklikeaduck"
set :repository, "ssh://rick@aaa.bbb.ccc.ddd/home/rick/git/tlad.git"
set :scm, :git
set :branch, "master"
set :deploy_via, :remote_cache

set :user, "www-data"
set :runner, "www-data"
set :deploy_to, "/var/www/rails/#{application}"

role :app, "aaa.bbb.ccc.ddd"
role :web, "aaa.bbb.ccc.ddd"
role :db, "aaa.bbb.ccc.ddd", :primary => true

namespace :deploy do
  desc "Restarting mod_rails with restart.txt"
  task :restart, :roles => :app, :except => { :no_release => true } do
    run "touch #{current_path}/tmp/restart.txt"
  end

  [:start, :stop].each do |t|
    desc "#{t} task is a no-op with mod_rails"
    task t, :roles => :app do ; end
  end
end

desc "Link in the production database.yml"
task :after_update_code do
  run "ln -nfs #{deploy_to}/#{shared_dir}/config/database.yml #{release_path}/config/database.yml"
end
Note that I've replaced the LAN address of the server with aaa.bbb.ccc.ddd. It probably wouldn't be a security exposure to show the real address behind the NAT, but paranoia is paranoia. Normally I'd use a local dns server name, but I haven't gotten around to setting that back up.
Getting this working was a matter of trying cap deploy:check until I ironed out permissions issues, usually by sshing into the server, creating files using sudo, then chowning them to www-data.www-data, since www-data doesn't have sudo privileges.
Then I worked on getting cap deploy:cold to work. More sshing, mkdiring, and chowning to set up the current, releases and shared subdirectories. I'd scratched my head a bit about what to do with config/database.yml which I'd gitignored as usual. Ben pointed me to a blog article which added a post setup task to upload it to the shared directory, but it turned out to be easier to just do it via scp and ssh.
If I recall correctly, the next hurdle was familiar: cap deploy:cold was failing for the same reason as on the Mac, the gem version problem killing Rake. The solution here was to unpack the gems into the project, commit them to git, and deploy again:
rake gems:unpack
git add vendor/gems
git commit
git push origin
cap deploy:cold
Now I ran into a problem because I'd missed a bit of code in the deploy.rb from the article I was using as a model. It was this piece:
[:start, :stop].each do |t|
  desc "#{t} task is a no-op with mod_rails"
  task t, :roles => :app do ; end
end
Without this, capistrano was trying to get the www-data user to sudo to run script/spinner. Before I noticed the omission I was considering writing scripts to stop the server by disabling the virtual host and reloading apache, and to start it in a similar manner, and giving www-data limited sudo privileges to run those scripts. It turns out that this is unnecessary using Passenger, so it was just a matter of overriding the default capistrano tasks to do nothing.
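The reason start and stop can be no-ops is that Passenger only needs a fresh timestamp on tmp/restart.txt, which it notices on the next request; that is all the restart task in my deploy.rb does on the server. As a rough local illustration in plain Ruby (the helper name request_passenger_restart is made up, and the demo runs against a throwaway directory rather than a real app):

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: the local equivalent of `touch tmp/restart.txt`
# for an app rooted at app_root. Creates tmp/ if it is missing.
def request_passenger_restart(app_root)
  restart_file = File.join(app_root, "tmp", "restart.txt")
  FileUtils.mkdir_p(File.dirname(restart_file))
  FileUtils.touch(restart_file)
  restart_file
end

# Demo against a temporary directory standing in for the app root.
Dir.mktmpdir do |dir|
  path = request_passenger_restart(dir)
  puts "touched #{path}: #{File.exist?(path)}"
end
```

Nothing here needs sudo, which is why www-data can restart the app without any extra privileges.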
So now there was joy in Mudville. The cap deploy:cold worked. At this point I created a vhost configuration in /etc/apache2/sites-available, used a2ensite to enable it, and reloaded apache. Now to check it out.
One Last Wafer Thin Mint
Back on the MacBook, I edited /etc/hosts to now point to the server's LAN address. I still wasn't ready to expose port 80 of the server to the internets.
I cranked up my browser and reloaded the url. Everything was working! But being the suspicious type, I wanted to make sure. Tailing the log on the server showed no activity, hmmmm?
I pinged talklikeaduck.denhaven2.com on the MacBook, and it was pinging 127.0.0.1. I knew that I'd changed /etc/hosts. Ben told me over ichat to try dscacheutil -flushcache, which seemed to have no effect. So I rebooted the MacBook, and it was still resolving talklikeaduck.denhaven2.com to localhost.
At that point I tried:
$ dscacheutil -q host 127.0.0.1
Which produced:
name: talklikeaduck.denhaven2.com
ip_address: 127.0.0.1

name: localhost
ip_address: 127.0.0.1

name: talklikeaduck.denhaven2.com
ip_address: aaa.bbb.ccc.ddd
So there were two cache entries for the blog's name, and the localhost one was winning.
I finally got a clue when I grepped /etc for references to talklikeaduck, and found one in /etc/hosts, but also one in /etc/apache2/sites-available/tlad which was the vhost configuration generated by Passenger pane. When I removed the blog project directory from the Passenger pane, and hit restart, then did the dscacheutil -flushcache, the name was now resolving to the server.
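A duplicate like the one in that dscacheutil output is easy to miss by eye. As a hedged sketch, a few lines of Ruby can group the name/ip_address pairs and flag any name that maps to more than one address. The helper names are made up, the parser only assumes the "name:" and "ip_address:" line format shown above, and the sample text uses the same aaa.bbb.ccc.ddd placeholder rather than a real address.

```ruby
# Hypothetical helper: groups `name:` / `ip_address:` pairs by name.
# Assumes each ip_address line belongs to the most recent name line.
def addresses_by_name(output)
  result = Hash.new { |hash, key| hash[key] = [] }
  current = nil
  output.each_line do |line|
    key, value = line.strip.split(/:\s*/, 2)
    case key
    when "name"       then current = value
    when "ip_address" then result[current] << value if current
    end
  end
  result
end

# Names that resolve to more than one distinct address.
def conflicting_names(output)
  addresses_by_name(output).select { |_, addrs| addrs.uniq.size > 1 }.keys
end

SAMPLE = <<~OUT
  name: talklikeaduck.denhaven2.com
  ip_address: 127.0.0.1

  name: localhost
  ip_address: 127.0.0.1

  name: talklikeaduck.denhaven2.com
  ip_address: aaa.bbb.ccc.ddd
OUT

p conflicting_names(SAMPLE)
```

This only reports the conflict; tracking down which source registered the stale entry still takes a grep like the one I used.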
Now I was hitting the server, and I did a little more prodding to make sure things were working, including writing my resurrection announcement.
Now I was finally ready to unblock port 80 on the router.
Hopefully someone will find something useful in this post. At the very least, I've captured the lessons I learned from the exercise.
I've been using git for some months now, trendsetter that I am!
I've used quite a few source code/versioning/configuration management tools in my day, including, in roughly reverse order, svk, subversion, cvs, envy (for Smalltalk), and CLEAR/CASTER. Git has continued to amaze me with its power and flexibility.
Of course, with all that power and flexibility, there's a lot to learn, and sometimes "proactive interference" causes me to type:
git revert some_file.rb
when I really meant:
git checkout some_file.rb
because SVN habits die hard. Fortunately, the first command doesn't do anything since git revert takes a commit and not a path, so it just interrupts my thought processes a bit.
A recent tweet made me aware of EasyGit, a wrapper for the git command which tries to make git act more "naturally." I guess that this is good stuff for some, but I don't think that it's for me. I'd rather take my lumps in learning something new, whether it's a programming language, a human language, or a tool. Otherwise I find that I end up with the equivalent of "speaking" with an accent. On the other hand, the documentation for EasyGit looks like it will be worthwhile to digest in order to understand the differences between git and, say, SVN a bit better.
Here are some other resources I've found helpful in grokking git.
- Peepcode's Git Internals PDF
- Very good coverage of git, how it works and most importantly how to use it. Written by Scott Chacon who wrote Grit, the git 'engine' written in ruby that powers github.
- Git from the Bottom Up
- Another guide to git, focusing on how its implementation informs how best to use it. Less comprehensive than the peepcode PDF, but it's ten bucks cheaper, i.e. it's free.
- GitReady
- A fairly new web resource, it provides a new, well written tip on git daily, for beginners through advanced git users. There's an RSS feed so that you can get reminders of new articles.
Finally, much as I love most of the offerings from the Pragmatic Programmers, I found Travis Swicegood's "Pragmatic Version Control Using Git" just a little disappointing. I'm sure that many will find it useful, but IMHO, it comes across as a rewrite of a generic version control book, even though it really isn't. Somehow, the things that make git git, rather than svn++ just don't seem to come across. But, hey, that's just my opinion.




