Tribune DataViz

Matters of interest, from the data reporters and developers across Tribune Publishing

Archive for the ‘Infrastructure’ Category

Advanced Django project layout

with 24 comments

Default Django project layout versus news apps project layout

We’re releasing our project layout for Django, based on Gareth Rushgrove’s lovely django-project-templates. If you’ve found yourself unsatisfied with the default layout, or you’re using our fabfile or EC2 image, you might be interested in using our project layout.

The default Django project layout makes it dead simple to learn the framework and get an application up and running. But it can quickly get cumbersome as your application grows and you have to figure out how to handle deployment. A few projects, most notably Pinax, have their own ways to organize large projects.

Here’s what we need that the default layout doesn’t provide (a sketch of the resulting layout follows this list):

  • Separate settings, Apache configuration files and WSGI handlers for local development, a staging server and a production server.
  • A separate place for the various primary source data files (CSV, JSON, shape files) we typically have in a project.
  • A place to put Django apps that does not clutter up the root directory of the project.
  • A library directory to keep various reusable helper functions that are not Django applications.
  • A template directory and media directory for the entire project.
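
For illustration, here’s roughly how a project generated from the template ends up organized. The directory names below are indicative of our layout, not an exact listing of the template’s output:

example_project/
    apps/          # Django apps live here, out of the project root
    lib/           # reusable helpers that aren't Django apps
    data/          # primary source files: CSV, JSON, shapefiles
    media/         # project-wide media
    templates/     # project-wide templates
    configs/       # per-environment settings, Apache configs, WSGI handlers
        local/
        staging/
        production/
    fabfile.py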

Gareth’s project is really well organized and addresses all of these issues. We tweaked his templates to match our use case.

Getting off the ground

  1. Clone my fork of django-project-templates.
    git clone git://github.com/ryanmark/django-project-templates.git
  2. Install the templates. It will install the dependencies: PasteScript, Cheetah and Fabric. You may want to use a new virtualenv.
    python setup.py install
  3. Create a new project from the News Apps Paste template.
    paster create --template=newsapps_project example_project
  4. You’ll be asked for staging and production domains, a git repository location and a database password. These settings will be put in the fabfile and used for deployment. You’ll also be asked for a secret key, which is used internally by Django. It’s okay to press enter and accept the defaults. The template will still be created; you’ll just have to edit the fabfile later if you plan on deploying the project to a staging or production server.

The template contains a lot of personal preference, but it’s been very useful for us on a handful of projects. We are all quite satisfied with it. Take it, use it, tell us what you think!


Written by Ryan Mark

March 8, 2010 at 2:30 pm

Posted in Infrastructure, Python

Our GeoDjango Amazon EC2 image for news apps

with 12 comments

UPDATE: For our friends in the EU and other interested parties, here is the recipe for building the AMI from the original Ubuntu community image.

Today we’re happy to make public a version of our Amazon EC2 image. It’s Ubuntu Karmic running Python 2.6, Apache2+WSGI, PostgreSQL+PostGIS, Memcached and Pgpool. The image is built on an excellent Ubuntu community image, so it supports all the same user-data goodies.

Be sure to check out our sample GeoDjango application, built to run on this stack!

Launching the image

Start up a new instance of the Amazon EC2 AMI ami-ff17fb96. If you don’t know the answers to any of the questions in the launch wizard, you can simply accept the defaults, but take note (a command-line alternative follows this list):

  • If you make a new key pair, be sure to keep track of where you save the *.pem file you download, because you’ll need it later to connect to the server.
  • If you make a new security group, be sure to configure the firewall to permit HTTP and SSH connections from your IP address. If you’ll be using this image to serve something to the world, allow HTTP connections from 0.0.0.0/0 (that is, from anywhere on the internet). For best security, limit SSH to as few IP addresses or subnets as you can.
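
If you’d rather skip the wizard, the equivalent launch with Amazon’s EC2 API tools looks roughly like this (the key pair and security group names are placeholders, and the example IP is yours to substitute):

# open SSH from your IP and HTTP from anywhere in the default group
ec2-authorize default -P tcp -p 22 -s 203.0.113.5/32
ec2-authorize default -P tcp -p 80 -s 0.0.0.0/0

# launch a small instance of the image
ec2-run-instances ami-ff17fb96 -k my-keypair -g default -t m1.small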

Connecting to the running instance

Once the instance is running, connect to it using your key pair’s private key and the newsapps user, not root. There may be a brief period after EC2 reports the server address during which you still get a Connection refused message. Just wait a minute or two.

~$ ssh -i $EC2_KEYPAIR_PRIVATE_KEY newsapps@$EC2_SERVER_ADDRESS

Initializing the instance

Once logged into the instance, you’ll notice a few things in the newsapps home directory.

~$ ls
logs  make_appserver.sh  make_dbserver.sh  make_kitchensink.sh  sites

The scripts will configure the server for one of three modes: a front-end application server, a back-end database server, or a monolithic server that runs both the application and the database.

This post will just cover setting up the instance as a monolithic server, with a post on a multi-server configuration coming at some point.

Run the kitchen sink script. You will be walked through the setup process for the server.

~$ ./make_kitchensink.sh

First you’ll be prompted to create a key pair for the newsapps account. It’s best to give a key pair a passphrase, but that caused problems for our automated deployment, so we left it empty. Once you get through the passphrase prompts, you will be shown the public key for the newsapps account.

We use the public key with GitHub and Unfuddle so that we can git clone our app directly on the server. You might use this key anywhere you need the server to connect to your repository for deployment or for automated SSH.

Generating cert...
Enter passphrase (empty for no passphrase):
Enter same passphrase again:

Here is this server's public key (for git deployment)
ssh-rsa XXXXXXXXXXXXXXXXXXXXXXXXXXX newsapps@domU-XX-XX-XX-XX-XX-XX

Next you’ll be prompted for your public key: the key from your development machine, usually located in your home directory at ~/.ssh/id_rsa.pub. Just copy and paste it into the console. The script will add your public key to the server’s authorized keys so you can SSH and deploy to the server without having to provide your Amazon private key.

Enter your machine's public key (for fabric deployment)
ssh-rsa XXXXXXXXXXXXXXXXXXXXXXXXXXX rmark@hurley.local

We use S3 to serve all of our static media content and recommend you do the same. It’s cheap and frees up your EC2 instance to handle more traffic. The Apache configuration on our instance has keepalive turned off. If you want to use your EC2 instance to serve media, you should set up another web server, such as Nginx or Lighttpd, to serve media separately. You can turn keepalive back on, but it’s not recommended.

You’ll be prompted to set up your S3 credentials on the server. As part of our fabric deployment, static media is pushed from the server to S3 using s3cmd. You’ll need to fill this out for the script to finish setting up your server. (You can enter bogus info now if you’d like, and reconfigure it later by running s3cmd --configure.)

The s3cmd configuration will also prompt you for an encryption password. It can be left blank or set to anything you like; it’s not something you’ll need to remember, just something random that helps protect your files on their way to S3.

Accept defaults for the rest.

Configuring S3 caching...

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3
Access Key: XXXXXXXXXX
Secret Key: XXXXXXXXXXXXXXXXXXXXXXXX

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program [/usr/bin/gpg]:

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP and can't be used if you're behind a proxy
Use HTTPS protocol [No]:

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't conect to S3 directly
HTTP Proxy server name:

New settings:
  Access Key: asdf
  Secret Key: asdf
  Encryption password:
  Path to GPG program: /usr/bin/gpg
  Use HTTPS protocol: False
  HTTP Proxy server name:
  HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n]  Y

Save settings? [y/N] Y
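
Once s3cmd is configured, pushing media to a bucket is a one-liner. Our Fabric deployment does something along these lines (the bucket and directory names here are hypothetical):

s3cmd sync --acl-public site_media/ s3://media.example.com/site_media/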

The script will now enable all services and get everything started. If you want to tweak the image, you can start from our snapshot: snap-e9c69d80.

We included Postfix in our server stack, but it’s disabled by default because configuration is kind of complicated and most mail servers do not accept email from EC2 servers for spam reasons. Run sudo dpkg-reconfigure postfix to configure Postfix before running it.

Take it for a spin

We’ve also just released a sample Django application to illustrate how you might put this thing to use, including our project layout, some basic geographic operations, and our Fabric deployment process. Try it out!

Written by Ryan Mark

February 17, 2010 at 3:37 pm

Posted in Infrastructure

Nerd post roundup and preview (plus: our server architecture!)

with 7 comments

Lots of folks have asked us about the tools we use to build and deploy applications at the Trib. Well, now that we’ve got a little time to breathe between the primaries and the build-up to the general election in November, the team has been working on ways to share what we’ve learned.

So far, we’ve written about a handful of solutions to specific programming and deployment problems:

Bonus mini-post: Our production server architecture

Our production environment runs on two Amazon EC2 small instances running Ubuntu Linux. One’s our web server and runs Apache + mod_wsgi, Memcached and Django. The other handles data and runs PostgreSQL, with pgpool (indispensable) to manage connections. We can spin up additional, near-identical (minus Memcached) web servers at-will when the load gets too great, but so far, it’s not been necessary.
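
Pointing Django at pgpool instead of directly at PostgreSQL is just a settings change. A sketch using Django 1.1-era settings (host and database names are illustrative; 9999 is pgpool’s default port, which it proxies through to PostgreSQL on 5432):

# settings.py
DATABASE_ENGINE = 'postgresql_psycopg2'
DATABASE_NAME = 'example'
DATABASE_USER = 'example'
DATABASE_HOST = '10.0.0.2'  # the data server running pgpool + PostgreSQL
DATABASE_PORT = '9999'      # pgpool's default listen port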

Our staging environment is *exactly* the same. Running a mirror-image staging environment enables us to do things like test heavy load in a controlled situation. The idea is that we never want to be surprised when it’s time to roll to production.

One final trick is that we host all static content (images, JavaScript, CSS), plus the widgets and other bits that get served on extremely high-traffic pages like the Chicago Tribune home page, on Amazon S3. It saves our little web server a lot of unnecessarily hard work.

And that’s pretty much it. (Forgive me the lack of a diagram. I always get hung up choosing the perfectly puffy cloud clipart, so I decided to skip it altogether.)

At $0.085/hour for a small instance (~$14/week), plus nominal bandwidth and S3 costs, we burn around $400/month in hosting expenses. If you’re not running an extremely demanding site like the Election Center, you could easily cut that in half by merging your web and data servers. (And if you’re really feeling cheap, you could shut down your staging server when not in use… but please, don’t forgo it altogether. You need a staging environment. Trust me.)
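
(Back-of-the-envelope, in case you’re budgeting: four small instances, a web and a data server each for production and staging, at $0.085/hour comes to roughly $250/month; bandwidth, EBS and S3 make up the rest.)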

That’s it for now, but please, stay tuned! There’s some great stuff coming soon:

  • Using Homebrew to set up your GIS stack on OS X.
  • Best practices in web development with Python and Django.
  • And our server stack, complete with disk images and a recipe to get started!

Written by Brian Boyer

February 10, 2010 at 4:25 pm

Posted in Infrastructure, Python

Refactoring fabfile.py for fast, robust Django deployment

with 2 comments

Now that our Elections Center has seen peak traffic, the team has had some time to depressurize and take stock of what did and did not work well on our largest project so far. One area I felt could particularly be improved was our deployment strategy. Although it got us through the project, our routine deployments were becoming cumbersome, and that was limiting the agility of our development process. To address this, I took on the task of refactoring our fabric deployment script, which led to several new features.

(For reference, we deploy to Amazon EC2: one database server and n web servers. All servers share one EBS volume that hosts the code. Static assets are deployed to S3.)

To go straight to the complete source of our fabfile.py, click here, or read on for the details.

  1. Requirements are now installed into a --no-site-packages virtualenv.
    def setup_virtualenv():
        """
        Setup a fresh virtualenv.
        """
        run('virtualenv -p %(python)s --no-site-packages %(env_path)s;' % env)
        run('source %(env_path)s/bin/activate; easy_install -U setuptools; easy_install pip;' % env)
    
    def install_requirements():
        """
        Install the required packages using pip.
        """
        run('source %(env_path)s/bin/activate; pip install -E %(env_path)s -r %(repo_path)s/requirements.txt' % env)
    

    By using the --no-site-packages flag to virtualenv, each project’s dependencies are siloed from the others. This has the obvious advantage of allowing a new application to rely on a newer version of a dependency (such as Django) without upgrading all the other applications on the server. One less obvious advantage is that it forces us to thoroughly document all our dependencies in a pip requirements.txt file, which ensures that any developer can get a complete set of dependencies by executing pip install -r requirements.txt.
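
    A requirements.txt is just a list of pinned packages; a hypothetical example (the package versions are illustrative, not our actual manifest):

        # requirements.txt
        Django==1.1.1
        psycopg2==2.0.13
        boto==1.9b
        python-memcached==1.44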

  2. Provided a mechanism for deploying an arbitrary commit.
    def rollback(commit_id):
        """
        Rolls back to specified git commit hash or tag.
        
        There is NO guarantee we have committed a valid dataset for an arbitrary
        commit hash.
        """
        require('settings', provided_by=[production, staging])
        require('branch', provided_by=[stable, master, branch])
        
        maintenance_up()
        checkout_latest()
        git_reset(commit_id)
        gzip_assets()
        deploy_to_s3()
        refresh_widgets()
        maintenance_down()
        
    def git_reset(commit_id):
        """
        Reset the git repository to an arbitrary commit hash or tag.
        """
        env.commit_id = commit_id
        run("cd %(repo_path)s; git reset --hard %(commit_id)s" % env)
    

    The version of our deployment script that we used during the development of the Elections Center included a versioning process that would copy the repository into a timestamped folder each time it was deployed. The newest version would then get symlinked to a directory called “current”, which was what the server configuration actually pointed at.

    In theory this mechanism provides for bullet-proof rollbacks; in practice, it rarely worked out that way. A version of the codebase was not always associated with an up-to-date version of our datasets, and even if it was, there was no guarantee that dependencies or other configuration factors had not changed. Moreover, our git repository has grown quite large, and the act of copying the directory was drastically slowing our deployments (not to mention maxing out the available RAM on our database server).

    For this new version the repository is itself the served directory and git provides the versioning. A rollback command allows us to deploy an arbitrary commit and we can manually tag known good versions. This provides a much more practical way to manage versioning without the overhead of copying the directory.
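
    In practice, rolling back to a tagged known-good version looks something like this (the tag name is hypothetical; assumes Fabric’s fab task:argument syntax):

        fab production rollback:known-good-20100209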

  3. An automated maintenance screen when executing a deployment.
    def deploy():
        """
        Deploy the latest version of the site to the server and restart Apache2.
        
        Does not perform the functions of load_new_data().
        """
        require('settings', provided_by=[production, staging])
        require('branch', provided_by=[stable, master, branch])
        
        with settings(warn_only=True):
            maintenance_up()
            
        checkout_latest()
        gzip_assets()
        deploy_to_s3()
        refresh_widgets()
        maintenance_down()
        
    def maintenance_up():
        """
        Install the Apache maintenance configuration.
        """
        sudo('cp %(repo_path)s/%(project_name)s/configs/%(settings)s/%(project_name)s_maintenance %(apache_config_path)s' % env)
        reboot()
        
    def maintenance_down():
        """
        Reinstall the normal site configuration.
        """
        install_apache_conf()
        reboot()
    

    In most cases our caching strategy ensures that a user never sees an error page during a deployment, but there are always edge cases. In order to provide the best user experience possible (especially during emergency deployments) we put in place a system that automatically throws up a maintenance page whenever the application is being deployed (this can also be controlled manually). This is accomplished by swapping in an alternate Apache config, which redirects all traffic to a static page.
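
    The maintenance configuration itself is just an alternate Apache vhost that redirects every request to a static page; a minimal sketch (the domain and paths are hypothetical):

        <VirtualHost *:80>
            ServerName example.apps.chicagotribune.com
            DocumentRoot /home/newsapps/sites/example/maintenance

            RewriteEngine on
            RewriteCond %{REQUEST_URI} !=/maintenance.html
            RewriteRule .* /maintenance.html [R=302,L]
        </VirtualHost>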

Beyond these three major improvements, our script now includes very thorough comments and has been backported into our application template, so our new projects inherit it automatically. One of the great features of fabric is that it uses Pythonic introspection to provide docstrings as help text. Here is the output of fab -l.

Available commands:

    branch                     Work on any specified branch.
    checkout_latest            Pull the latest code on the specified branch.
    clear_cache                Restart memcache, wiping the current cache.
    clone_repo                 Do initial clone of the git repository.
    create_database            Creates the user and database for this projec...
    deploy                     Deploy the latest version of the site to the ...
    deploy_requirements_to_s3  Deploy the latest newsapps and admin media to...
    deploy_to_s3               Deploy the latest project site media to S3.
    destroy_database           Destroys the user and database for this proje...
    echo_host                  Echo the current host to the command line.
    git_reset                  Reset the git repository to an arbitrary comm...
    gzip_assets                GZips every file in the assets directory and ...
    install_apache_conf        Install the apache site config file.
    install_requirements       Install the required packages using pip.
    load_data                  Loads data from the repository into PostgreSQ...
    load_new_data              Erase the current database and load new data ...
    maintenance_down           Reinstall the normal site configuration.
    maintenance_up             Install the Apache maintenance configuration.
    master                     Work on development branch.
    pgpool_down                Stop pgpool so that it won't prevent the data...
    pgpool_up                  Start pgpool.
    production                 Work on production environment
    reboot                     Restart the Apache2 server.
    refresh_widgets            Redeploy the widgets to S3.
    rollback                   Rolls back to specified git commit hash or ta...
    setup                      Setup a fresh virtualenv, install everything ...
    setup_directories          Create directories necessary for deployment.
    setup_virtualenv           Setup a fresh virtualenv.
    shiva_the_destroyer        Remove all directories, databases, etc. assoc...
    stable                     Work on stable branch.
    staging                    Work on staging environment

To get the complete source of our fabfile.py, click here. It depends on fabric 0.9.0.

Have suggestions for improvements? We would love to hear them in the comments.

Written by Christopher Groskopf

February 10, 2010 at 11:19 am

Posted in Infrastructure, Python

Fun with widgets and caching

with 3 comments

In order to spread the reach and usefulness of the Election Center, we made a few different types of widgets – bits of HTML and JavaScript that get dynamically generated and embedded into story pages, blog posts and section pages on ChicagoTribune.com.

Example of an Election Center widget on a story about Pat Quinn

The widgets need to load fast and put no load on our servers. They grab different content links and headlines depending on the context (e.g., a widget on a page about Pat Quinn shows content about the governor’s race).

So we built out a couple of Django views to generate the JavaScript we needed. One view renders JavaScript that determines the context and makes a JSONP call to another view, which delivers the appropriate HTML content wrapped in JSON. The widgets load across domains, so we can’t use any of the typical AJAX magic; we have to use script tags to load the JavaScript.

Here’s an example of the loader and the file it loads:

// Here is the loader.
// We start by including the CSS we'll need to properly display the widget HTML.
var ss = document.createElement('link');
ss.type = 'text/css';
ss.rel = 'stylesheet';
ss.href = 'http://media.apps.chicagotribune.com/elections/site_media/widget.css';
document.getElementsByTagName("head")[0].appendChild(ss);

// Here we define the callback function for JSONP
function tribapps_draw_widget(data) {
    var ele = document.getElementById("tribapps_widget_area");
    ele.innerHTML = data['content'];
}

// .....
// snipped a bunch of business logic to figure out what content to grab for this page
// .....

// write a div to contain the widget
document.write('<div id="tribapps_widget_area"></div>');

// figure out the slug of the content to grab
if ( window["categories"] ) {
    tribapps_slug = slug_for_categories(window['categories']);
} else { // try keywords
    var keywords = tribapps_find_keywords();
    tribapps_slug = slug_for_keywords(keywords);
}

// make a JSONP call to get the appropriate content
if ( window["tribapps_slug"] ) {
    document.write('<script src="http://media.apps.chicagotribune.com/elections/widget/race/'+tribapps_slug+'.js" type="text/javascript" charset="utf-8"></script>');
} else {
    document.write('<script src="http://media.apps.chicagotribune.com/elections/widget/index.js" type="text/javascript" charset="utf-8"></script>');
}

And the JSONP response:

tribapps_draw_widget({"content": "<div>This is WIDGET!!</div>"})

We didn’t want the possibility of melting our servers by pointing the traffic firehose that is ChicagoTribune.com at our small Amazon instance. Our solution was to pre-render all the JavaScript views and store the JavaScript on S3.

So I extended the cache-busting function I wrote about a while ago. Instead of using Django caching, I used Boto, the wonderful Amazon Web Services library, to push the rendered JavaScript to S3.

The function cache_view_in_s3() takes a view function as a parameter, along with a few other optional parameters detailed below, renders the view and saves the response content to a key on S3 that matches the view’s URL in urls.py.

Here is the code:

from django.http import HttpRequest
from django.core.urlresolvers import reverse
from django.conf import settings
from django.core.exceptions import ImproperlyConfigured

import re
import mimetypes

from boto.s3.connection import S3Connection
s3conn = S3Connection(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY)
from boto.s3.key import Key

def cache_view_in_s3(
		view_func,
		args=[],
		view_name=None,
		s3_url=None,
		request=None
	):
	"""
	This function renders a view, stores the response in s3, and returns
	the s3 url. The s3 url will resemble the view's Django url.
	  view_func: Pass the function for the view you want updated.
	  args:		 Array of arguments to pass to the view
	  view_name: (optional) The name of the view in your urls.py. In
				 case django can't find the view with just the function
				 reference.
	  s3_url:	 (optional) Base url to store the cached view. Defaults to
				 settings.AWS_S3_URL
				 Ex: "s3://media.apps.chicagotribune.com/elections"
	  request:	 (optional) An HttpRequest object to be passed to the view.
				 It'll make a simple GET request by default.
	"""
	if not s3_url:
		try:
			s3_url = settings.AWS_S3_URL
		except AttributeError, e:
			raise ImproperlyConfigured('AWS_S3_URL not defined')

	#mockup the HttpRequest if we didn't receive one
	if not request:
		request = HttpRequest()
		request.method = 'GET'
		if view_name:
			request.path = reverse(view_name, args=args)
		else:
			request.path = reverse(view_func, args=args)

	#call the view for content
	response = view_func(request, *args)

	bucket, key_prefix = re.match("^s3:\/\/([^\/]*)\/(.*)$", s3_url).groups(0)
	content_type = response['Content-Type'].split(';')
	if len(content_type) == 2:
		mime_type, encoding = content_type
	else:
		mime_type = content_type[0]

	if len(request.path) > 1:
		key_name = key_prefix.rstrip('/') + request.path
	else:
		extension = mimetypes.guess_extension(mime_type)
		key_name = key_prefix.rstrip('/') + '/index' + extension

	k = Key(s3conn.get_bucket(bucket), key_name)
	k.content_type = mime_type

	k.set_contents_from_string(response.content)
	k.make_public()

	return "http://%s/%s" % (bucket, key_name)

You’ll need to set the following Django settings:

AWS_S3_URL="s3://foobar.com/bucket/" # this is the root where the cache function should store static files.
AWS_ACCESS_KEY_ID=""
AWS_SECRET_ACCESS_KEY=""
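
For example, the fetcher cron job can render a widget view and push it to S3 in one call. A hypothetical usage (the view, its argument and the URL name are invented for illustration):

from election_center.views import race_widget  # hypothetical view

url = cache_view_in_s3(race_widget, args=['governor'], view_name='race_widget')
print url  # e.g. http://media.apps.chicagotribune.com/elections/widget/race/governor.js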

Here’s where you can help: sometimes this function needs the name of a URL pattern (view_name) in order to properly resolve the URL; using just the function reference doesn’t always cut it. I’m not sure why this happens. Any ideas?

Written by Ryan Mark

February 9, 2010 at 4:12 pm

Posted in Infrastructure, Python

Off-process cache invalidation in Django

with one comment

In preparation for our highest-traffic project yet, we’re getting aggressive about caching. The site is going to be pulling a lot of content over web services and RSS, which would quickly break down under any kind of traffic. I moved the web-service calls into a Django management command that runs as a cron job and shoves the content into the database as a pickle. And we’ll just rely on the built-in middleware to cache views.

The fetcher cron job needs to run every minute or two and any change in the content needs to be on the site immediately. But views need to get cached and stay cached as long as possible to handle the traffic.

We could decrease the middleware cache timeout so that pages expire on the same schedule as the content fetcher. But what if only one page of content changed in the last hour? And that one change affected two pages? Every page on the site would have been rebuilt dozens of times!

What we really want is for all pages to stay cached indefinitely (or a very long time), and only the updated pages to get expired and refreshed.

I figured there had to be an easy way to do this by leveraging existing code. Here’s what I came up with. It creates a fake HTTP request, calls the view and passes the response to the caching middleware, forcing it to update the cache. The function is used by the fetcher cron job, so updated pages will always get cached before a visitor hits the page.

from django.middleware.cache import UpdateCacheMiddleware
from django.http import HttpRequest
from django.core.urlresolvers import reverse

def update_cached_view(view_func, args=[], view_name=None):
    """
    Creates a fake HTTP request, calls the view and passes the response
    to the caching middleware, forcing it to update the cache. Assumes
    no special request headers. Should really only be used for public-
    facing anonymous pages. This works whether you're using the cache
    middleware or the @cache_page decorator.
        view_func:  Pass the function for the view you want updated.
        args:       (optional) Array of arguments to pass to the view.
        view_name:  (optional) The name of the view in your urls.py.
                    In case django can't find the view with just the
                    function reference.

    """
    # create a fake request object
    request = HttpRequest()

    # set request._cache_update_cache = True - Checked inside
    # UpdateCacheMiddleware
    request._cache_update_cache = True

    # Need to fake out the request. Only GET requests get cached, and
    # the middleware needs to know the request path to generate the
    # correct cache key.
    request.method = 'GET'
    if view_name:
        request.path = reverse(view_name, args=args)
    else:
        request.path = reverse(view_func, args=args)
    # call the view with the request to get the response
    response = view_func(request, *args)

    # call the UpdateCacheMiddleware to update the cache
    middleware = UpdateCacheMiddleware()
    response = middleware.process_response(request, response)
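
A hypothetical call from the fetcher cron job (the view and its argument are invented for illustration):

from election_center.views import race_detail  # hypothetical view

update_cached_view(race_detail, args=[42], view_name='race_detail')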

This function assumes a lot and will only work if you’re using caching in the most basic way. It also relies on the cache middleware working in a specific way that could change in the future. But it’s a solution to a problem I’ve run into before; hopefully it will be helpful to you.

Written by Ryan Mark

November 2, 2009 at 2:08 pm

Posted in Infrastructure, Python

A quick primer on making software — best practices, tools and further reading

leave a comment »

This post was originally written for an un-conference talk I gave at ONA this year. It’s a rundown of the tools and practices we follow on the News Apps team, cross-posted from my blog.

“Imagine a news organization with only writers, and no editors. They might manage to crank out some successful stories, but without editorial controls, the failure rate would be astronomical.”
Me, a couple of months ago.

Why we do this

You don’t adopt processes because they’re fun, you adopt them because they have special ass-saving properties. Doing it the right way may seem heavy and micro-manage-y, but when the process sings, the unbearable weight of uncertainty is lifted from your shoulders. This is freedom through tyranny.

A seasoned developer won’t find much of what follows particularly interesting. This is elementary, but it’s stuff that seemed worth talking about…

A few baseline requirements for anyone making and releasing software

Version control

Version control software is both a safety net and a collaboration tool. It’s a place, off your machine, to keep your code, and when you update the code, it keeps your previous version(s). So, even on a one-person project, it’s essential. When your hard drive crashes, you don’t lose your work. And, when you’re working with others on a single codebase, version control gives you a central repository to coordinate everyone’s changes.

Task tracking

Task tracking is not about micromanagement (or, at least, it doesn’t have to be). You’ve gotta be able to see the tasks on the docket so that you know how deep in the weeds you really are. Also, forgetting to do something is really embarrassing. You can track tasks in a spreadsheet, but that’s not very visible to the team. Instead, go low-tech — 3×5 cards pinned to the wall — or high-tech, with one of the many software packages designed for the purpose.

Defect tracking

When you find a problem: log a defect. Take a screenshot, and give sufficient details to reproduce the problem. Defects are your unplanned tasks; they must be addressed, either by fixing them or by choosing to let them slide as a known defect, which is totally okay. Unknown defects, on the other hand, are the devil. Always, always record your defects, even if the very next thing you’re going to do is fix it. You *will* be distracted. You *will* forget. Defects are pickier than tasks, and are best tracked with software.

Staging environment

Like defect tracking, having a staging environment is about reducing uncertainty. It’s an environment running in parallel to production, set up to be as identical to the production system as you can make it. (If you’re using Amazon EC2, this is pretty much as simple as copying your production instance!) Your goal is this: knowing that, if your application works in staging, it will work in production. You can run load and performance testing against your staging environment, test your deployment scripts and, as a bonus, it makes a nice place to demo your work before it’s finished.

Push-button deployment

When everything is running smoothly, a multi-step deployment process (grab the latest code, FTP it all to the server, restart Apache, etc.) doesn’t seem like much of a hassle. But when the shit hits the fan, folks are freaking out and you’ve gotta redeploy half-drunk on a Friday night, you’ll screw it up. You’ll forget something, and your minor bug will become a total cluster****. But if you’ve got push-button deployment, you can’t miss. If you’ve got an identical staging environment, you’re even better off, because you can develop your deployment script for staging, use it a few dozen times, and then when it’s time to roll to production, you know it’ll work.
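
With a Fabric setup like the fabfile described earlier in this archive, the whole deployment dance reduces to a single command:

fab staging deploy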

The tools we use

Further reading (please, read further!):

Written by Brian Boyer

October 23, 2009 at 11:12 am

Posted in Infrastructure, Python