A Message for Drupal Developers: Know Thy Database

November 30th, 2011

Content Management, Technical

One of the challenges when developing a Drupal site is coordinating development tasks between multiple developers. Frequently, a single developer is responsible for building a site from concept to completion, including both front-end and back-end code. This model is great for smaller, simpler sites; in many cases a good Drupal developer can finish a site in a few days. However, it fails miserably when multiple developers are required to work on a site. In one way or another, that failure can be traced back to a common problem: improper database management. Understanding the role of the database is critical to creating a successful development, staging, and file versioning process.

The Drupal development process

Let’s start by defining each work stream that occurs throughout the Drupal site building process:

  • Site building is the process of using the Drupal UI to create content types, fields and views, and configuring core and contrib modules. Depending on the Drupal experience of the themer, building may also involve creating tpl.php files, preprocess functions, and other changes to template.php.
  • Site theming is the process of creating a Drupal theme, which includes the following tasks:
    • Creating tpl.php files and regions
    • Modifying a theme’s .info and template.php file
    • Adding .js and .css files as required
    • Updating views
  • Custom module development (CMD) is the process of creating a new Drupal module for a specific piece of functionality, which includes using Drupal’s hooks and APIs, writing PHP functions, and integrating with external systems. Module development often happens during site building, and while there is no hard line between the two, a good rule of thumb is that CMD should be defined during the statement of work (SOW) and/or requirements phase.

Site development weaves through each of these steps depending on the developer’s skill set and knowledge of Drupal; a Drupal developer may engage in theming tasks and vice versa.

The art of synchronizing databases

Q: What did Drupal say to the database?

A: Wanna hook_update?

As Drupal developers, we know it’s easiest to enable and configure modules and basically build a site through the Drupal UI, pushing codebase + database from one place/person to the next. However, this presents a set of problems when multiple developers and themers are working on a site, especially after a site is in production. So, the question becomes: how do you merge changes without pushing a database?

If you’ve worked with Drupal for any amount of time, you’ve probably heard of Features, Strongarm, and Context, which allow you to capture and package configuration settings (i.e., exportables, blocks, variables) in code as modules. You can also use install profiles, including hook_update_N and hook_install, for those configuration settings you can’t capture through Features. Install profiles are also good for syncing databases between local and dev environments. You may also want to check out Spaces.
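As a rough sketch of what that looks like in practice, assuming Drupal 7 and a hypothetical custom module called mymodule, an update hook in mymodule.install might enable a module and set a variable that you’d otherwise click through the UI to configure. The module name, the update number, and the front page path are placeholders for illustration, not anything from a real project:

```php
<?php
/**
 * @file
 * mymodule.install (hypothetical module name): site configuration kept in code.
 */

/**
 * Enable Context and set the site's default front page.
 */
function mymodule_update_7001() {
  // module_enable() returns FALSE when a required dependency is missing,
  // so fail the update loudly instead of carrying on half-configured.
  if (!module_enable(array('context'))) {
    throw new DrupalUpdateException('Could not enable the Context module.');
  }

  // A setting that would otherwise be changed by hand on every environment.
  // 'node/1' is an assumed path used only for this example.
  variable_set('site_frontpage', 'node/1');

  return t('Enabled Context and set the front page.');
}
```

Running update.php (or drush updatedb) on each environment then applies the same change everywhere, and the change itself lives in version control instead of in somebody’s database dump.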

After a site has gone into production, the best practice is to leave the prod database alone. The only time you should push a database from dev to QA to prod is when the site is offline and under maintenance (even then, a Drupal snob like myself would say pushing a db isn’t necessary if it’s done right; you get the point). This means a local database should be a mirror of the production database, plus whatever changes your enabled modules and install profiles apply on top.

A good follow-up discussion would revolve around how these practices apply to the use of Aegir.

Drupal staging and file versioning

The concept of reusable code in Drupal is closely related to staging and file versioning. Staging refers to the process of developing code and pushing changes to production. A version control system (CVS, SVN, Git) is used to manage the staging process, pushing code from local to dev to QA to prod.

This brings us first to the topic of database staging best practices. Database management is tricky when multiple developers are working on a site. Having one developer and one themer is hard enough, especially if databases are pushed from one developer to another; development becomes very linear and impossible to branch. Having a shared development database (i.e., a primary database that all developers use to implement changes) can be useful, allowing developers to test changes on their local environment before pushing to dev. However, this presents another set of problems: reverting the primary database to a previous version deletes other people’s work. This also comes into play on a production environment, where pushing a database to production results in a loss of data (e.g., logs, user-created content, ecommerce transactions).

For the record, I don’t like pushing databases from one environment to another (i.e., from a local install to trunk). This results in a loss of data and is impossible to branch and merge. However, I recognize this is a common practice and, in some cases, drastically reduces development time; pushing a database is much faster than writing a module that does the same thing. The major pitfall is that only one developer can have read/write access to the database of record while a database is pushed (keep in mind that “pushing a database” can take a few minutes to several days). If we are to push databases, we should only do it during development and have a process for “checking out” a database.

Back to versioning: remember to commit and update to HEAD often, and talk with others when you’re thinking of pushing a database.

In Closing

Once we move to a code-centric development process, we can start to understand staging best practices in Drupal. Feel free to leave a comment below regarding anything Drupal. I’ll be sure to get back to you.

RJ Townsend


2 Responses to “A Message for Drupal Developers: Know Thy Database”

  1. There isn’t a magic bullet to this problem, but we’ve approached it using (what I termed) Yin/Yang Staging: http://drupal.org/node/942540

    We created two webtrees called “yin” and “yang”; one is mapped to production and the other to development. One script backs up the old development environment and then repopulates it from production. A second script can instantly switch which is production and which is development (by changing two symlinks). This way, we’re always sure we’re not losing any data or running the risk of overwriting the production environment.

    We use this system in two ways: first, we can always test proposed changes in a sandbox and then reapply them manually to the production environment knowing that they won’t cause problems. Second, if we need to make major changes, we make the changes to the development environment and then switch it to be the production environment. And if we discover any show-stoppers, we can quickly back out by just running the script again. It’s worked pretty well for us and has allowed much more active development than would otherwise have been feasible.

  2. RJ Townsend says:

    “There isn’t a magic bullet to this problem”

    I’m *really* hoping this will be solved in D8; content (i.e., database) staging is on the list for Dries and D8: http://buytaert.net/starting-to-work-on-drupal-8. The main issue we deal with, and why we’re pushing for a “code only” approach to Drupal, is that we want to be able to version everything through SVN/Git. This is critical (for us) to managing development between multiple developers and staging environments, especially when developers are in multiple locations.

