Migrating from one qmail server to another

If you've found this page you've probably been using qmail for a while. That means you've probably experienced the pain and suffering associated with migrating a site from sendmail.

What about migrating from one qmail server to another? Surely that can't be so bad?

Indeed not. I'm pleased to report that thanks to the cunning design of qmail's various component parts, a qmail->qmail switchover is in fact dead easy. Here's what you do.

Update 2005-05-13: You may wish to read Brian Glenn's alternative method for completing this procedure.

Assumptions

Let's say you're going to replace the server old.example.org with new.example.org. You've already configured your DNS servers to give out an IP address for new, and they of course already give out the same address for mail.example.org, smtp.example.org, pop3.example.org and whatever else you want as for old.

In ~qmail/control, I will assume that the contents of me are unique to each server. That is to say that on old you will have old.example.org in me and that on new you will have new.example.org.

For the sake of argument, I will also assume that all your users' home directories live in /mail. The only reason I make this assumption is to make some of the commands easier to illustrate. As long as you know where everything is you will have no trouble moving it over.

The procedure

Stage zero

Before you start, it's important to configure both servers to accept mail for each other. If a message arrives at old with a recipient address that explicitly includes @old.example.org, you don't want new to bounce it when it is eventually passed on there.

Add old and old.example.org to ~qmail/control/locals and ~qmail/control/rcpthosts on new. Add new and new.example.org to ~qmail/control/rcpthosts on old but not to locals.
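Here is a rough sketch of that step as a small shell helper. The names CONTROL and add_host are mine, purely for illustration, and I'm assuming the stock /var/qmail/control location; point CONTROL at wherever your ~qmail/control really lives.

```shell
#!/bin/sh
# Sketch only: CONTROL and add_host are illustrative names, not part of qmail.
# /var/qmail/control is the stock location; adjust it to match your setup.
CONTROL=${CONTROL:-/var/qmail/control}

# add_host FILE HOST...
# Append each HOST to FILE unless a line exactly equal to it is already there.
add_host() {
    file=$1; shift
    for host in "$@"; do
        grep -qxF -- "$host" "$file" 2>/dev/null || printf '%s\n' "$host" >> "$file"
    done
}

# On new:
#   add_host "$CONTROL/locals"    old old.example.org
#   add_host "$CONTROL/rcpthosts" old old.example.org
# On old (rcpthosts only, not locals):
#   add_host "$CONTROL/rcpthosts" new new.example.org
```

qmail-smtpd reads rcpthosts afresh for each connection, but qmail-send only rereads locals when it receives a HUP signal, so send it one on new after editing.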

Stage one

Duplicate the user accounts from old on new.

Make sure all the users on old who are supposed to receive mail have their home directories and maildirs set up on new and that the permissions are correct.

If you're using my MySQL patches you will need to duplicate the qmail database as well. Then you can run (as root) a simple script such as the following to set up the new accounts for you:

echo "select concat(\"maildirmake '\", home, \"' && chown -R \", uid, \":\", gid, \" '\", home, \"'\") from mailbox" | mysql -s -N qmail | sh

Watch those quotes!

Stage two

Set the sticky bit on all users' home directories on old.

chmod +t /mail/*

qmail will think the users are all editing their .qmail files and will queue all local mail instead of attempting to deliver it.

By doing this we create a "snapshot" of the server's state. Any mail in any user's maildir at this point is guaranteed to have been there before we started the migration.
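Before moving on it's worth checking that no home directory was missed. A minimal sketch, assuming a find that understands -mindepth and -maxdepth (GNU and BSD find both do); not_sticky is just a name I made up:

```shell
#!/bin/sh
# not_sticky DIR
# Print the directories directly under DIR that do NOT have the sticky bit set.
# No output means every home directory is "frozen".
not_sticky() {
    find "$1" -mindepth 1 -maxdepth 1 -type d ! -perm -1000
}

# Example:
#   not_sticky /mail
```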

Stage three

Change the DNS records.

Update the A records for mail and friends so that they point to the same IP address as new.
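To see whether a given resolver has picked up the change, you can compare what the service name and new resolve to. This is only a sketch: lookup and same_address are illustrative names, and I'm assuming dig is installed. It tells you about the resolver you happen to be asking, not about caches elsewhere.

```shell
#!/bin/sh
# lookup NAME
# Print the A records for NAME as seen by the local resolver, sorted.
lookup() {
    dig +short "$1" A | sort
}

# same_address NAME1 NAME2
# Succeed if both names resolve to the same, non-empty set of addresses.
same_address() {
    a=$(lookup "$1")
    [ -n "$a" ] && [ "$a" = "$(lookup "$2")" ]
}

# Example:
#   same_address mail.example.org new.example.org && echo "mail now points at new"
```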

Now, most incoming mail will be delivered to the new server. Of course it will take a while for the DNS changes to propagate around the internet and your old server will still receive a few connections. Any local messages will be queued up (I'll talk about that later) but if you like you can have old start forwarding messages on to new straightaway.

To do this, remove old, old.example.org and example.org from ~qmail/control/locals. Remove ~qmail/control/virtualdomains if it exists. If you are using my MySQL patches, run the following two queries:

update alias set alias_host='new.example.org' where alias_host='';
delete from virtual;

To be on the safe side, you might like to run

insert into rcpthosts select virtual_host from virtual;

before you run the delete, but this shouldn't be necessary if you've been keeping your database in good order. It's just to make sure that any domains you'd forgotten to add to the rcpthosts table will still be accepted for relaying to new, since strictly speaking you don't need a rcpthosts entry for a domain in virtual.
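If you keep your configuration in the plain control files, taking the hosts out of locals might look something like this. It's a sketch under my own assumptions: drop_host is an illustrative name, and the example path assumes the stock /var/qmail/control location.

```shell
#!/bin/sh
# drop_host FILE HOST...
# Rewrite FILE without any line that is exactly equal to one of the HOSTs.
drop_host() {
    file=$1; shift
    for host in "$@"; do
        # grep exits 1 when it prints nothing, which is fine here;
        # any other non-zero status is a real error, so give up.
        grep -vxF -- "$host" "$file" > "$file.tmp" || [ $? -eq 1 ] || return
        mv "$file.tmp" "$file"
    done
}

# Example, on old:
#   drop_host /var/qmail/control/locals old old.example.org example.org
```

Remember that qmail-send only rereads locals and virtualdomains when it receives a HUP signal, so the change does nothing until you send one.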

Stage four

Move mail from old to new.

One easy way to do this (if you have GNU tar) is:

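cd /mail
# Stream everything to new; $? below is the exit status of the ssh side of the pipe.
tar cf - . | ssh new tar xf - -C /mail
# Only if the copy succeeded, delete the transferred messages from old.
[ $? = 0 ] && find . -type f -name '[0-9]*[0-9].old.[0-9]*[0-9]' -exec rm {} \;
# Quote the glob so that it is expanded on new, not on old.
ssh new 'chmod -t /mail/*'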

Alternatively you could NFS mount new and move the files over.

Now all the users' mail from old has been transferred to new (note we had to unset the sticky bit from the directories on new, which was set by the tar operation). You can very easily tell which mail on new has been imported from old because the messages in the users' maildirs will match the regular expression

/^\d+\.old\.\d+$/
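The same test can be run from the shell on new. A small sketch (imported is a made-up name); note that grep -E has no \d, so the pattern is spelled with [0-9] instead:

```shell
#!/bin/sh
# imported DIR
# Print every file under DIR whose name looks like <digits>.old.<digits>,
# i.e. a message that was originally delivered on old.
imported() {
    find "$1" -type f | grep -E '/[0-9]+\.old\.[0-9]+$'
}

# Example: how many messages came across from old?
#   imported /mail | wc -l
```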

Users making POP3 connections to old will wonder where their mail has gone but as soon as their DNS sorts itself out they will see their mail waiting for them on new, along with any messages that arrived while you were doing the changeover.

Those users who ignored your polite requests to trim their mailbox size from time to time will get a nice surprise when their multi-megabyte collection of mail shows up again in the new mailbox. UIDL won't save you now, mwahahahahahaaaaaaa!

Stage five

Unset the sticky bit on all users' home directories on old.

chmod -t /mail/*

Local deliveries will now happen again. There will be some messages received while the DNS changes make their way round the world. These messages can now be delivered normally. Later, you can reset the sticky bit while you move these new messages over to the new server.

Stage six

Zzz...

Messages are still coming in. The remote queue still has messages in it. You can't power down old just yet. Wait for a sufficient buildup of new messages and move them to new (don't forget to set the sticky bit correctly on both servers). Check on the remote queue from time to time.
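To decide when another pass is worthwhile, you can count what has piled up in the maildirs on old. This sketch assumes the usual maildir layout with new and cur subdirectories somewhere under each home; waiting is just an illustrative name.

```shell
#!/bin/sh
# waiting DIR
# Print the number of message files sitting in any new/ or cur/ directory under DIR.
waiting() {
    find "$1" -type f \( -path '*/new/*' -o -path '*/cur/*' \) | wc -l | tr -d ' '
}

# Example, on old:
#   waiting /mail
# For the queue itself, qmail ships with qmail-qstat (a summary count) and
# qmail-qread (a per-message listing), normally found in /var/qmail/bin.
```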

If you're still getting several connections it may be that someone stupidly configured their mail client to use the unique hostname (old) of your server or - even worse - its IP. You'll probably have to pester these people, if they're your customers.

Brian Glenn's alternative solution

Brian T Glenn writes to offer an alternative migration strategy that avoids having to do several passes of stickying user maildirs and moving message files to the new server.

It seems like you are setting the sticky bit on the old server and leaving it sit there while you copy users' mail over to the new server. This will cause all of that mail to queue up in the local queue instead of the remote queue. Once everything is moved and the sticky bit is released, all of that mail will be delivered locally to old.

A better option would be to remove the domain from locals and virtualdomains and set concurrencyremote to 0. This will cause the mail to queue up in the remote queue. This combined with an smtproute to the new server will ensure that all users' mail is delivered to the new server without any complex maildir or mbox merging.

This method would require that the new server was ready to service requests, but had the sticky bit set on the homedirs. This would cause the local queueing on the new server rather than the old. The mailboxes could then be tarred or rsynced, and once that is complete, both queues could be released for processing.
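For illustration, the control-file side of Brian's suggestion on old might be sketched like this. This is my reading of his description rather than something he supplied: hold_and_route is a made-up name, and it only covers concurrencyremote and smtproutes, not the removal of the domain from locals and virtualdomains.

```shell
#!/bin/sh
# hold_and_route CONTROLDIR DOMAIN RELAY
# Stop remote deliveries and add a route sending mail for DOMAIN to RELAY.
hold_and_route() {
    ctl=$1 domain=$2 relay=$3
    # 0 concurrent remote deliveries: remote mail stays in the queue.
    printf '0\n' > "$ctl/concurrencyremote"
    # smtproutes lines have the form domain:relay
    printf '%s:%s\n' "$domain" "$relay" >> "$ctl/smtproutes"
}

# Example, on old (stock control directory assumed):
#   hold_and_route /var/qmail/control example.org new.example.org
```

As far as I know qmail-send reads concurrencyremote only when it starts, so it needs restarting for the hold to take effect, and again when you release the queue by removing the file or restoring a non-zero value (the default is 20).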

I don't think that multiple iterations of letting the mail queue up, delivering it locally and moving it over to new with the directories sticky is so terrible. It certainly worked for me when I did the work that prompted this article's creation. Brian's way does sound more elegant, however, as it avoids "trapping" mail in the old queue until the next batch of moving. I recommend taking his thoughts into consideration if you are planning a migration yourself.