Why is Change Management Hard?

No matter how much we hope otherwise, the foundation of any security program is a set of procedures that are consistent and actually used. This means figuring out what we need to be doing, sketching out how we think we should be doing it, finding out we were totally wrong and misguided in thinking we could do it that way, then editing it into something that actually works.

I knew walking into this job that procedure¹ generation like this would be a necessary part of it. Once I was settled in enough with the various aspects of the company to know what's going on, I sat down to work on generating some of those documents. I banged out a couple without too much effort before landing on the topic of Change Management.

I really struggled with this one. In order for change control to work there must be a certain amount of formality. I've participated in environments where scheduling a change consisted of shouting out your office door, "Wiki is going down for maintenance upgrades!" This breaks down because someone might not hear you, and 3 months from now, when we discover backups had stopped working, nobody will remember who did the work, when it happened, or whether the backup process had been checked. Too much formality can be just as bad. If the completely formal, fully ITIL-compliant change process requires scheduling a maintenance window 4 weeks in advance, rollout plan, rollback plan, test plan, peer review, and customer then your admin will be pretty bent when upgrading the wiki consists of running yum update -y wiki-soft. In this case don't be surprised if your entire process gets ignored for anything except super scary work.

Finding the right balance is really tough because, quite honestly, the level of formality depends on the overall risk impact of the change compared against the risk appetite of your business. However, writing your process in such a way as to account for differing levels of formality results in a very formal process!

Striking this balance is tricky, and it is really making me think hard about what a Change Management program is trying to accomplish. At the core I finally settled on:

Limit the risk of conflicting work, i.e. multiple people doing different things at the same time that would affect each other.
Document what happened when so we can more easily troubleshoot problems.
Know when things are going to happen.

The benefits of the first two are pretty obvious to anyone who's worked in big shops before; the bigger the shop, the worse the problem becomes. The last one is more about a holistic communication model and is primarily focused on the customers. If we know a service will go down for 45 minutes on Thursday, we can shoot an email out to our clients or post it on a web page. That way, if they try to use the service and it doesn't work, they don't get freaked out. Even more importantly, the helpdesk, who are normally completely left out of things like this, can answer calls with a soothing and confident, "Our engineers are performing maintenance on that service, they expect to be finished within 15 minutes."

Fortunately we can address all of those pretty informally, since it all comes down to who does what when. Personally I lean a little more towards that end of the spectrum, but the risk-averse side of me really digs the comfort of a good rollout plan. I still haven't fully baked what we need, but the more it stews the more I'm convinced that for my team less formality is better, so that we stay flexible. We aren't what I would call a properly Agile office, but we have definitely adopted some of those values in our work.
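To make "who does what when" a bit more concrete, here is a rough sketch of what an informal change log could look like. This is not our actual process; the Change and ChangeLog names and their fields are made up for illustration, and it makes the simplifying assumption that two changes only conflict when they touch the same service during overlapping time windows.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Change:
    """One entry in a hypothetical, deliberately informal change log."""
    who: str              # person doing the work
    what: str             # free-text description of the work
    service: str          # the thing being touched, e.g. "wiki"
    start: datetime       # planned start time
    duration: timedelta   # expected length of the work

    @property
    def end(self) -> datetime:
        return self.start + self.duration


class ChangeLog:
    """Keeps the record of who does what when."""

    def __init__(self):
        self.entries = []

    def conflicts(self, new):
        # Assumption: changes conflict only if they touch the same
        # service and their time windows overlap.
        return [
            c for c in self.entries
            if c.service == new.service
            and c.start < new.end
            and new.start < c.end
        ]

    def schedule(self, new):
        # Record the change only if nothing clashes; either way,
        # return the list of clashes so people can go talk it out.
        clashes = self.conflicts(new)
        if not clashes:
            self.entries.append(new)
        return clashes

    def upcoming(self, now):
        # Work that has not finished yet, earliest first. This is the
        # list the helpdesk or a status page would want to see.
        pending = [c for c in self.entries if c.end > now]
        return sorted(pending, key=lambda c: c.start)
```

Scheduling a 45 minute wiki upgrade and then trying to schedule a second wiki job inside that window would hand back the first entry as a clash, while work on an unrelated service goes straight in. The same list covers all three goals: it flags conflicting work, it records what happened when, and it tells everyone what is coming.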

I would love to hear stories of how people have made change management work in smaller teams. Leave comments below or on that newfangled Twitter thingamajigger.

¹ I should really be calling this policy. I'm still purging myself of the Pavlovian responses drilled into me as a public employee, so it's still difficult to say the word "policy" without thinking of a multi-year draft->publication process that involves as much horse-trading as a typical Senate bill.