Builds & DevOPS – { Mujtaba Hussain }

Monstrous Builds and DevOPS!

On the second and final day of the YOW 2010 conference, there were a few management-oriented sessions in the evening, a few technically descriptive ones in the morning and two very nice, focussed ones in the middle. These were the DevOPS session presented by Michael T. Nygard and one about taming large builds by Chris Mountford.

Chris started by talking about how they work at Atlassian, particularly on JIRA, and the problems they have testing and managing the tests. He showed us a few screenshots from several of their builds, and frankly they were monstrously large, especially the JIRA one, which had grown even larger between the time he prepared his presentation and the time he gave it. He mentioned some very nice techniques for preventing, or at least managing, monstrously large builds, because they are very bad. Why? Because builds are there to give you better feedback, and quickly. A build that takes 10 hours is of no use in an Agile world. Another thing, which might interest the business owners more, is that large builds have long turnaround times, so the developer needs more time to fix the build, which is time she or he is not spending on the work that the business wants done. The one thing I noticed that Atlassian does and we don’t is Canary Builds. But soon, I think we will. A very good, informative session, and slightly reassuring in that we are not the only ones wrestling with huge monstrous builds.

It was refreshing to have another session, Michael's DevOPS one, that was based on actual events and the lessons that were drawn from them. I have seen many speakers over the last two days conjure up scenarios to fit the example on the upcoming slide, but this was very nicely structured around real-world events, with pictures as proof. The whole session was basically an operational and development postmortem of a few events, and of why those events demonstrate the value of having no barriers between developers and operations.

Michael then went on to describe how they worked alongside developers to diagnose chaotic incidents that had led to downtime: how they went step by step through the possible causes, how they actually constructed a Kanban wall without realising it, and how process went out of the window, with both good and bad consequences. He also reflected on how that defined, to an extent, the intersection of the roles of a system administrator and a software developer.

All in all, of all the talks over these two days, these two are the ones I enjoyed the most. Practical, simple, and extremely well presented.