At some point in the past few years, all of us were sold on the idea that boxing people into roles was not ideal. It was not so much that we wanted everyone to be a polyglot, but that we wanted people to care about more than just what they had worked on. For example, a developer should care about:

1. Page performance, error reporting etc when developing relevant parts of the application.
2. How the application she/he developed is performing in production.
3. How the infrastructure is created and maintained that serves as production.

And so on and so forth.

What I have noticed recently (though it may well have existed for much longer) is that while developers are leaning towards understanding the state of the world as it exists post deployment, we are perhaps not allowing sys admins an opportunity to contribute pre deployment. The extent of their involvement, as it stands, is during the inception phase: designing the infrastructure (thank you, AWS) and possibly informing the development team about setting up endpoints for monitoring. At no point are we allowing them the opportunity to write code.

The unidirectional flow emerges from a reasonable assumption. Knowledge of how an application is going to be deployed and used is, at times, very beneficial for structuring the application itself. And a developer has a better chance of quickly pinpointing the source of any errors that emerge in use. All these perceptions, and more, have correctly started the thought process that a developer _should_ care and know more about things like deployment, monitoring etc. But it's a two-way street.

One of the other aspects of the DevOps movement has been the refusal to silo people. And while the above reasons point out why siloing developers is bad, siloing ops is still happening to some extent, in my opinion. I look around and I can count on one hand the number of sys admins who are writing code that has nothing to do with an AWS stack or a monitoring plugin. We are somehow still preventing operations from pairing on plain and simple business logic development. This is the case at many companies. The usual answer is an explanation of the workload that requires pure operational expertise. That usually happens when:

1. Sys admins are still treated as a separate pool of knowledge which feature teams regularly pull from in order to get work done. They are not embedded into feature teams as a resource for purely that team.

2. A business is doing so much work that it is impossible for a sys admin to focus on expanding her/his development skills. So they must concentrate on operational exercises, and actually require help from developers to pick the low-hanging fruit while they do the more important stuff.

There are other reasons, but the above are the main ones that I keep hearing. Both are not reasons but excuses, which can be worked on in much the same way as we convinced developers that caring about prod is good. But the lack of that thought process is what's troubling. We are not even willing to think about our day-to-day work in a different manner which would, in the long term, allow sys admins to contribute more towards development.

A question was asked recently when I was discussing this issue with some people as to the benefit of a sys admin pairing on development that is not ops focussed. I see many:

1. Continuously working on different areas of the application would give a sys admin an in-depth idea of what was important and what was not. This can feed into monitoring prioritisation.

2. Applications can be structured better to deliver things like faster page times, better DB structures and more efficient use of external dependencies based on the experience and knowledge of the sys admin.

And many more.

There is also the case that we are neglecting the people who might want to learn more about software development. Not all, but there are definitely those who do want to. And if the reason why they cannot or don’t want to participate is that their workload does not allow them to, then that is an issue much bigger than the propagation of DevOps culture. Hiring good engineers is hard, and both sets of people, developers and system administrators, fall into this category of engineers, IMHO. If you simply wanted to hire anyone, that's not hard. Hiring someone good, well, that's a different thing altogether.

I understand that there are obstacles to achieving the goal of helping sys admins contribute to writing non-operational code, but we should at least acknowledge that there is a gap and do something about it. Treat them as any development resource, just as we are starting to think of developers as possible operational resources.

It's time we made this stream bidirectional.

In the recently released realestate.com.au/share application, one of the most consistent exceptions we received was an “ActionView::MissingTemplate” exception because someone or something was hitting the following URLs:

http://www.realestate.com.au/share/share+accommodation-some-location.xml

http://www.realestate.com.au/share/share+accommodation-some-location.zip

The logical decision was to rescue these exceptions in the ApplicationController itself and serve the generic 404 page. So initially, we had a simple rescue block:

rescue_from "ActionView::MissingTemplate" do |exception|
  respond_to do |format|
    format.any {
      render "error_pages/404",
        :status => :not_found
    }
  end
end

This kept raising an ActionView::MissingTemplate exception because, even though we were trying to respond to any format, Rails still expected the 404 template to exist in zip or xml format. Having a symlink such as 404.xml.erb or 404.zip.erb is also a bad solution because the expectation of the format cascades, i.e. any views rendered within the 404 would also be expected to have that format, and generally that's a bad idea. So I decided to simply override the request format in this particular case and save myself the trouble. Hence:

rescue_from "ActionView::MissingTemplate" do |exception|
  request.format = :html
  render "error_pages/404",
    :status => :not_found
end
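Why forcing the format helps can be illustrated with a toy model in plain Ruby. This is only a sketch and not how Rails resolves templates internally: the `TEMPLATES` table, `render_template` and `render_404` are hypothetical names invented for the illustration, and the local `MissingTemplate` class merely stands in for `ActionView::MissingTemplate`.

```ruby
# Toy model of format-scoped template lookup. NOT Rails' real resolver.
class MissingTemplate < StandardError; end

# Hypothetical template store: only an HTML version of the 404 page exists.
TEMPLATES = {
  ["error_pages/404", :html] => "<h1>Page not found</h1>"
}.freeze

# Look a template up by name AND format, as the post describes.
def render_template(name, format)
  TEMPLATES.fetch([name, format]) do
    raise MissingTemplate, "#{name} has no #{format} template"
  end
end

# Render the 404 page, optionally overriding the requested format,
# which mirrors `request.format = :html` in the rescue block above.
def render_404(request_format, force_html: false)
  format = force_html ? :html : request_format
  render_template("error_pages/404", format)
end

begin
  render_404(:zip)                        # lookup is scoped to :zip
rescue MissingTemplate => e
  puts "still failing: #{e.message}"
end

puts render_404(:zip, force_html: true)   # falls back to the HTML page
```

In this model, the error handler itself fails for a `.zip` request until the format is overridden, which is the behaviour described above.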

I work for realestate.com.au and recently we launched a new section of our site called Share. Previously, it existed in a physical data centre as a Perl application, but it now exists as a Rails app in the cloud using AWS. The re-platform was achieved in a mere 7 weeks and was a direct result of the hack day culture at realestate.com.au. But that's not what I am, at this moment, going to talk about.

One of the features of the re-platform was the build pipeline that we constructed: a suite of tests which leads to building an RPM, which gets deployed to an EC2 instance, out of which we bake an AMI, which then forms the basis of our active and inactive production environments. But the main beauty of this pipeline was that a commit from the development environment (which in our case is our/my laptop) goes straight to the inactive production. And since anyone, and I mean anyone, in the team can switch inactive to active, there is a very real chance that a commit made now goes straight to production. This has not been done anywhere in my company up until now nor, as far as I know, anywhere in Australia. The only other company I know of that does this is Etsy (and they do it extremely well).

This aspect of development was certainly new to me, and it definitely produced a new way of working for me. The immediate effect was that I began thinking about what was covered by good tests and how much. Seems simple, doesn’t it! This can only be a good thing. No person should live behind the walled garden of tests written by someone else. A feature of our development is that every time work on a feature is done and we hand over to a person doing QA, we not only hand over the completed feature but also have to show the test coverage and justify the tests.

Another habit this developed in me was ensuring that I am always pairing on the most important business features of the application. When I am changing some technical aspect of the code, for example, when I am changing

some_array.map {|x| x.something}

to

some_array.map(&:something)

I really don’t need to have a pair as I am changing how something is calculated and what is calculated remains the same.
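That the two forms above compute the same thing can be checked in a few lines. This is a minimal sketch; `Item` is a hypothetical stand-in for whatever objects `some_array` really holds.

```ruby
# Hypothetical element type with a `something` reader.
Item = Struct.new(:something)

some_array = [Item.new(1), Item.new(2), Item.new(3)]

with_block  = some_array.map { |x| x.something }
with_symbol = some_array.map(&:something)   # &:something builds the same block via Symbol#to_proc

puts with_block == with_symbol   # true
```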

But when I am changing some core business feature, for example what determines whether a listing is active, i.e.

def published?
  self.status == :active
end

becomes

def published?
  self.status == :active && self.expiration_time > DateTime.now
end

This is a very core change in business functionality, and because I know that this change goes straight to production, I KNOW I want a pair to check and double check this. The temptation is to get it over with because it looks simple; when we work in the walled garden of other people writing tests, we change the code, write a bunch of simple tests and there you go. And to be honest, there are times when I have been tempted by this behaviour. The idea of me typing

[~:] $ git commit && git push

And it going straight to production has definitely made me conscious of how careful I have to be about the implications of my code changes, from the trivial to the non-trivial.
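As an illustration of the kind of checks a pair might insist on for the published? change above, here is a minimal sketch in plain Ruby. The `Listing` struct is a hypothetical stand-in for the real model, the injectable `now` argument is an assumption added to make the expiry boundary testable, and the date is arbitrary.

```ruby
require "date"

# Hypothetical stand-in for the real listing model, which the post does not show.
Listing = Struct.new(:status, :expiration_time) do
  # `now` is injectable (an assumption, not in the original) so that
  # the expiry rule can be exercised deterministically.
  def published?(now = DateTime.now)
    status == :active && expiration_time > now
  end
end

now     = DateTime.new(2012, 1, 15, 12, 0, 0)   # arbitrary fixed instant
live    = Listing.new(:active, now + 1)         # expires a day later
expired = Listing.new(:active, now - 1)         # expired a day earlier
draft   = Listing.new(:inactive, now + 1)       # not active at all

puts live.published?(now)      # true
puts expired.published?(now)   # false
puts draft.published?(now)     # false
```

The expired case is the one the old implementation would have got wrong, so it is the one worth justifying at handover.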

These may seem small things initially, but over the course of the project, this mentality helped us deliver a complete revamp and re-platform in only 7 weeks. I encourage whoever is reading this, to definitely try this, but if you are not used to this, then start small. Maybe from a local project and then show the benefits to your work colleagues. I assure you that this approach is definitely worth a try.

Recently a colleague of mine showed me how to find out what environment variables a process is using while it's running. This works on Linux but not on OS X.

Find out the PID for the process you want to inspect

[@riyadh:~] $ ps -ef | grep mysql
mysql 1142 1 0 Jul06 ? 00:00:24 /usr/sbin/mysqld

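Then it's a simple step of catting the environ file under /proc for that PID, which in this case is 1142. The entries in that file are separated by NUL bytes, so they appear run together in the output: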

[@riyadh:~] $ sudo cat /proc/1142/environ
PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin
TERM=linuxHOME=/etc/mysqlRUNLEVEL=2PREVLEVEL=NUPSTART_EVENTS=runlevel
UPSTART_JOB=mysqlUPSTART_INSTANCE=
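Because the entries are NUL-separated, the raw output is awkward to read. A small Ruby sketch can split it into a hash; `parse_environ` and `environ_for` are hypothetical helper names, and reading another user's process will generally still need root, as with the `sudo cat` above.

```ruby
# Split the raw contents of /proc/<pid>/environ into a Hash.
# Entries look like "KEY=value" and are separated by NUL bytes.
def parse_environ(raw)
  raw.split("\0").each_with_object({}) do |entry, env|
    key, value = entry.split("=", 2)   # split on the first "=" only
    env[key] = value.to_s unless key.nil? || key.empty?
  end
end

# Linux only: read and parse the environment of a running process.
def environ_for(pid)
  parse_environ(File.binread("/proc/#{pid}/environ"))
end

# Sample input modelled on the output shown above.
sample = "TERM=linux\0HOME=/etc/mysql\0RUNLEVEL=2\0UPSTART_INSTANCE=\0"
env = parse_environ(sample)

puts env["HOME"]       # /etc/mysql
puts env["RUNLEVEL"]   # 2
```

On a Linux box, `environ_for(1142)` would then return the variables for the mysqld process from the example.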

On OS X, you can get information by using:

[@riyadh:~] $ ps -Eww $PID

Be warned, this gives you a lot more than just what environment variables that process is using.

The art of technical interviews is an old one, but not always a well practised one. There are always a few variables and a lot of theories. Should they be long? Many? How about spread out? Or all in one day? Recently, we have had to conduct a few and I think I have learnt a few things from them. Combining these with the ones I have given in the past, I have come to the following conclusions from the point of view of an interviewer:

  • It should be hard to pass an interview. You are not there to make the interview easy. Making the interviewee comfortable and making the interview easy are not the same thing, and too often the difference is glossed over. You are not there to be friends, and you are not there to be his or her mentor. Be nice, but blunt and if you spot a weakness in understanding, zero in and don’t let go till you have identified why that weakness is there.
  • Don’t assume anything. Don’t assume they know what source control is! Do not assume they understand the difference between static and dynamic types! Do not assume they understand what your business model is! Ask, ask and then ask some more.
  • Codility does not tell you how the person thinks. It tells you whether they can solve a programming puzzle quickly. You want to find out how a person thinks and whether there is a decent base beneath the layers of programming on top. What you want is a person who can think for themselves, learn and build on their understanding. What you do not want is a code monkey who likes typing characters. So create a simple coding test by all means, but rather than using Codility, put it in a git repository which you then mail to them. Get them to solve the problem, test it and send it back to you in a day or two. Then you read the commit logs. That will tell you more about how a person thinks than Codility. It also has the added benefit of demonstrating to you whatever it is you wanted from Codility.
  • When you get the person into your office, make sure they spend an hour pairing with an experienced developer on an existing feature that is being developed. This should be enough to show whether the person has had some pairing experience; whether the person can navigate while someone else is typing and point out errors; whether the person commits often, tests regularly, etc. Most of all, it will tell you whether the person stops their pair to ask about something they do not understand.
  • Make sure that the person is a cultural fit. This is the easiest thing to stuff up as it is the hardest to check in a short amount of time, which is why most companies should have probationary periods for newcomers.

In the end, the worst case scenario is that the person you hired turned out to not be what you wanted and you will have to let them go! The only caveat at that point is that if you and the person are not a good match, make sure that it is not a surprise to that person, otherwise it is fair on neither of you.

It has been a long road to where we are at the moment. So many years of getting to know each other, the fights, the arguments and best of all, the make up deployment. All through those years we have wondered what we meant to each other. Sure, we lived together and occasionally saw each other’s dirty laundry, but so what; we needed each other. What we did not know was whether we loved each other. It has taken many years for us to answer that question.

We have had weird roommates in the meantime as well. Your weird cousin MySQL replication always caused problems with my brother db-migrations. You said migrations needed to be more idempotent and I said I hated MySQL! Things said in a rush of blood and forgiven the next day when we sat them together and got them talking.

What about the weird neighbour Mrs.Agile? Always wanting us to talk and sit next to each other and always pointing out to everyone how cute we looked together and always calling out loudly how we should really hook up. I remember your red face then! I remember my red face as well. How we awkwardly laughed, knowing deep down one day we would have to ask that question. We knew it even before that old lady would embarrass us in front of others.

Throughout all these years with you, I have known you in good times and bad and sometimes in downright crazy times. Not once have I forgotten how important you are to me and how much you mean to me. I want to deploy lots of apps with you and watch them scale with you. I want to spend the rest of my coding days with you growing old together.

My dear dear Ops, will you develop with me?

I am, I want to be an engineer. I am not to be classified into cubicles called developer or operations. I don’t want to be a contractor or a consultant. I will not chuck my work over the wall to someone else to test and then move on! I will not inform someone with access to production systems of my migration at the end of an iteration. I will invite them to barbeques! I, Engineer!

I will keep my work transparent. I will talk to my tech leads and seniors and learn. I will say ‘I don’t know!’ when I don’t know. I will argue vociferously when I know I am right. I will listen to other people when they argue vociferously. I will compromise when I am not the only one who is right. I, Engineer!

I do not care what language I have to code in. I do not care what framework I have to use. I will do my best to deliver the best product that can be delivered. I will not rigidly adhere to a methodology for the sake of adhering to a methodology. I will not force a process on my team. I will encourage my team to try to adapt processes to the ones that seem natural. I, Engineer!

I will automate my workload as much as I can. I will automate test suites and deployment and provisioning. I will not automate thinking. I will not outsource thinking. I will not contract out thinking. I, Engineer!

I will have a work life balance. I will go home and spend time with my family, a lot of time. I will not spend 20 hours a day at work. I will make sure that no one has to spend 20 hours a day at work.

I, Engineer

Some days ago Matt and I had to deploy a Rails application. This was an application that neither Matt nor I had any idea existed about 2 days before we ended up deploying it, so we were quite looking forward to it on a very rainy Wednesday! (Why does it always rain when I am deploying???)

The day before, Matt had tried to deploy the application to staging to see whether we could test it out before production got her hands on it, but the staging environment had not been puppetised for this particular app. When we realised that deployment to staging might not be possible, we spoke to our resident friendly release manager Aaron and laid out several plans. The first action was for me to test the shi^H^H^H hell out of the app locally (with production data) as much as I could while Matt tried his best to puppetise staging.

Early Wednesday morning, I managed to test the app completely and ensured that the changes some awesome developer *cough* had made worked. Matt had also finished his spike with staging, but no goodness. So we both looked at each other and decided to deploy it to production herself.
Four hours, a plate of sushi and a coffee later, the thing had been deployed and we had ensured that it worked in production and informed the relevant parties!

From all this, I have learnt a few things:

1. Even though we both knew nothing about the app, I ensured that Matt was involved in it from the very beginning, and it helped a great deal. Springing deployment surprises on OPS is not cool and helps no one, least of all you as a developer. JIRA and Zendesk are cool, but nothing, nothing will actually substitute going over and talking to your fellow OPS people.

2. Puppet. Puppet is cool. The fact that we could quickly spike things in staging and decide whether to deploy there or not was simply because Puppet allowed us that level of automation. Invest in such tools, whether they be Puppet, Chef or Babushka.

3. We did not panic and delay the release when we found out that we could not test it in staging. We sat with Aaron and figured out the risks of each scenario and the associated rollbacks. We did as much testing as we could beforehand and then went ahead with the deployment. To quote my colleague Jon Eaves, “Fear is the mind Killer”.

Above all, it was great fun to work with Matt, and deploy something along the way :). This has made me realise that software development, and deployment, is amazingly rewarding when done with awesome people.