Software Development Guidelines

From Merriam-Webster:

guide·line (noun): an indication or outline of policy or conduct

U.S. Dept. of Veterans Affairs:

A guideline is a statement by which to determine a course of action. A guideline aims to streamline particular processes according to a set routine or sound practice. By definition, following a guideline is never mandatory. Guidelines are not binding and are not enforced.

These definitions are very important. What I am listing here are my guidelines, a set of items that usually drive my particular style of coding and engineering. They are not rules. As someone once wisely pointed out (no references, sorry):

“guidelines account for judgement, rules don’t”

They are also my guidelines. It is OK if they do not fit your case; I am not trying to describe the definitive way of writing, maintaining and operating software. Most of the items here should be obvious to many people, but I hope this helps you think about your own guidelines and better understand those of others.

With that quick disclaimer out of the way, let us go to the list. There is a lot more I would like to publish someday, but I will try not to make this too long and painful.

Extremism

There is a big chance that yin-yang is in my blood; my father is Chinese. One of the most important lessons I have learned so far is that extremism is bad: everything needs balance.

This is at the top of my list, mainly because it nicely applies to the other items as well. I consider them to be good ideas, but I will not just blindly apply them to every situation. I usually do, unless I can come up with a very good justification not to.

The recursiveness of this item is also beautiful. It means that you can even be extreme (passionate?) about something, as long as you can give a damn good reason for it. You should not be extreme about not being extreme.

Take for example my deep hate for inheritance (in the OO context): there are good reasons for it. But it is not true that I would never use inheritance: the fact that it should be avoided is something to keep in mind, not something to block you from delivering software.

While we are on the topic, I really like how Go approaches inheritance.

Enabling vs. Directing

I took this very important lesson from some good discussions I had with my friend Tiago many years ago, back when we were coworkers in Germany.

Enabling means that something can be used in many different ways, even in ways that we have not considered yet. Humans are creative and will come up with different ways of using and applying an enabling idea. Directing, however, means that something is designed to be used in a particular way, or for a specific action.

A good example is how to design Java annotations. Here is how we could annotate a class to define that its objects need to be saved to a database (i.e. they are persistent):

@Save
public class Person {
  // ...
}

This is an example of a directing implementation: it specifically states that objects of this class need to be saved. Using that information for anything else would be awkward.

Here is how the Java Persistence API (thanks to Hibernate) defines it:

@Entity
public class Person {
  // ...
}

This information could be used by many different components on a system. For instance, a logging component could read and use it to determine that some of its fields need to be filtered from logs (such as passwords), just because persistent entities usually contain sensitive data. A formatting component could use different colors for entity objects. There are all sorts of different uses for that information and creative people will come up with even more.

This does not mean that all of those uses will be correct, but it enables more from your code and makes it potentially more flexible.

This concept applies to almost any decision we need to make in software development. Think about shipping a new feature, designing a component, planning an API, coding style guides, people and project management styles, technology (framework?) choices, etc. In each context there is probably a more enabling than directing way of doing them, which would empower people more.
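To make the logging idea concrete, here is a minimal sketch of a component that was never part of the persistence layer but still reuses the annotation. Everything in it is an assumption for illustration: `Entity` is declared locally as a stand-in for the JPA annotation so the example compiles on its own, `LogFormatter` and `Point` are hypothetical, and masking every field of an entity is a deliberate simplification of "filter the sensitive ones".

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Comparator;
import java.util.StringJoiner;

// Hypothetical logging helper: it knows nothing about databases, it only
// checks whether the object's class carries the (enabling) annotation.
final class LogFormatter {
  static String describe(Object obj) {
    Class<?> type = obj.getClass();
    boolean masked = type.isAnnotationPresent(Entity.class);

    // Reflection does not guarantee field order, so sort by name for stable output.
    Field[] fields = type.getDeclaredFields();
    Arrays.sort(fields, Comparator.comparing(Field::getName));

    StringJoiner out = new StringJoiner(", ", type.getSimpleName() + "{", "}");
    for (Field field : fields) {
      field.setAccessible(true);
      try {
        Object value = masked ? "***" : field.get(obj);
        out.add(field.getName() + "=" + value);
      } catch (IllegalAccessException e) {
        throw new IllegalStateException(e);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(describe(new Person())); // Person{name=***, password=***}
    System.out.println(describe(new Point()));  // Point{x=1, y=2}
  }
}

// Local stand-in for the JPA annotation (assumption, not javax.persistence.Entity).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Entity {}

@Entity
class Person {
  String name = "Ada";
  String password = "secret";
}

// Not an entity, so its fields are logged as they are.
class Point {
  int x = 1;
  int y = 2;
}
```

A `@Save` annotation could technically be read the same way, but "this object must be saved" says nothing about why a logger should care; "this object is an entity" does.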

In practice, it can be very hard. I have experienced many cases where it was really hard to tell if we were directing or enabling people. Keeping this in mind already helps a lot though.

Premature Optimization

This has been discussed a lot already, no need to talk too much about it. We all know that “Premature optimization is the root of all evil”.

However, as a quote often attributed to Albert Einstein goes:

“Everything should be made as simple as possible, but not simpler.”

Do not use premature optimization as an excuse to be lazy or irresponsible about what you ship. Keep it in mind, but balance it with your previous experience and feedback from others. There are times when you just know it is going to bite you soon.

Which leads us to the next item…

Productionization

I have learnt a lot of this over the past years running large production services (both at Locaweb and now at Heroku).

From the beginning, think about how your code is going to run in production. Do you understand the platform (runtime) it is going to run on? Are you comfortable troubleshooting hard problems? While we are on the topic, how are you planning to debug those problems?

Are you collecting metrics on how it is being used and how it services its requests? Even if you do not officially publish an SLA, understanding your service times is invaluable when digging into the issues that will come with production load. Remember to track more than mean values: the mean alone is useless because it hides outliers. Always combine it with variance, or better, with percentiles. Aim for high percentiles: the 95th and 99th are good targets, depending on your scale.
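A small sketch of why the mean is not enough, using made-up numbers: 100 requests, of which 90 take 10 ms and 10 take 1000 ms. The class name `LatencyStats` and the choice of the nearest-rank percentile definition are assumptions for illustration; real metrics systems often use histograms or other estimators instead.

```java
import java.util.Arrays;

final class LatencyStats {
  static double mean(long[] samplesMs) {
    return Arrays.stream(samplesMs).average().orElse(Double.NaN);
  }

  // Nearest-rank percentile: the smallest sample such that at least
  // p percent of all samples are less than or equal to it.
  static long percentile(long[] samplesMs, double p) {
    if (samplesMs.length == 0 || p <= 0 || p > 100) {
      throw new IllegalArgumentException("need samples and 0 < p <= 100");
    }
    long[] sorted = samplesMs.clone();
    Arrays.sort(sorted);
    int rank = (int) Math.ceil(p * sorted.length / 100.0);
    return sorted[rank - 1];
  }

  public static void main(String[] args) {
    // Made-up data: 90 fast requests and 10 slow ones.
    long[] samplesMs = new long[100];
    Arrays.fill(samplesMs, 0, 90, 10L);
    Arrays.fill(samplesMs, 90, 100, 1000L);

    System.out.println("mean = " + mean(samplesMs));           // mean = 109.0
    System.out.println("p50  = " + percentile(samplesMs, 50)); // p50  = 10
    System.out.println("p95  = " + percentile(samplesMs, 95)); // p95  = 1000
    System.out.println("p99  = " + percentile(samplesMs, 99)); // p99  = 1000
  }
}
```

The mean of 109 ms describes no request that actually happened: most callers saw 10 ms, while one in ten waited a full second, which only the high percentiles reveal.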

Last but not least: are you confident that you can notice (and be notified of) problems before your customers start complaining about your service on Twitter? Metrics and automated monitoring are very important for running production services.

Testing

It does not really matter whether it is automated or not. Yes, I said it. I have seen successful software (for whatever that means) both ways. Testing what you ship is just being responsible about it. And that includes having the proper infrastructure to do it: staging environments, gradual rollouts, feature flagging, etc.

Do not get me wrong. Automated testing should usually be preferred, but it is not the ultimate goal. I have seen many “evangelists” speak for hours about how important automated testing is, while the kernel of the operating system they use prefers a more traditional approach with lots of manual (or semi-automated) tests, Q/A testing teams and not many (or even zero) automated tests in its codebase.

Quick note about test (or behavior) driven development: it has more to do with software design methods than with the tests themselves. IMO it is a good practice, which works for me sometimes, but not in every project, every day.

Dependency Inversion

I am really proud of what some of my heavy Java development days have taught me.

Designing loosely coupled components is a big one. Dependency Injection, the Dependency Inversion Principle and Inversion of Control are all related topics that to me mean a simple thing: design for single (or few) responsibilities.

It means that when you are writing that piece of code (be it a method, function, class, module, script or whatever), focus it on a single responsibility, and keep it small. If it gets too big, break it up. If it needs resources to do its job, do not let it go after those resources itself; have it receive them (inject them) instead.

In summary: instead of going after your dependencies, inject them. When a component needs to go after a resource it needs, it usually means that:

  1. The component is now responsible for its dependency lifecycle: this adds more responsibility. Now that it opens that damn connection to your DB (or your queue service), it needs to decide when to close, and needs to know how to do it. Also, what happens when you need to share the same connection in other parts of the system?
  2. The resource needs to be globally accessible: if the component did not create the resource it needs, or if that resource is shared in many parts of the codebase, it needs to be globally available. Hiding the resource creation behind factories does not help when the factory objects themselves need to be globally reachable. I hope I do not need to talk much about how globally accessible things are bad, but it is usually hard to test components using them, and it leads to tighter coupling. Changing that global resource gets harder with more stuff using it.

When you invert control and inject dependencies instead, you have the opportunity to centralize resource management in a single place. That single place holds the (single) responsibility of deciding where to inject that resource and how to share it. In fancier environments, something like this can be called a Dependency Injection Framework, but it does not need to be a big bloated framework once you understand the mechanics. In fact, this does not require a framework at all: it is just plain old method/function/constructor invocation with proper parameters.
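Here is a minimal sketch of that "plain old constructor invocation", with no framework involved. All names (`Connection`, `InMemoryConnection`, `SignupService`, `AuditService`, `Wiring`) are hypothetical, and the in-memory connection merely stands in for a real DB or queue client.

```java
import java.util.ArrayList;
import java.util.List;

// The single place that creates the resource, decides who shares it,
// and closes it when everyone is done.
final class Wiring {
  public static void main(String[] args) {
    try (InMemoryConnection connection = new InMemoryConnection()) {
      SignupService signups = new SignupService(connection);
      AuditService audits = new AuditService(connection);

      signups.register("ada");
      audits.record("signup: ada");

      System.out.println(connection.sent()); // [INSERT user ada, AUDIT signup: ada]
    }
  }
}

// Hypothetical abstraction over "a DB or queue connection".
interface Connection {
  void send(String message);
}

// Stand-in for a real client; it only remembers what was sent.
final class InMemoryConnection implements Connection, AutoCloseable {
  private final List<String> sent = new ArrayList<>();
  private boolean open = true;

  @Override
  public void send(String message) {
    if (!open) {
      throw new IllegalStateException("connection is closed");
    }
    sent.add(message);
  }

  List<String> sent() {
    return List.copyOf(sent);
  }

  @Override
  public void close() {
    open = false;
  }
}

// Each component receives the connection; neither opens, closes, nor looks it up.
final class SignupService {
  private final Connection connection;

  SignupService(Connection connection) {
    this.connection = connection;
  }

  void register(String user) {
    connection.send("INSERT user " + user);
  }
}

final class AuditService {
  private final Connection connection;

  AuditService(Connection connection) {
    this.connection = connection;
  }

  void record(String event) {
    connection.send("AUDIT " + event);
  }
}
```

Neither service knows how the connection is opened, shared or closed, and there is no global to reach for, so a test can hand them any `Connection` it likes.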

All in all, let us please not forget about balance. This is not a rule, there are times when what you need is just a goddamn simple function.

Do Not Block The Event Loop

Evented programming (or event-driven programming) is very popular these days and one of the side effects is that some projects will want to be evented without careful consideration.

Going evented or not is one of the big decisions involved in writing code, with big implications. Things blocking the event loop are usually very hard to debug. When you go that route, all code called (including external libraries) needs to be aware of the event loop and must not block it.

There are many advantages though. Notably, it leads to much more lightweight servers which support much higher concurrency levels. When I am deciding if I should do event-driven code or not, here is what I consider (feel free to add a comment below with your own thoughts):

  • Is it an event-driven runtime/platform? Node.js, for example, has been designed from the ground up to be evented, meaning that all libraries and code written to run on it are already aware of the event loop. If the platform was not designed to be evented (like Ruby with EventMachine), much more care must be taken not to call code which will block the event loop. It is hard to control all the code in libraries included in your project. Take that into consideration.
  • Consider evented if the piece of code you are writing is mostly a data multiplexer, meaning that it just takes data from one side and sends it to another, acting like a pipe, distributor, load balancer, or router. This type of component is usually I/O bound.
  • Avoid evented if the piece of code you are writing is CPU bound and does a lot of processing. The chance it will block the event loop is much higher. I have seen projects having to resort to threads (or external processes) to move that part out of the event loop, often leading to spaghetti concurrency that is very hard to follow: part evented, part threaded.
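For reference, this is roughly what the workaround from the last bullet looks like in its smallest form. It is a sketch under assumptions, not a recommendation: a single-threaded executor plays the role of the event loop, `sumOfSquares` stands in for the CPU-bound processing, and the names `EventLoopSketch` and `handle` are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class EventLoopSketch {
  // Stand-in for "a lot of processing": pure CPU work, no I/O.
  static long sumOfSquares(int n) {
    long total = 0;
    for (int i = 1; i <= n; i++) {
      total += (long) i * i;
    }
    return total;
  }

  // Runs the heavy computation on the worker pool and hands the result back
  // to the loop thread, so the loop itself never executes the computation.
  static CompletableFuture<String> handle(int n, ExecutorService loop, ExecutorService workers) {
    return CompletableFuture
        .supplyAsync(() -> sumOfSquares(n), workers)
        .thenApplyAsync(result -> Thread.currentThread().getName() + ":" + result, loop);
  }

  public static void main(String[] args) {
    // One thread acting as the event loop, plus a small pool for CPU-bound work.
    ExecutorService loop = Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
    ExecutorService workers = Executors.newFixedThreadPool(2);
    try {
      System.out.println(handle(1000, loop, workers).join()); // event-loop:333833500
    } finally {
      loop.shutdown();
      workers.shutdown();
    }
  }
}
```

Even at this size there are already two execution contexts to reason about, which is where the part-evented, part-threaded complexity starts.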

There Is Always Something To Learn

As I learn more, I expect this list to change, but I would say that it currently contains the factors influencing my development style the most. It was a very healthy exercise to think about what defines me as a programmer. I hope it is for you too, try it out!

3 thoughts on “Software Development Guidelines”

  1. Great list, Kung!
    I’d add 2 items on it, were it my list:

    1) Naming: your code can be decoupled, well structured and so on, but how are you going to identify the pieces if their names say nothing? It’s a very basic guideline, but it’s always good to keep it in mind. I’m always thinking about good names for my classes, methods and variables, as I think it will help other people (including me six months from now) understand the code I’m writing.

    2) Know your tools: I’ve quite often seen people reimplementing functionality or misusing a library and blaming it later. If you are going to use a library (or tool), you need to know it well to get the best out of it.

    I’d never thought about the event-loop thing, even having suffered with that. Thanks :)

    Keep writing!

    • Nice ones Luiz. Here is something worth quoting, related to the former:

      “There are only two hard things in Computer Science: cache invalidation and naming things.”

      – Phil Karlton
