Running Java Web Applications on Heroku Cedar Stack

Update: Heroku now does support Java.

Heroku does not officially support Java applications yet (yes, it does). However, the most recently launched stack comes with support for Clojure. Well, if Heroku does run Clojure code, it is certainly running a JVM. Then why can we not deploy regular Java code on top of it?

I was playing with it today and found a way to do that. I admit, so far it is just a hack, but it is working fine. The best (?) part is that it should allow any maven-based Java Web Application to be deployed fairly easy. So, if your project can be built with maven, i.e. mvn package generates a WAR package for you, then you are ready to run on the fantastic Heroku Platform as a Service.

Wait. There is more. In these polyglot times, we all know that being able to run Java code means that we are able to run basically anything on top of the JVM. I’ve got a simple Scala (Lift) Web Application running on Heroku, for example.

There are also examples of simple spring-roo and VRaptor web applications. I had a lot of fun coding on that stuff, which finally gave me an opportunity to put my hands on Clojure. There are even two new Leiningen plugins: lein-mvn and lein-herokujetty. 🙂

VRaptor on Heroku

Let me show you what I did with an example. Here is a step-by-step to deploy a VRaptor application on Heroku Celadon Cedar:

  1. Go to Heroku and create an account if you still do not have.
  2. We are going to need some tools. First Heroku gems:
    $ gem install heroku
    $ gem install foreman
    
  3. Then Leiningen, the clojure project automation tool. On my Mac, I installed it with brew:
    $ brew update
    $ brew install leiningen
    
  4. The vraptor-scaffold gem, to help us bootstrap a new project:
    $ gem install vraptor
    

That is it for the preparation. We may now start dealing with the project.

  1. First, we need to create the project skeleton:
    $ vraptor new <project-name> -b mvn
    $ lein new <project-name>
    $ cd <project-name>
    

    The lein command is not strictly necessary, but it helps with .gitignore and other minor stuff.

  2. Now, the secret sauce. This is the ugly code that actually tries to be smart and tricks Heroku to do additional build/compilation steps:
    $ wget https://gist.github.com/raw/1129069/f7ffcda9c42a3004fa8d3c496d4f13887c11ebf4/heroku_autobuild.clj -P src
    

    Or if you do not have wget installed:

    $ curl -L https://gist.github.com/raw/1129069/f7ffcda9c42a3004fa8d3c496d4f13887c11ebf4/heroku_autobuild.clj > src/heroku_autobuild.clj
    
  3. You also need to tweak the Leiningen project definition – project.clj. The template is here. Please remember to adjust your project name. It must be the same that you are using in the pom.xml. Or you may download it directly if you prefer:
    $ curl -L https://gist.github.com/raw/1129081/5790e173307d28a8df256e0aaa5a9fe7757e922f/project.clj > project.clj
    
  4. Unfortunately, Leiningen comes bundled with an old version of the Maven/Ant integration. The embedded maven is currently at version 2.0.8, which is old. The versions of some plugins configured in the default pom.xml from vraptor-scaffold are incompatible with this old version of maven.

    The best way to solve it for now is to remove all <version> tags from items inside the <plugins> section of your pom.xml. This is specific to VRaptor and other frameworks may need different modifications. See below for spring-roo and Lift instructions.

    If even after removing versions from all plugins, you still get ClassNotFoundException: org.apache.maven.toolchain.ToolchainManager errors, try cleaning your local maven repository (the default location is $HOME/.m2/repository). The pom.xml that I used is here.

  5. Now, try the same command that Heroku uses during its slug compilation phase. It should download all dependencies and package your application in a WAR file inside the target directory.
    $ lein deps
    

    Confirm that the WAR was built into target/. It must have the same name as defined in your project.clj.

  6. Create your Heroku/Foreman Procfile containing the web process definition:
    web: lein herokujetty
    

    And test with:

    $ foreman start
    

    A Jetty server should start and listen on a random port defined by foreman. Heroku does the same.

  7. Time to push your application to Heroku. Start editing your .gitignore: add target/ and tmp/, and please remove pom.xml. Important: The pom.xml must not be ignored.
  8. You may also remove some unused files (created by leiningen):
    $ rm -rf src/<project-name>
    # do not remove src/main src/test and src/heroku_autobuild.clj!
    
  9. Create a git repository:
    $ git init
    $ git add .
    $ git commit -m "initial import"
    
  10. Push everything to Heroku and (hopefully) watch it get built there!
    $ heroku create --stack cedar
    $ git push -u heroku master
    
  11. When it ends, open another terminal on the project directory, to follow logs:
    $ heroku logs -t
    
  12. Open the URL that heroku creates and test the application. You may also spawn more instances and use everything Heroku has to offer:
    $ heroku ps:scale web=2
    $ heroku ps:scale web=4
    $ heroku ps:scale web=0
    
  13. Quick tip: you do not have SSH access directly to Dynos, but this little trick is very useful to troubleshoot issues and to see how slugs were compiled:
    $ heroku run bash
    

spring-roo on Heroku

Once you understand the process for a VRaptor application, it should be easy for others as well. Just follow the simple step-by-step here, to create a simple spring-roo maven application.

Then, before running lein deps or foreman start, you must adjust your pom.xml, to remove plugins incompatible with the older mvn that is bundled in Leiningen. Here is the pom.xml that I am using (with some of the plugins downgraded).

You can see my spring-roo application running on Heroku here.

Scala/Lift on Heroku

The same for Scala/Lift: follow the instructions to bootstrap an application with maven. One simple change to the pom.xml is required. My version is here.

You also need to turn off the AUTO_SERVER feature of H2 DB. It makes H2 automatically starts a server process, which binds on a tcp port. Heroku only allows one port bound per process/dyno, and for the web process, it must be the jetty port.

To turn it off, change the database URL inside src/main/scala/bootstrap/liftweb/Boot.scala. I changed mine to a in-memory database. The URL must be something similar to jdbc:h2:mem:<project>.db.

My Lift application is running on Heroku here.

I hope it helps. Official support for Java must be in Heroku’s plans, but even without it yet, we’ve got a lot of possibilities.

Advertisement

DSLs: one interface, multiple implementations

One interface, multiple implementations is one of these design concepts I like most. It’s basically polymorphism in its pure form!

Coding a little bit a while ago, I realized how DSLs could reinforce this idea. My example here is a simple implementation for queue consumers, as a simple Ruby internal DSL:

class GitRepositoryCloner < Consumer
  queue "RepositoriesToBeCloned"
  exclusive true

  handle do |message|
    # git clone repository!
    # I'm pretty sure github has something alike
  end
end

And to enqueue a message, we could do:

GitRepositoryCloner.publish("rails")

GitRepositoryCloner is a simple consumer of the queue named RepositoriesToBeCloned. One of the times I did something similar to this, I needed support for exclusive consumers. Then, my choices were ActiveMQ as the messaging middleware together with the Stomp protocol.

Using them as an example, let’s take a look on a possible implementation for the Consumer class, using the stomp gem:

module ActiveMQ
  class Consumer

    def self.queue(name)
      @queue_name = name
    end

    def self.exclusive(bool)
      @exclusive = bool
    end

    def self.handle(&blk)
      @callback = blk
    end

    def self.listen
      broker = Stomp::Client.new(Config[:broker])
      broker.subscribe(@queue_name, 
          :'activemq.exclusive' => @exclusive) do |message|
        @callback.call(message)
        broker.acknowledge(message)
      end
    end

    def self.publish(message)
      broker = Stomp::Client.new(Config[:broker])
      broker.publish(@queue_name, message, :persistent => true)
    end

  end
end

Consumer = ActiveMQ::Consumer

The last line is where we choose the ActiveMQ::Consumer as the default implementation.

The beautiful aspect of this little internal DSL, composed only of three methods (queue, exclusive and handle), is that it defines an interface. Here, I have seen a common misconception from many developers coming from Java, C# and similar languages which have the interface construct. An interface in Object Orientation is composed of all accessible methods of an object. In other words, the interface is the object’s face to the rest of the world. We are not necessarily talking about a Java or C# interface construct.

In this sense, these three methods (queue, exclusive and handle) are the interface of the Consumer internal DSL (or class object, as you wish).

Let’s say for some reason, we would like to switch our messaging infrastructure to something else, like Resque, which Github uses and is awesome. Resque’s documentation says that things are a little bit different for Resque consumers. They must define a @queue class attribute and must have a perform class method.

As we would do with regular Java/C# interfaces, let’s make another implementation respecting the previous contract:

module Resque
  class Consumer

    def self.queue(name)
      @queue = name
    end

    def self.exclusive(bool)
      self.extend(Resque::Plugins::LockTimeout) if bool
    end

    def self.handle(&blk)
      self.send(:define_method, :perform, &@blk)
    end

    def self.listen
      raise "Not ready yet" unless self.respond_to?(:perform)
    end

    def self.publish(message)
      Resque.enqueue(self, message)
    end

  end
end

There you can see how the implementations differ. The exclusive consumer feature is provided by the LockTimeout plugin. In this case, instead of passing the activemq.exclusive parameter to the connection, we must use the Resque::Plugins::LockTimeout module, as the documentation says. Another key difference is in the message handling process. Instead of passing a handler block to the subscribe method, Resque consumers are required to define a perform method, which we are dynamically creating with some metaprogramming: Class#define_method(name).

Finally, here is how we switch our messaging backend to Resque, without any changes to the consumer classes (GitRepositoryCloner in this example):

Consumer = Resque::Consumer

That’s it: one interface, two (multiple) implementations.

Ruby and dependency injection in a dynamic world

It’s been many years I’ve been teaching Java and advocating dependency injection. It makes me design more loosely coupled modules/classes and generally leads to more extensible code.

But, while programming in Ruby and other dynamic languages, the different mindset always intrigued me. Why the reasons that made me love dependency injection in Java and C# don’t seem to apply to dynamic languages? Why am I not using DI frameworks (or writing simple wiring code as I did many times before) in my Ruby projects?

I’ve been thinking about it for a long time and I don’t believe I’m alone.

DI frameworks are unnecessary. In more rigid environments, they have value. In agile environments like Ruby, not so much. The patterns themselves may still be applicable, but beware of falling into the trap of thinking you need a special tool for everything. Ruby is Play-Doh, remember! Let’s keep it that way.

— Jamis Buck

Even being a huge fan of DI frameworks in the Java/C# world (particularly the lightweights), I completely agree with Jamis Buck (read his post, it’s very good!). This is a recurrent subject in talks with really smart programmers I know and I’ve tried to explain my point many times. I guess it’s time to write it down.

The whole point about Dependency Injection is that it is one of the best ways to achieve Inversion of Control for object dependencies. Take the Carpenter object: his responsibility is to make and repair wooden structures.

class Carpenter {
    public void repair(WoodenStructure structure) {
        // ...
    }
    public WoodenStructure make(Specification spec) {
        // ...
    }
}

Because of his woodwork, the carpenter often needs a saw (dependency). Everytime the carpenter needs the saw, he may go after it (objects who take care of their dependencies on their own).

class Carpenter {
    public void repair(WoodenStructure structure) {
        PowerSource powerSource = findPowerSourceInsideRoom(this.myRoom);
        Saw saw = new ElectricalSaw(powerSource);

        // ok, now I can *finally* do my real job...
    }
}

The problem is that saws could be complicated to get, to assemble, or even to build. The saw itself may have some dependencies (PowerSource), which in turn may also have some dependencies… Everyone who needs a saw must know where to find it, and potencially how to assemble it. What if saw technology changes and some would rather to use hydraulic saws? Every object who needs saws must then be changed.

When some change requires you to mess with code everywhere, clearly it’s a smell.

Have you also noticed that the carpenter has spent a lot of time doing stuff which isn’t his actual job? Finding power sources and assembling saws aren’t his responsibilities. His job is to repair wooden structures! (Separation of Concerns)

One of the possible solutions is the Inversion of Control principle. Instead of going after their dependencies, objects receive them somehow. Dependency Injection is the most common way:

class Carpenter {
    private Saw saw;
    public Carpenter(Saw saw) {
        this.saw = saw;
    }
    public void repair(WoodenStructure structure) {
        // I can focus on my job!
    }
}

Now, the code is more testable. In unit tests where you don’t care about testing saws, but only about testing the carpenter, a mock of the saw can be provided (injected in) to the carpenter.

In the application code, you can also centralize and isolate code which decides what saw implementation to use. In other words, you now may be able to give the responsibility to do wiring to someone else (and only to him), so that objects can focus on their actual responsibilities (again Separation of Concerns). DI framework configuration is a good example of wiring centralized somewhere. If the saw implementation changes somehow you have just one place to modify (Factories help but don’t actually solve the problem – I’ll leave this discussion for another post).

class GuyWithResponsibilityToDoWiring {
    public Saw wireSaw() {
        PowerSource powerSource = getPowerSource();
        return new Saw(powerSource);
    }
    public PowerSource getPowerSource() { ... }
}

Ok. Plain Old Java Dependency Injection so far. But what about Ruby?

Sure you might do dependency injection in Ruby (and its dynamic friends) exactly the same way as we did in Java. But, it took some time for me to realize that there are other ways of doing Inversion of Control in Ruby. That’s why I never needed a DI framework.

First, let’s use a more realistic example: repositories that need database connections:

class Repository {
    private Connection connection;
    public Repository(Connection connection) {
        this.connection = connection;
    }
    public Something find(Integer id) {
        return this.connection.execute("SELECT ...");
    }
}

Our problem is that many places need a database connection. Then, we inject the database connection as a dependency in everywhere it is needed. This way, we avoid replicating database-connection-retrieval code in such places and we may keep the wiring code centralized somewhere.

In Ruby, there is another way to isolate and centralize common behavior which is needed in many other places. Modules!

module ConnectionProvider
  def connection
    # open a database connection and return it
  end
end

This module can now be mixed, or injected in other classes, providing them its functionality (wiring). The “module injection” process is to some degree similar to what we did with dependency injection, because when a dependency is injected, it is providing some functionality to the class too.

Together with Ruby open classes, we may be able to centralize/isolate the injection of this module (perhaps even in the same file it is defined):

# connection_provider.rb

module ConnectionProvider
  def connection
    # open a database connection and return it
  end
end

# reopening the class to mix the module in
class Repository
  include ConnectionProvider
end

The effect inside the Repository class is very similar to what we did with dependency injection before. Repositories are able to simply use database connections without worrying about how to get or to build them.

Now, the method that provides database connections exists because the ConnectionProvider module was mixed in this class. This is the same as if the dependency was injected (compare with the Java code for this Repository class):

# repository.rb

class Repository
  def find(id)
    connection.execute("SELECT ...")
  end
end

The code is also very testable. This is due to the fact that Ruby is a dynamic language and the connection method of Repository objects can be overridden anywhere. Particularly inside tests or specs, the connection method can be easily overridden to return a mock of the database connection.

I see it as a different way of doing Inversion of Control. Of course it has its pros and cons. I can tell that it’s simpler (modules are a feature of the Ruby language), but it may give you headaches if you have a multithreaded application and must use different implementations of database connections in different places/threads, i.e. to inject different implementations of the dependency, depending on where and when it’s being injected.

Opening classes and including modules inside them is a global operation and isn’t thread safe (think of many threads trying to include different versions of the module inside the class). I’m not seeing this issue out there because most of Ruby applications aren’t multithreaded and run on many processes instead; mainly because of Rails.

Regular dependency injection still might have its place in Ruby for these cases where plain modules aren’t enough.

IMO, it’s just much harder to justify. In my experience modules and metaprogramming are usually just enough.

Ruby indentation for access modifiers and their sections

Ruby programmers are very flexible (and permissive) when talking about Ruby code indentation. Most of rubyists I know prefer to indent with two spaces instead of tabs (soft tabs in some editors). There are even some style guides and code conventions published for the language, but none of them is official and they talk too little about code indentation practices.

Things go even worse when we have to choose how to indent private, protected and public sections of classes. Programmers and projects I’ve seen so far seem to adopt different styles. I’ll try to summarize them here:

1. Indent as a new block

This is the most common pattern I see out there. Some programmers prefer to treat sections created by access modifiers as new blocks and indent them:

class MyClass

  def the_public_method
    # ...
  end

  private

    def first_private_method
      # ...
    end

    def second_private_method
      # ...
    end

  protected

    def the_protected_method
      # ...
    end

end

Pros:

  • Visibility: easy to see if a method has a non-standard access modifier (non-public in most cases).
  • Produces a very readable code.
  • Used in the rails codebase (mainly older code).

Cons:

  • Semantically wrong: access modifiers do not create new scopes. They are simple method calls. Deeper explanation in the next item.
  • Methods inside the same scope with different indentation levels.
  • (opinion) One more level of indentation. When you have classes inside modules it starts being a problem: “indentation hell” :-). Your code ends up using its 80 columns very fast and you start having to break lines more often.

2. No indentation

Ruby access modifiers are simple method calls. The Module class (from which Ruby classes inherit) has the private, protected and public methods, that when called with no arguments, simply change the visibility of subsequent defined methods. You may confirm this by testing that the following code works:

class MyClass

  self.send(:private)

  def the_private_method
    # ...
  end

  def another_private_method
    # ...
  end

end

Try to call any of these methods in an instance of the MyClass class and you will confirm they are private.

Because access modifiers are simple method calls, they don’t create a new scope. Semantically speaking, they shouldn’t create another level of indentation. This is even advocated by one of the most important Ruby style guides.

class MyClass

  def the_public_method
    # ...
  end

  private

  def first_private_method
    # ...
  end

  def second_private_method
    # ...
  end

  protected

  def the_protected_method
    # ...
  end

end

Because of its correctness, this was my preferred style until recently, when I found the next ones.

Pros:

  • It is semantically correct: method calls don’t create new scopes, then method calls shouldn’t increase indentation levels.
  • (opinion) this style makes access modifiers look like python decorators, Java annotations and C# attributes. IMO, it’s one of the nice things about Ruby: its ability to reproduce most of other languages features, without requiring new syntax or fancy constructs. It’s a simple method call.

Cons:

  • Hard to see if a method has any non-standard access modifier. In classes with many methods (code smell!) you must constantly scroll to see what access modifier is applied.
  • Some would argue that this could be solved with proper syntax highlighting or visual method decoration. But it requires editor/IDE support, then I’m considering it as a disadvantage.

3. if-else style

We have a similar case in Ruby: the if keyword creates a new block and has associated else and elsif statements, which are also new blocks. The convention suggests the following indentation (note that if, elsif and else are in the same indentation level):

if something?
  2.times { play }
  jump(2.meters)
elsif other?
  sing and dance
else
  cry
  walk
end

Some of the code recently pushed to Rails 3.0 use the same indentation style for classes with access modifiers. I liked it:

class MyClass

  def the_public_method
    # ...
  end

private

  def first_private_method
    # ...
  end

  def second_private_method
    # ...
  end

protected

  def the_protected_method
    # ...
  end

end

Pros:

  • Easy to see access modifiers inside classes. With a fast look it is easy to identify sections of private, public and protected methods. They look like blocks.
  • Readable code. Similar to other Ruby constructs.
  • Used in the rails codebase (mainly newer code).

Cons:

  • Still semantically wrong. Access modifiers are inside the scope created by the class keyword. They should be indented.
  • if, elsif and else are keywords and have special meaning. Access modifiers aren’t and should follow the same convention of other method calls.

4. Indentation with one space

C++ also splits private, protected and public methods in sections. The construct is very similar to Ruby:

class MyClass {
 public:
  MyClass();
  void ThePublicMethod(int n);
  void Print(ostream &output) const;

 private:
  int *Items;
  bool FirstPrivateMethod();
  int SecondPrivateMethod();
};

While reading Google’s style guide for C++ code, their recommendation for public, private and protected sections indentation caught my attention. They suggest that you indent the private, protected and public keywords with only one space.

It seemed a bit awkward in the beginning. But soon, I started liking it because it makes sections easy to identify, access modifiers remain indented inside classes and methods stay all in the same indentation level, as they are all in the same scope.

class MyClass

  def the_public_method
    # ...
  end

 private

  def first_private_method
    # ...
  end

  def second_private_method
    # ...
  end

 protected

  def the_protected_method
    # ...
  end

end

Pros:

  • Easy to see access modifiers and sections inside classes.
  • Semantically correct.
  • All methods remain in the same indentation level.
  • Google style guide.

Cons:

  • People are always fighting for 2 spaces vs 4 spaces vs tabs indentation. One space? Very uncommon.
  • Special treatment for regular method calls. Statements inside the same scope with different indentation levels

My choice

Currently, I tend to like the 3rd (new Rails style) and 4th (Google) styles more. I somehow feel they give the best deal, considering their advantages and disadvantages. Being able to easily identify sections and access modifiers inside classes is very important to me.

In my current project, the team adopted the Google style. I’m happy with it, but you must be flexible in the beginning to adopt one-space-indentation. I won’t lie, it feels strange until you get used. Another issue I have is that my TextMate indent Ruby code with two spaces and treat them as one “step” when navigating with keyboard cursors. Then, I have to type a few more keystrokes to indent access modifiers the way I want. It’s hard to explain, you will have to try it yourself.

And you, what’s your opinion?
How do you indent access modifiers and their sections inside classes?

Rfactor: Ruby Refactoring for your loved editor

I know we all love Ruby, and doesn’t care that much about not having auto completion/IntelliSense available.

I don’t care that much about auto completion, when coding in Ruby, myself. What I really like in Java IDEs is their refactoring support. Eclipse and IntelliJ IDEA are simply awesome in this space for Java. We still have ReSharper for Visual Studio and others, targeting other languages. Ruby has NetBeans, Aptana RadRails, RubyMine and TurboRuby/3rdRail doing a great job in this area.

But, I have this feeling that most of Ruby developers do not use IDEs (including myself). We are using good text editors, such as TextMate, Vim, Emacs and GEdit. They are good enough. Why would I need something else?

I have to admit. I really miss some refactorings while programming in Ruby. Particularly, the lack of “Extract Method” and “Extract Variable” bothers me. They aren’t even complicated, why hasn’t someone already implemented them?

So, I would like to introduce Rfactor. It is a Ruby gem, which aims to provide common and simple refactorings for Ruby code. RubyParser from Ryan Davis is being used to analyze and manipulate the source code AST, in the form of Sexps.

In theory, we should be able to use Rfactor to power any editor, adding refactoring capabilities to it. I’m targeting TextMate, but I would love to see contributions for others. The TextMate Bundle is hosted on github:

Rfactor TextMate Bundle, with installation instructions

This very first release has support only for basic “Extract Method”: inside methods and without trying to guess the method parameters and return.

Stay in touch, there is much more coming!