Thursday, July 09, 2009

Modularizing Software with Ant/Ivy and Eclipse

This is one of those rare times when management actually grants you some time for technical cleanup and rewriting, so we are trying to get a few things going that were pushed back time and again in the past. In the architecture meetings we have been telling ourselves that we’d have to modularize better, so that more unit tests could be written and overall quality improved. Instead, for lack of time, we had to add new features to an already overloaded codebase and cross our fingers that nothing would go truly wrong in production. So far we have been very lucky.

Having some time now for cleanup and refactoring, we quickly came to think about the build process, which is currently a large pile of Ant scripts that resembles a full-blown software project in its own right as far as complexity goes. I asked around on StackOverflow for experiences and best practices and got some interesting answers, including recommendations for using OSGi and Maven. I have some experience with OSGi from a previous project, and I think we will at least maintain the relevant metadata in the MANIFEST.MF files we are creating. However, as this is a chance for cleanup, not a rewrite, getting everything in shape to run inside an OSGi container would be both too time-consuming and hard to sell internally.

Currently the product is “organized” in about 45 Eclipse projects, totaling about 3 million lines of code, including a very small fraction of generated classes. The real problem here is not really size, but complete and utter chaos concerning dependencies. When the project started out we were under severe time constraints and people took all sorts of shortcuts, not documenting which project depends on which and why. Add duplication of functionality (“copy-paste-coding”) to the mix and there you go with a fine spaghetti dish…

But back to building the thing… The build process relies on a specific build order, but everything that gets compiled can “see” everything that came before it. This includes external libraries, of course. A full build (and that’s the only kind possible) takes about an hour for compilation alone, excluding FindBugs, Checkstyle, tagging the repository etc. All in all it is about 3.5 hours from start to end. And of course whenever something changes we have to compile the whole thing from the start again: since every project can see everything built before it, there is no reliable way to tell what a change affects.

We decided to start with some of the technical underpinnings and try to separate those from each other, as we can do this without having to involve the business people. Once we have gained some experience and learned some lessons about what works and what does not, we will go for the more juicy stuff. When looking at Maven we quickly decided that we could not restructure our code in a way that would make it compliant with Maven’s project layout. On top of that, we would not realistically be able to follow the one-project-one-artifact rule Maven forces on you (I believe this is still the case, but did not check again). Ivy does not impose anything like that: a module can consist of as many jar files, zips etc. as you like.
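
To illustrate the “as many artifacts as you like” point, here is a rough sketch of what a module descriptor (ivy.xml) publishing several artifacts could look like. The organisation, module and artifact names are invented for this example and are not taken from our actual build.

```xml
<!-- Hypothetical module descriptor: all names are invented for illustration. -->
<ivy-module version="2.0">
    <info organisation="com.example" module="billing-core" revision="1.0.0"/>
    <configurations>
        <conf name="default" description="everything this module publishes"/>
    </configurations>
    <publications>
        <!-- One module, several artifacts of different types. -->
        <artifact name="billing-core" type="jar" ext="jar"/>
        <artifact name="billing-core-client" type="jar" ext="jar"/>
        <artifact name="billing-core-sources" type="source" ext="zip"/>
    </publications>
</ivy-module>
```

Artifacts declared without a conf attribute, as here, belong to all configurations of the module.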

Moreover, we realized that in the beginning we will not be able to get rid of some custom logic completely, and that this logic would be hard to implement within Maven anyway. Then we came across Ivy again, which a) we had looked at maybe two years ago but lost sight of since, and b) has become an “official” Apache project in the meantime, and decided to give it another try.

I have to say this was the most pleasing experience with an open source product I have had in a long time, maybe ever. The documentation is really extensive and very well written, despite the author’s self-criticism regarding his English skills:

An important thing to be able to use a tool is its amount of documentation. With Ivy, even if they are written in broken english (would you have preferred well written french :-)), the reference documentation is extensive and covers all the features including many examples. We also provide some official tutorials which are maintained with the new versions of Ivy. And since we consider documentation so important, we also provide online versions of documentation per Ivy version since Ivy 2.0.0-alpha2.

It is very readable, and from my (German) point of view it sports a better writing style than many other manuals, if you get documentation at all. In fact, trying out Ivy was a profoundly enlightening experience. Without much exaggeration: time and again, when we came across another aspect of the old system that we would have to redesign and re-implement somehow in a new build, Ivy already presented a viable solution. This, combined with an apparently very real-world approach to some problems (e.g. recommending not to use a public repository for corporate products, something I always disliked when reading about Maven), really impressed us. (You can, however, use Maven 2 repositories if you really want to.)

The concept of configurations is very useful: you define a module, give it a name and a version, and can then have several views of this module, called configurations. One configuration might be for production use, another one for testing. For each configuration you can define individual dependencies on other modules and their configurations. Combined with the very comprehensive Ant integration library, this means that the build.xml scripts of most reworked components simply consist of jar-ing and zip-ing compiled classes and source files together. Once that is done, we just call an Ivy task inherited from a common build script that publishes the new module version first to a local repository for testing and optionally to a shared company repository. For simplicity we currently use a simple fileserver share, but there are other options available as well. All the rest is taken care of by Ivy: you just point it to the repository, and it fetches the required module artifacts from there into a local cache to build against. Conflict resolution is already very configurable with the built-in options, and in the unlikely event that no suitable option is available, you can just write a plugin yourself.
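
As a minimal sketch of the configuration idea, an ivy.xml with one view for production and one for testing might look roughly like this. The organisation, module names and revisions are assumptions made up for the example.

```xml
<!-- Sketch only: organisation, module names and revisions are assumptions. -->
<ivy-module version="2.0">
    <info organisation="com.example" module="billing-core"/>
    <configurations>
        <conf name="runtime" description="what production code needs"/>
        <conf name="test" extends="runtime" visibility="private"
              description="additional libraries for unit tests"/>
    </configurations>
    <dependencies>
        <!-- our runtime view needs the runtime view of another in-house module -->
        <dependency org="com.example" name="common-util" rev="2.1" conf="runtime->runtime"/>
        <!-- only the test view pulls in JUnit -->
        <dependency org="junit" name="junit" rev="4.5" conf="test->default"/>
    </dependencies>
</ivy-module>
```

The matching build.xml fragment could then be as small as the following. Again this is only a sketch: the property names, the paths and the resolver name "local" are assumptions, the resolver has to be defined in the settings file, and the ivy: namespace only works if the Ivy jar is available to Ant.

```xml
<!-- Sketch of a build.xml fragment; resolver name, property names and paths are assumptions. -->
<project name="billing-core" default="publish-local" xmlns:ivy="antlib:org.apache.ivy.ant">
    <target name="resolve">
        <ivy:settings file="${common.dir}/ivysettings.xml"/>
        <ivy:resolve file="ivy.xml"/>
        <!-- build a classpath from the artifacts of the "runtime" configuration -->
        <ivy:cachepath pathid="compile.classpath" conf="runtime"/>
    </target>

    <target name="publish-local" depends="resolve">
        <!-- assumes the jars and zips were already built into ${dist.dir} -->
        <ivy:publish resolver="local" pubrevision="${module.version}" overwrite="true">
            <artifacts pattern="${dist.dir}/[artifact].[ext]"/>
        </ivy:publish>
    </target>
</project>
```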

Maybe the best part, however, is the availability of a really useful and well thought-out Eclipse plugin called IvyDE. As mentioned earlier, our product is modeled and coded in Eclipse. In the past it had always been a chore to keep the poorly defined dependencies somewhat in sync between Ant scripts and Eclipse project definitions. The Ivy plugin solves this very elegantly: you configure the location of your repositories and the common settings file (also used by Ant and the standalone Ivy tools) once and then just add an automatically maintained user library to your project. If you later edit your ivy.xml file(s), which contain the dependencies and module properties for Ant anyway, the user library gets updated automatically upon save, including source and Javadoc attachments to allow for step-by-step debugging, source lookup etc. Without this it would have been really difficult to gain acceptance among the other developers, who do not care too much about the “meta-stuff”. But I have to admit that I personally appreciate it as well: it makes things much easier to use, and you don’t have to give up any of the comfort you have gotten used to.

The plugin is still a beta release, but more in the Google sense of beta. There are some problems: error messages could sometimes be a little better, and from time to time, when you have syntax errors in your ivy.xml files, the user library is not displayed in the Project Explorer view. But this stuff is really usable already!

We still have a long way to go before we can call our efforts even remotely complete, but I am very sure that without Ivy it would be a hell of a lot more painful! Thank you, Ivy!
