Jan 17, 2013

Distributed transactions with Spring

Warning: The application design I describe below is not one that I'd recommend. This post concerns slight improvements to a monster of a legacy project that has some technical debt. Specifically I'm not at all happy with the following:
  • It uses a mix of ORM technologies
  • It is multi-VM where it should be multi-threaded
  • The current design requires distributed transactions whereas a better design might not
End of warning.

I’ve been working on some improvements for a legacy data loading application. It basically takes in a bunch of XML in different formats and updates a couple of separate databases with the transformed data. One database maintains a history of the loaded data, the other a current snapshot. It is essential that the entities referenced in these databases are consistent with one another.

Now, given that there are two databases that are both being updated and that some keys in database A are referenced in database B, I was expecting to see some kind of distributed transaction control. However, some sample test cases showed that this was not the case - a rollback on one database would not initiate a rollback on the other - leaving the databases in an inconsistent state.

A simple solution would be to use a JTA/XA transaction manager to provide robust distributed transactions with two-phase commit. I’ve used XA in the distant past, where the transaction management was provided by a J2EE container such as WebLogic or JBoss, and I seem to recall it being fairly straightforward - pretty much checking an ‘XA’ checkbox on each data source. However, the application in question has no such container and uses Spring instead - I was not keen to introduce a container just to fix this problem.

Fortunately there are a couple of open source JTA providers that are supposed to play nicely with Spring - the most visible being Atomikos TransactionEssentials and Bitronix BTM. These solutions are also compatible with Hibernate and iBatis - another requirement as both of these technologies are used.

Atomikos TransactionEssentials

TransactionEssentials seemed to be the most popular open source JTA provider, so I gave this a go first. Integrating it was relatively straightforward:
  • Create a couple of Atomikos beans and a property file
  • Switch from the HibernateTransactionManager to the JtaTransactionManager
  • Set the data source implementations to the respective XA proxy provided by Atomikos
  • Configure the database driver class to be an XA variant (org.postgresql.xa.PGXADataSource in my case)
  • Provide some additional configuration to the Hibernate session factory.
With only these Spring configuration changes I was able to get XA transactions working, and the test cases that had previously failed now passed. However, I ran into some annoying problems with our integration test suite. It looked as though the Atomikos data sources registered in one test class were still present when running other test classes, which caused a frequent ‘Another resource already exists with name’ error from Atomikos. Initially I suspected a context caching issue with the SpringJUnit4ClassRunner. However, even after carefully applying @DirtiesContext annotations to the relevant tests and changing the Maven Surefire fork mode to ‘always’, the problem persisted.

After spending some time trying to get the tests to cooperate I decided to try another JTA provider, as I suspected that Atomikos itself was keeping the data source configuration alive between tests and that this was ultimately the root of my problem. Also, given that the integration of Atomikos had been relatively smooth, I expected that a transition to another JTA provider would be even easier.

Bitronix BTM

Bitronix also seemed fairly popular and looked to be reasonably actively maintained so I chose to try this out next. Integration was trivial this time around:
  • Swap a set of Atomikos Spring beans with a smaller number of Bitronix beans
  • Change XA driver classes from Atomikos versions to the Bitronix equivalents
  • Modify a few driver property names
For those who are interested - I’ve made my Spring/Hibernate/Bitronix configuration available. When I fired up my test suite this time around everything worked perfectly, so Bitronix BTM seemed to give a quick win. However, some problems were waiting for me when deploying and running the full application.

Avoiding JNDI

Hibernate's default JTATransactionFactory implementation uses JNDI to look up the UserTransaction. This might make sense for applications running in a container, but we have no need for JNDI in our Spring-based configuration, where the UserTransaction instance can be obtained directly from the application context. To wire this into Hibernate I created a simple JTATransactionFactory that bypasses JNDI with a direct reference to the bean. This has to be instantiated and configured in the context, and it must be created before the SessionFactory.
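The shape of that factory can be sketched without any dependencies. This is only an illustration of the pattern, not my actual class: UserTx and JndiTxFactory are hypothetical stand-ins for javax.transaction.UserTransaction and Hibernate's JTATransactionFactory, and DirectTxFactory is an assumed name. The sketch assumes the factory is instantiated by the framework rather than by Spring, so the reference is handed over through a static setter. That would also explain the ordering constraint: the setter has to run before anything asks the factory for a transaction.

```java
/** Hypothetical stand-in for javax.transaction.UserTransaction. */
interface UserTx {
    void begin();
}

/** Hypothetical stand-in for a factory whose default lookup goes through JNDI. */
class JndiTxFactory {
    protected UserTx getUserTransaction() {
        // In this sketch there is no JNDI context to consult, so the default path fails.
        throw new IllegalStateException("no JNDI context available");
    }
}

/** Hypothetical subclass that returns a directly supplied instance instead. */
class DirectTxFactory extends JndiTxFactory {
    // Static because the framework, not the application context, creates the factory.
    private static volatile UserTx userTransaction;

    /** Must be called from the application context before the factory is used. */
    static void setUserTransaction(UserTx tx) {
        userTransaction = tx;
    }

    @Override
    protected UserTx getUserTransaction() {
        UserTx tx = userTransaction;
        if (tx == null) {
            throw new IllegalStateException(
                "UserTransaction not set; configure it before the SessionFactory");
        }
        return tx;
    }
}
```

In a Spring context, the static setter could be invoked by a small bean that the SessionFactory bean depends on, which is one way to guarantee the ordering.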

Multiple processes using Bitronix on a single machine

Unfortunately, the legacy application in question runs as multiple separate JVM processes on a single machine. The application really should have been implemented in a multi-threaded way but, unfortunately for me - the current maintainer - it wasn’t, and a big refactor is not feasible at this time. This multi-process pattern posed a problem for Bitronix, as it stores its transaction logs in default locations and we don’t want more than one transaction manager updating a given log file. Fortunately, Bitronix can use alternative log name configurations and I was able to set a unique log file name per process:

bitronix.tm.journal.disk.logPart1Filename=/var/run/myapp/${nodeName}-btm1.tlog
bitronix.tm.journal.disk.logPart2Filename=/var/run/myapp/${nodeName}-btm2.tlog
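
If the per-process values are built in code rather than by placeholder substitution, the same naming scheme might look like the sketch below. The two property keys and the file name pattern are the ones shown above. BtmJournalConfig and forNode are hypothetical names, and how the resulting Properties reach Bitronix is left open.

```java
import java.util.Properties;

/** Hypothetical helper: builds the journal settings for one JVM process. */
final class BtmJournalConfig {
    private BtmJournalConfig() {
    }

    /** nodeName must be unique per process so that no two JVMs share a journal file. */
    static Properties forNode(String logDir, String nodeName) {
        if (nodeName == null || nodeName.isEmpty()) {
            throw new IllegalArgumentException("nodeName must be set for each process");
        }
        Properties props = new Properties();
        props.setProperty("bitronix.tm.journal.disk.logPart1Filename",
                logDir + "/" + nodeName + "-btm1.tlog");
        props.setProperty("bitronix.tm.journal.disk.logPart2Filename",
                logDir + "/" + nodeName + "-btm2.tlog");
        return props;
    }
}
```

Rejecting an empty node name is deliberate: silently falling back to a shared default file name would reintroduce exactly the problem this configuration is meant to avoid.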

Additional database resources used by XA

The databases in question were both running in PostgreSQL instances. For the vendor XA driver to function, the database server's max_prepared_transactions configuration option must be set to a suitable non-zero value (it is 0 by default). This configuration option only takes effect after a restart. In my case, a restart is no small matter for one of the database servers I’m using, so it just wasn’t practical to change this setting in the short term. However, I still needed to get my fix out to improve the transactional behaviour of the application as soon as possible.

I decided to forgo a little robustness and use the Last Resource Commit (LRC) optimization. This allows one non-XA resource to participate in the distributed transaction. The transaction manager first prepares the XA data sources, then commits the non-XA data source, and only then commits the XA data sources, so it can still roll back the XA data sources if the non-XA commit fails. Implementing the optimization was straightforward - I switched the XA data source proxy implementation for the database in question to bitronix.tm.resource.jdbc.lrc.LrcXADataSource. With this in place I had everything working again.
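
To make the ordering concrete, here is a minimal in-memory sketch of a commit with one last resource. It is not Bitronix code: XaParticipant, LocalParticipant, LrcCoordinator and the two Recording classes are hypothetical names, and recovery, timeouts and heuristic outcomes are ignored entirely.

```java
import java.util.List;

/** Hypothetical stand-in for an XA-capable resource; not a JTA or Bitronix type. */
interface XaParticipant {
    boolean prepare();  // phase 1: true means "I promise I can commit"
    void commit();      // phase 2
    void rollback();
}

/** Hypothetical stand-in for a non-XA resource: it has no prepare phase. */
interface LocalParticipant {
    void commit() throws Exception;  // a plain local commit, which may fail
    void rollback();
}

final class LrcCoordinator {
    /** Returns true if every resource committed, false if everything was rolled back. */
    static boolean commit(List<XaParticipant> xaResources, LocalParticipant lastResource) {
        // Phase 1: every XA resource must vote yes before anything is committed.
        for (XaParticipant xa : xaResources) {
            if (!xa.prepare()) {
                rollbackAll(xaResources, lastResource);
                return false;
            }
        }
        // The non-XA resource commits only once all XA resources are prepared.
        try {
            lastResource.commit();
        } catch (Exception e) {
            // Nothing XA has committed yet, so a full rollback is still possible.
            rollbackAll(xaResources, lastResource);
            return false;
        }
        // Phase 2: the prepared XA resources have promised that this will succeed.
        for (XaParticipant xa : xaResources) {
            xa.commit();
        }
        return true;
    }

    private static void rollbackAll(List<XaParticipant> xaResources, LocalParticipant last) {
        for (XaParticipant xa : xaResources) {
            xa.rollback();
        }
        last.rollback();
    }
}

/** Test double that records each call in a shared log and always votes yes. */
final class RecordingXa implements XaParticipant {
    private final String name;
    private final List<String> log;

    RecordingXa(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    public boolean prepare() { log.add(name + ":prepare"); return true; }
    public void commit() { log.add(name + ":commit"); }
    public void rollback() { log.add(name + ":rollback"); }
}

/** Test double for the non-XA resource; can be told to fail its commit. */
final class RecordingLocal implements LocalParticipant {
    private final String name;
    private final boolean failOnCommit;
    private final List<String> log;

    RecordingLocal(String name, boolean failOnCommit, List<String> log) {
        this.name = name;
        this.failOnCommit = failOnCommit;
        this.log = log;
    }

    public void commit() throws Exception {
        if (failOnCommit) {
            log.add(name + ":commit-failed");
            throw new Exception("simulated commit failure");
        }
        log.add(name + ":commit");
    }

    public void rollback() { log.add(name + ":rollback"); }
}
```

With one RecordingXa named "history" and a failing RecordingLocal named "snapshot", the log reads history:prepare, snapshot:commit-failed, history:rollback, snapshot:rollback. With a succeeding one it reads history:prepare, snapshot:commit, history:commit. A test against the real databases should assert the same kind of outcome: after a forced failure, neither database shows the new data.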

Summary

Note that although the integration of JTA/XA can be fairly straightforward, it is essential that you have test cases to verify the expected transactional behaviour of your application. Do not assume that it’ll just work.