- It uses a mix of ORM technologies
- It is multi-VM where it should be multi-threaded
- The current design requires distributed transactions whereas a better design might not
I’ve been working on some improvements for a legacy data loading application. It basically takes in a bunch of XML in different formats and updates a couple of separate databases with the transformed data. One database maintains a history of the loaded data, the other a current snapshot. It is essential that the entities referenced in these databases are consistent with one another.
Now, given that there are two databases that are both being updated and that some keys in database A are referenced in database B, I was expecting to see some kind of distributed transaction control. However, some sample test cases showed that this was not the case - a rollback on one database would not initiate a rollback on the other - leaving the databases in an inconsistent state.
A simple solution would be to use a JTA/XA transaction manager to provide robust distributed transactions with two-phase commit. I’ve used XA in the distant past, where the transaction management was provided by a J2EE container such as WebLogic or JBoss, and I seem to recall it being fairly straightforward: pretty much checking an ‘XA’ checkbox on each data source. However, the application in question has no such container and uses Spring instead, and I was not keen to introduce a container just to fix this problem.
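For readers who haven't met two-phase commit before, here is a toy sketch of the idea in plain Java. It is not the JTA API: `Participant`, `FakeDatabase` and `TwoPhaseCoordinator` are made-up names for illustration, and a real transaction manager also handles logging, timeouts and recovery, none of which appear here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an XA resource; not javax.transaction.xa.XAResource.
interface Participant {
    boolean prepare();   // phase 1: vote yes (true) or no (false)
    void commit();       // phase 2 when every participant voted yes
    void rollback();     // phase 2 when any participant voted no
}

// Toy in-memory "database" that only records what happened to it.
class FakeDatabase implements Participant {
    final String name;
    final boolean canPrepare;
    String state = "ACTIVE";

    FakeDatabase(String name, boolean canPrepare) {
        this.name = name;
        this.canPrepare = canPrepare;
    }

    public boolean prepare() {
        if (canPrepare) state = "PREPARED";
        return canPrepare;
    }
    public void commit()   { state = "COMMITTED"; }
    public void rollback() { state = "ROLLED_BACK"; }
}

class TwoPhaseCoordinator {
    // Returns true if the transaction was committed on every participant.
    static boolean run(List<? extends Participant> participants) {
        // Phase 1: ask each participant to prepare; stop at the first "no".
        boolean allPrepared = true;
        for (Participant p : participants) {
            if (!p.prepare()) { allPrepared = false; break; }
        }
        // Phase 2: apply the same outcome to everyone.
        for (Participant p : participants) {
            if (allPrepared) p.commit(); else p.rollback();
        }
        return allPrepared;
    }
}

class TwoPhaseCommitDemo {
    public static void main(String[] args) {
        FakeDatabase history = new FakeDatabase("history", true);
        FakeDatabase snapshot = new FakeDatabase("snapshot", false);
        boolean committed = TwoPhaseCoordinator.run(Arrays.asList(history, snapshot));
        // One failed prepare rolls back both databases.
        System.out.println(committed + " " + history.state + " " + snapshot.state);
    }
}
```

The point of the sketch is the behaviour missing from the legacy application: a failure on one database forces a rollback on the other.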
Fortunately there are a couple of open source JTA providers that are supposed to play nicely with Spring - the most visible being Atomikos TransactionEssentials and Bitronix BTM. These solutions are also compatible with Hibernate and iBatis - another requirement as both of these technologies are used.
Atomikos TransactionEssentials
TransactionEssentials seemed to be the most popular open source JTA provider, so I gave this a go first. Integrating it was relatively straightforward:
- Create a couple of Atomikos beans and a property file
- Switch from the HibernateTransactionManager to the JtaTransactionManager
- Set the data source implementations to the respective XA proxy provided by Atomikos
- Configure the database driver class to be an XA variant (org.postgresql.xa.PGXADataSource in my case)
- Provide some additional configuration to the Hibernate session factory.
Bitronix BTM
Bitronix also seemed fairly popular and looked to be reasonably actively maintained so I chose to try this out next. Integration was trivial this time around:
- Swap a set of Atomikos Spring beans with a smaller number of Bitronix beans
- Change XA driver classes from Atomikos versions to the Bitronix equivalents
- Modify a few driver property names
Avoiding JNDI
Hibernate's default JTATransactionFactory implementation uses JNDI to look up the UserTransaction. This might make sense for applications running in a container, but we have no need for JNDI in our Spring-based configuration, since I can obtain the UserTransaction instance directly from the application context. To wire this into Hibernate I created a simple JTATransactionFactory that bypasses JNDI with a direct reference to the bean. This has to be instantiated and configured in the context and must be created before the SessionFactory.
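The ordering constraint is easier to see in code. Below is a rough sketch of one way such a factory could be wired, assuming a static holder is used to pass the bean across; that is my guess at the mechanism, not necessarily what the original factory does. To keep it runnable on a bare JDK, `UserTx` stands in for javax.transaction.UserTransaction and `DirectTxFactory` stands in for the Hibernate subclass. Both names are hypothetical.

```java
// Hypothetical stand-in for javax.transaction.UserTransaction.
interface UserTx {
    String id();
}

// Stand-in for the custom transaction factory. The framework creates its own
// instance from a class name, so the context-managed transaction is handed
// over through a static field (an assumption made for this sketch).
class DirectTxFactory {
    private static volatile UserTx shared;

    // Called once from the application context, before the SessionFactory is built.
    static void setUserTransaction(UserTx tx) {
        shared = tx;
    }

    // What the framework-created instance calls instead of doing a JNDI lookup.
    UserTx getUserTransaction() {
        UserTx tx = shared;
        if (tx == null) {
            throw new IllegalStateException(
                "UserTransaction not set; configure this factory before the SessionFactory");
        }
        return tx;
    }
}

class DirectTxFactoryDemo {
    public static void main(String[] args) throws Exception {
        UserTx fromContext = () -> "user-transaction-bean";
        DirectTxFactory.setUserTransaction(fromContext);

        // Mimic the framework instantiating the factory reflectively.
        DirectTxFactory created = DirectTxFactory.class.getDeclaredConstructor().newInstance();
        System.out.println(created.getUserTransaction() == fromContext);
    }
}
```

If the holder is still empty when the factory is first used, the lookup fails, which is why the bean has to exist before the SessionFactory.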
Multiple processes using Bitronix on a single machine
Unfortunately, the legacy application in question runs as multiple separate JVM processes on a single machine. The application really should have been implemented in a multi-threaded way, but unfortunately for me, the current maintainer, it wasn’t, and a big refactor is not feasible at this time. This multi-process pattern posed a problem for Bitronix, as it stores its transaction logs in default locations and we don’t want more than one transaction manager updating a given log file. Fortunately, Bitronix can use alternative log file names, and I was able to set a unique log file name per process:
bitronix.tm.journal.disk.logPart1Filename=/var/run/myapp/${nodeName}-btm1.tlog
bitronix.tm.journal.disk.logPart2Filename=/var/run/myapp/${nodeName}-btm2.tlog
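As an illustration of the naming scheme, the following sketch builds the same two property values in plain Java for a given node name. It assumes each JVM is started with a unique `-DnodeName=...` system property. The `JournalNames` helper is hypothetical and does not call Bitronix at all.

```java
import java.util.Properties;

class JournalNames {
    // Builds the two journal file properties for one process.
    // Only the property keys come from the configuration above; the rest is illustrative.
    static Properties forNode(String logDir, String nodeName) {
        if (nodeName == null || nodeName.isEmpty()) {
            throw new IllegalArgumentException(
                "nodeName must be set so that each JVM gets its own journal files");
        }
        Properties props = new Properties();
        props.setProperty("bitronix.tm.journal.disk.logPart1Filename",
                logDir + "/" + nodeName + "-btm1.tlog");
        props.setProperty("bitronix.tm.journal.disk.logPart2Filename",
                logDir + "/" + nodeName + "-btm2.tlog");
        return props;
    }
}

class JournalNamesDemo {
    public static void main(String[] args) {
        // Assumption: launched with -DnodeName=<unique id>; "node1" is only a fallback for the demo.
        String nodeName = System.getProperty("nodeName", "node1");
        Properties props = JournalNames.forNode("/var/run/myapp", nodeName);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Rejecting an empty node name is deliberate: two processes silently falling back to the same journal files is exactly the situation to avoid.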
Additional database resources used by XA
The databases in question were both running in PostgreSQL instances. For the vendor XA driver to function, the database server's max_prepared_transactions configuration option must be set to a suitable non-zero value (it is 0 by default). This configuration option only takes effect after a restart. In my case, for one of the database servers I’m using, a restart is no small matter, so it just wasn’t practical to change this setting in the short term. However, I still needed to get my fix out to improve the transactional behaviour of the application as soon as possible.
I decided to forego a little robustness and use a Last Resource Commit optimization. This allows one non-XA resource to participate in the distributed transaction. The transaction manager always commits transactions on this datasource last, so that it can still roll back the XA datasources if that commit fails. Implementing the optimization was straightforward: I switched the XA datasource proxy implementation for the database in question to bitronix.tm.resource.jdbc.lrc.LrcXADataSource. With this in place I had everything working again.
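The ordering behind Last Resource Commit can be sketched in a few lines of plain Java. This is a simplified model, not Bitronix's implementation, and `XaParticipant`, `LastResource` and `LastResourceCoordinator` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; these are not the JTA or Bitronix interfaces.
interface XaParticipant {
    boolean prepare();
    void commit();
    void rollback();
}

interface LastResource {
    void commit() throws Exception; // plain one-phase commit, may fail
    void rollback();
}

class LastResourceCoordinator {
    static boolean run(List<? extends XaParticipant> xaResources, LastResource last) {
        // Step 1: prepare only the XA resources.
        for (XaParticipant xa : xaResources) {
            if (!xa.prepare()) {
                rollbackAll(xaResources);
                last.rollback();
                return false;
            }
        }
        // Step 2: commit the non-XA resource; its outcome decides the transaction.
        try {
            last.commit();
        } catch (Exception e) {
            rollbackAll(xaResources); // still possible, they are only prepared
            return false;
        }
        // Step 3: the decision is "commit", so finish the XA resources.
        for (XaParticipant xa : xaResources) xa.commit();
        return true;
    }

    private static void rollbackAll(List<? extends XaParticipant> xaResources) {
        for (XaParticipant xa : xaResources) xa.rollback();
    }
}

// Recording implementations so the call order is visible.
class LoggingXa implements XaParticipant {
    final List<String> log;
    LoggingXa(List<String> log) { this.log = log; }
    public boolean prepare() { log.add("xa.prepare"); return true; }
    public void commit()     { log.add("xa.commit"); }
    public void rollback()   { log.add("xa.rollback"); }
}

class LoggingLast implements LastResource {
    final List<String> log;
    final boolean failOnCommit;
    LoggingLast(List<String> log, boolean failOnCommit) {
        this.log = log;
        this.failOnCommit = failOnCommit;
    }
    public void commit() throws Exception {
        if (failOnCommit) {
            log.add("last.commit-failed");
            throw new Exception("commit failed");
        }
        log.add("last.commit");
    }
    public void rollback() { log.add("last.rollback"); }
}

class LastResourceDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        boolean ok = LastResourceCoordinator.run(
                Arrays.asList(new LoggingXa(log)), new LoggingLast(log, true));
        System.out.println(ok + " " + log);
    }
}
```

As far as I understand it, the robustness given up sits between steps 2 and 3: if the process dies after the non-XA commit but before the XA commits complete, recovery is less clear-cut than with full XA on both sides.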