Welcome!

AJAX & REA Authors: John Funnell, Bob Little, Kevin Hoffman, Maureen O'Gara, Onkar Singh

Related Topics: Java

Java: Article

Java Feature — The Holy Grail of Database Independence

Part 3: When your application has to support more than one database type, it makes sense to find a tool that can create the sche

Populating the Data
Most applications will require that some of the database tables be initialized with data. HQ certainly does. The criteria for our data population is the same as schema creation, it has to be database-independent to support the various databases we support (or once supported). Hibernate doesn't have any tools to support this task. You can, of course, do this programmatically through the application using Hibernate POJOs. As a developer, you wouldn't have to use a separate tool or learn another configuration format. However, this is not the most straightforward approach, you'll probably write more code than you need. There are tools available for such things. DbUnit Framework will easily populate a database with an XML file containing the initialization data. As for us, we had rolled our data initialization functions into DBSetup.

We began building HQ in 2001; there was nothing available in open source. In fact, we even considered creating an open source project for DBSetup, but decided that we didn't have the bandwidth to maintain the project in the open source world. We defined the data in a database-independent format in XML files that are separate from the aforementioned schema XML files, and DBSetup would, of course, generate the SQL according to the database column types. An insertion into the SUBJECT table looks like this:

<table name="EAM_SUBJECT">
   <data id="1"
       name="admin"
       first_name="System"
       last_name="User"
       factive="TRUE"/>
</table>

Since we designed HQ as a conglomerate of subsystems (authorization, measurement, control, etc.), we organized the database initialization by having a schema XML file and a data XML file for each of the subsystems. After our conversion to Hibernate, we kept the data population function of DBSetup and switched over to hbm2ddl for the schema creation.

Upgrading the Database Schema
Hyperic has some loyal customers, some of whom date back to HQ 1.0 (thanks for sticking with us). However, this means that we have to carry forward the customers' data from some very different schemas from previous versions. Upgrading the schema is an area that is the least explored by existing tools, but represents one of the greatest challenges that we faced. hbm2ddl has an update option that attempts to create an upgrade script that calculates the delta between what's in the database and the mapping files. However, there's a fairly unsettling warning in Hibernate's tools documentation regarding this operation:

(Do *not* use against production databases, no guarantees at all that the proper delta can be generated or that the underlying database can actually execute the needed operations.)

In our development, we found that changes to existing columns and new indexes were not created by the update option, not to mention that it didn't support alteration of the stored data as a part of the migration. Clearly, this is inadequate for our purposes, since we have customers that rely on our software to manage their production machines. We had previously developed another database tool, also integrated with Ant, called DBUpgrader. Unfortunately, DBUpgrader isn't the most elegant of solutions. It supports some basic database functions, such as change column type, add new column, and data insertion. It's fairly cumbersome to maintain, because RDBMSes were different in their SQL variations, particularly for alterations and index creation. So there were a lot of direct SQL statements (we tag each statement for a specific database type as needed). After creating a fresh schema through hbm2ddl from our Hibernate mapping files, we saw that we had a lot of work to do to migrate our previous schema versions to match what Hibernate created. The process was pretty brute force, we simply dumped the new database schema to a file and used it as a reference to hand-code just about every new table, column, index, constraint, as well as transform and migrate the existing data and make column type changes. Perhaps there will be a better tool someday, but we stuck to a tool that worked for us, as cumbersome as it was (the entire upgrade XML file clocks in at about 5000 lines).

Container Managed Relationships (CMR)
Bidirectional CMRs are one of the very few instances where one might consider EJB2 superior to Hibernate. Take the simple example of the parent/child relationship. In EJB2, you can set the association from either side of the relationship. This means that you can either do child.setParent(parent) or parent.getChildren().add(child) in the application, and expect the framework to update the appropriate entities and automatically persist the foreign key relationship to the database. However, this is not the case with Hibernate. Hibernate doesn't have "true" bidirectional association. The issue is with the data integrity, and being able to create a data schema in the RDBMS with the correct constraints that mirror the object model. In this case, a child couldn't exist without a parent, so we want to put a non-null foreign key constraint on the PARENT column on the CHILD table in the database. However, this means that you'd be prevented from performing an operation like parent.getChildren().remove(child), because the child would be orphaned without a parent, which is against the model. For EJB2, it means you forgo that constraint, and actually allow nulls in the child's PARENT column. However, it made coding much easier. In Hibernate, you'd define one side of the relationship as read-only using the attribute inverse="true." Hibernate persists the association only through the non-inverse side of the relationship. What does that mean? If we were to define the inverse attribute in the parent's mapping (one-to-many), it means that the parent's children collection is read-only. Hibernate doesn't update the association through any modification to the parent's children collection.

Consider the following code:

Parent alice = new Parent();
Parent jane = new Parent();
Child mary = new Child();
mary.setParent(alice);
jane.getChildren().add(mary);

Who ends up being mary's parent? Since we defined the inverse attribute on the parent, it means that Hibernate persists the association only when we invoke setParent() on the child. In the end, mary's parent is alice. The effect of this is that we end up having to code the association on both sides to be safe. To delete a child, you have to make sure to delete the child POJO and from its parent, otherwise you'd get the dreaded "deleted object would be re-saved by cascade (remove deleted object from associations)" exception, because the parent object would still retain the child object in its collection and not remove it automatically. We certainly had to work through plenty of those problems before the application functioned correctly as before, since we became reliant on the flexibility of being able to associate entities from either side of the relationship. However, having a stricter set of constraints also guarantees that you don't end up with mysteriously orphaned rows in the database and a whole different set of problems (which we had experienced with EJB2).

Successful Quest?
Obviously, the end result is not a fully database-independent application. As of today, Hyperic HQ only supports Oracle and PostgreSQL backends. However, the migration to Hibernate improved portability on many fronts. There's also stronger mapping between Java objects and the database persistence. The tools aren't complete, and we are maintaining some tools that leave much to be desired. We'll soon put this recipe to the test, though. By the time you read this, we will probably have ported Hyperic HQ to run on additional databases. In my next article, we'll move into the application layer and examine how Hibernate queries the database and compare the merits between SQL, EJBQL, and HQL.

More Stories By Charles Lee

Charles Lee is co-founder and vice-president of engineering of Hyperic. Prior to co-founding Hyperic, Lee was a senior software engineer at Covalent. There, he built Covalent's configuration management product for Apache (CMP), and he spearheaded and architected the application management software (CAM). Before Covalent, Lee developed a document management system for retail store build-outs based on open-source technology at WiseConnect. Lee also held senior engineering position at Hewlett-Packard, where he was instrumental in developing print drivers for network LaserJets for the Asian market, as well as developing the UI framework used for LaserJets for all markets. Lee also developed the first GUI printer configuration framework for AutoCAD while a senior engineer at Autodesk. Lee was an early engineer at Backflip, where he created the document publishing system for the website based on mod_perl.

Lee received his BS in Computer Science and BA in Chemistry with honors from the University of Washington.

Comments (1) View Comments

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


Most Recent Comments
Stan Martin 05/20/07 01:50:53 PM EDT

you should check out www.kitikat.com for an alternative to entity beans or pojos.