| Close Window |
The hope of using any persistence framework is absolute database independence. Database independence means that you can focus on your job as an application developer and not a DBA. However, no framework can fully make this claim. There's much more to running an application on a database than simply issuing compatible SQL queries and getting back the query results as expected. In my last article, I detailed the process by which we converted existing Enterprise Java Beans 2 (EJB2) Entity beans to Hibernate Plain Old Java Objects (POJOs). This article is less about our conversion process and more about the tools and methods we chose to work with for the Hibernate implementation and the backend databases (Oracle and PostgreSQL) supported by Hyperic HQ.
Creating the Database Schema
The relational database management system (RDBMS) is the foundation of any application. After all,
the point of having a persistence layer is to map from the object model to the data model. The creation of the database schema is the most common task, and certainly there are plenty of point tools for various types of RDBMS around. However, when your application has to support more than one database type, it makes sense to find a tool that can create the schema regardless of database type. EJB2 provided no native tools for such a task, so we built our own tool, plainly named DBSetup. DBSetup is integrated with Ant and can easily incorporate into our build or installer (both of which rely on Ant). Its architecture is straightforward. There are base classes called Table, Column, Index, View, etc., that know how to generate the SQL commands to create themselves. Most RDBMS have proprietary extensions in their Data Definition Language (DDL), which allow control over non-standard features of the database system. If a RDBMS requires non-standard commands, you just subclass the base class, for example, the OracleTable class that can return the correct SQL to create a table in Oracle. We defined our own XML file format for the database schema, a proprietary DDL, if you will. DBSetup generates the sequence of database-specific commands in a single script and piped it to the database to create the schema. For example, here's how we would define a table named SUBJECT:
<table name="SUBJECT">
<column name="ID"
default="sequence-only"
initial="10001"
primarykey="true"
required="true"
type="INTEGER"/>
<column name="NAME"
required="true"
size="100"
type="VARCHAR2"/>
<column name="FIRST_NAME"
required="false"
size="100"
type="VARCHAR2"/>
<column name="LAST_NAME"
required="false"
size="100"
type="VARCHAR2"/>
<column name="FSYSTEM"
type="BOOLEAN"
default="FALSE"/>
<index name="SUBJECT_NAME_KEY"
unique="true">
<field ref="NAME"/>
</index>
</table>
The syntax is self-explanatory. DBSetup worked fine for us, but it meant that for any new database type that we want to support, we'd have to analyze that database's command syntax and create subclasses as needed. As I had mentioned in my last article, we went through supporting Oracle, Pointbase, Cloudscape, InstantDB, MySQL, and PostgreSQL databases, and maintaining DBSetup to be compatible with all of them was tedious. Besides, there was no association between the tables being created here and the entity beans used by the application.
Hibernate provides a better tool for schema population, hbm2ddl. It is also integrated with Ant. It lets you run the task against your Hibernate mapping files (HBMs) and generate the resulting data definition language (DDL) in a file or to be exported directly into the database. Since we chose not to use the annotation feature with Hibernate, we hard-coded the HBM files. (Note that even if you were to use annotations, you can still use Hibernate tools to create the schema with your mapping.) We were able to convert our DBSetup schema files to HBM files relatively fast due to the structural similarities. Furthermore, we simplified our manually maintained Hibernate configuration files by offloading our defaults into an XSLT transformation process at build time, so our HBM files can be as minimal as possible.
Hibernate has classes that support various database dialects, so all of a sudden we've gained the ability to create schemas for a wide variety of databases without doing additional work ourselves. Hibernate's Web site lists the supported database types:
• Oracle 8i, 9i, 10g
• DB2 7.1, 7.2, 8.1
• Microsoft SQL Server 2000
• Sybase 12.5 (JConnect 5.5)
• MySQL 3.23, 4.0, 4.1, 5.0
• PostgreSQL 7.1.2, 7.2, 7.3, 7.4, 8.0, 8.1
• TimesTen 5.1, 6.0
• HypersonicSQL 1.61, 1.7.0, 1.7.2, 1.7.3, 1.8
• SAP DB 7.3
• InterSystems Cache' 2007.1
It's good to know that we have options.
Populating the Data
Most applications will require
that some of the database tables be initialized with data. HQ certainly
does. The criteria for our data population is the same as schema
creation, it has to be database-independent to support the various
databases we support (or once supported). Hibernate doesn't have any
tools to support this task. You can, of course, do this
programmatically through the application using Hibernate POJOs. As a
developer, you wouldn't have to use a separate tool or learn another
configuration format. However, this is not the most straightforward
approach, you'll probably write more code than you need. There are
tools available for such things. DbUnit Framework will easily populate
a database with an XML file containing the initialization data. As for
us, we had rolled our data initialization functions into DBSetup.
We began building HQ in 2001; there was nothing available in open source. In fact, we even considered creating an open source project for DBSetup, but decided that we didn't have the bandwidth to maintain the project in the open source world. We defined the data in a database-independent format in XML files that are separate from the aforementioned schema XML files, and DBSetup would, of course, generate the SQL according to the database column types. An insertion into the SUBJECT table looks like this:
<table name="EAM_SUBJECT">
<data id="1"
name="admin"
first_name="System"
last_name="User"
factive="TRUE"/>
</table>
Since we designed HQ as a conglomerate of subsystems (authorization, measurement, control, etc.), we organized the database initialization by having a schema XML file and a data XML file for each of the subsystems. After our conversion to Hibernate, we kept the data population function of DBSetup and switched over to hbm2ddl for the schema creation.
Upgrading the Database Schema
Hyperic has some
loyal customers, some of whom date back to HQ 1.0 (thanks for sticking
with us). However, this means that we have to carry forward the
customers' data from some very different schemas from previous
versions. Upgrading the schema is an area that is the least explored by
existing tools, but represents one of the greatest challenges that we
faced. hbm2ddl has an update option that attempts to create an upgrade
script that calculates the delta between what's in the database and the
mapping files. However, there's a fairly unsettling warning in
Hibernate's tools documentation regarding this operation:
(Do *not* use against production databases, no guarantees at all that the proper delta can be generated or that the underlying database can actually execute the needed operations.)
In our development, we found that changes to existing columns and new indexes were not created by the update option, not to mention that it didn't support alteration of the stored data as a part of the migration. Clearly, this is inadequate for our purposes, since we have customers that rely on our software to manage their production machines. We had previously developed another database tool, also integrated with Ant, called DBUpgrader. Unfortunately, DBUpgrader isn't the most elegant of solutions. It supports some basic database functions, such as change column type, add new column, and data insertion. It's fairly cumbersome to maintain, because RDBMSes were different in their SQL variations, particularly for alterations and index creation. So there were a lot of direct SQL statements (we tag each statement for a specific database type as needed). After creating a fresh schema through hbm2ddl from our Hibernate mapping files, we saw that we had a lot of work to do to migrate our previous schema versions to match what Hibernate created. The process was pretty brute force, we simply dumped the new database schema to a file and used it as a reference to hand-code just about every new table, column, index, constraint, as well as transform and migrate the existing data and make column type changes. Perhaps there will be a better tool someday, but we stuck to a tool that worked for us, as cumbersome as it was (the entire upgrade XML file clocks in at about 5000 lines).
Container Managed Relationships (CMR)
Bidirectional CMRs are one of the very few instances where one might
consider EJB2 superior to Hibernate. Take the simple example of the
parent/child relationship. In EJB2, you can set the association from
either side of the relationship. This means that you can either do
child.setParent(parent) or parent.getChildren().add(child) in the
application, and expect the framework to update the appropriate
entities and automatically persist the foreign key relationship to the
database. However, this is not the case with Hibernate. Hibernate
doesn't have "true" bidirectional association. The issue is with the
data integrity, and being able to create a data schema in the RDBMS
with the correct constraints that mirror the object model. In this
case, a child couldn't exist without a parent, so we want to put a
non-null foreign key constraint on the PARENT column on the CHILD
table in the database. However, this means that you'd be prevented from
performing an operation like parent.getChildren().remove(child),
because the child would be orphaned without a parent, which is against
the model. For EJB2, it means you forgo that constraint, and actually
allow nulls in the child's PARENT column. However, it made coding much
easier. In Hibernate, you'd define one side of the relationship as
read-only using the attribute inverse="true." Hibernate persists the
association only through the non-inverse side of the relationship. What
does that mean? If we were to define the inverse attribute in the
parent's mapping (one-to-many), it means that the parent's children
collection is read-only. Hibernate doesn't update the association
through any modification to the parent's children collection.
Consider the following code:
Parent alice = new Parent();
Parent jane = new Parent();
Child mary = new Child();
mary.setParent(alice);
jane.getChildren().add(mary);
Who ends up being mary's parent? Since we defined the inverse attribute on the parent, it means that Hibernate persists the association only when we invoke setParent() on the child. In the end, mary's parent is alice. The effect of this is that we end up having to code the association on both sides to be safe. To delete a child, you have to make sure to delete the child POJO and from its parent, otherwise you'd get the dreaded "deleted object would be re-saved by cascade (remove deleted object from associations)" exception, because the parent object would still retain the child object in its collection and not remove it automatically. We certainly had to work through plenty of those problems before the application functioned correctly as before, since we became reliant on the flexibility of being able to associate entities from either side of the relationship. However, having a stricter set of constraints also guarantees that you don't end up with mysteriously orphaned rows in the database and a whole different set of problems (which we had experienced with EJB2).
Successful Quest?
Obviously, the end result is not
a fully database-independent application. As of today, Hyperic HQ only
supports Oracle and PostgreSQL backends. However, the migration to
Hibernate improved portability on many fronts. There's also stronger
mapping between Java objects and the database persistence. The tools
aren't complete, and we are maintaining some tools that leave much to
be desired. We'll soon put this recipe to the test, though. By the time
you read this, we will probably have ported Hyperic HQ to run on
additional databases. In my next article, we'll move into the
application layer and examine how Hibernate queries the database and
compare the merits between SQL, EJBQL, and HQL.
© 2008 SYS-CON Media Inc.