Data Access support in builtin CMP engine

Note: There are inaccuracies still in these FAQs. Work in progress ….

What are the types of container managed fields that can be persisted by the Inprise CMP engine?

You may look at the “data” EJB example. Inprise EJB Container CMP supports all the types in the JDBC specification, and then a couple more. These include:
Basic types:
boolean/Boolean
byte/Byte
char/Character 
double/Double
float/Float
int/Integer
long/Long
short/Short
String
BigDecimal
byte[]
java.sql.Date/java.sql.Time/java.sql.Timestamp
java.util.Date

Complex Types:
Any class implementing java.io.Serializable, e.g. Vector, Hashtable
Other entity bean references
Fields that are JavaBeans
Note that Hashtable, Vector, etc. are supported by virtue of the fact that it implements java.io.Serializable. Many other datatypes (Java collections, 3rd party collections) are probably supported for the same reason. However, note that these are stored by serializing their state and storing it into a BLOB. We are not mapping a Java collection in any "smart" way, just storing their binary format. Essentially the following rule is observed by the CMP engine: if the type is not one of the explicitly supported types (e.g., not one of the types as mentioned in the list, previously), the type will be serialized as a BLOB.

A BLOB in this context is whatever LONGVARBINARY maps to. (As per the JDBC spec) In Oracle, this is LONG RAW.

What about the performance impact of storing as BLOBs?

The concerns of course run to performance. We provide the ability to cache CMP objects to avoid too much BLOBing & updates. Of course, caching can only be done if the EJB container has exclusive access to the given table.

How about "half-complex" types such as a class containing only simple types (boolean, int, ...) e.g. a JavaBean? Are they also stored as BLOBS?

JavaBeans are handled elegantly. We do not have a sophisticated mapping for any arbitrary Seriallizable class. You can do pretty good mapping from JavaBean attributes to table columns.

Do you support database specific abstract types using setObject/getObject?

These accessors are notoriously vendor specific. What we are doing is for instances which are java.io.Serializable, storing these as a byte object serialization in a LONGVARBINARY. This is pretty much the same thing as calling setObject/getObject for a database that knows how to serailize an arbitrary Java object, but has the advantage that it works for JDBC drivers which don't support this behavior as well.

Basically, setObject and getObject are calls on a PreparedStatement and ResultSet, respectively, which allow the user to use java.lang.Objects in the SQL statement. The JDBC documentation contains the following statement: "Note that this method may be used to pass database specific abstract data types, by using a Driver specific Java type." Thus, some databases may support Java types which go beyond the primitive SQL data types. In particular, some databases allow any serialized Java object to be put into or retrieved.

We do not rely on this driver specific behavior, but instead use Java serialization to convert the Java object to/from a byte array, and use (LONG)VARBINARY type to store the data. This is more portable, but may be slower than using setObject/getObject directly. (Actually, we doubt it is slower, since presumably most JDBC drivers simply do the serialization themselves under the covers.)

How do you map char/Character types?

You will notice that these data types are missing from the JDBC specification, in that there is no setChar/getChar methods. Many jdbc drivers have filled this hole in their own proprietary ways. Our code is based on generic JDBC, and we had to improvise for this missing data type. We support char/Character using setString/getString.

How does the CMP engine know the username, password, etc to use for opening a database connection?

For every entity one should specify a DataSource that the CMP engine will use to access the database via JDBC. Define a DataSource in the beans's XML DD and refer to that using a resource-ref.

In the vendor specific XML file where you provide the jndi binding for the resource-ref, add the element

<cmp-resource>True</cmp-resource>. 

We have a user friendly feature where if the entity declares only one resource-ref we automatically choose that as the cmp-resource without the user having to explicitly state so using the above XML element.

When does the Container automatically create tables?

The Inprise Container can automatically create tables for container-managed entities based on the entity's container-managed fields. It is intended as an “ease of use” features where the database interaction for CMP entity beans are completely taken care of by the Container. However table creation and datatype mappings vary from vendor to vendor. One can think that we can reverse engineer the database dialect from the JDBC URL. It has been found to be error prone and dicey. So we ask that the user specify the dialect and then we take it from there. Also, automatically a table is only created if the JDBC dialect is specified. This means that the user has to explicitly set the dialect to cause tables to be created (and thus the container cannot get it wrong).

The only exception is with DataStore, where it is easy to detect the URL/dialect, and where, since the DB runs in-process, it is difficult to to create the tables otherwise.

So for DataStore, we always try to create the tables. For other dialects, we only try if the dialect is specified in the deployment descriptor.

What are the “dialect” values?

The names used for dialects (ignoring case) are:
    "none",
    "jdatastore",
    "oracle",
    "sybase",
    "mssqlserver",
    "db2",
    "interbase",
    "informix",

Are PreparedStatements reused to enhance performance?

Yes. We reuse them across transactions. Some databases may not like that so we provide a way to turn it off by setting a property in the DataSource.
reuseStatements = false/true
        Set statement reuse across transctions to be on or off.
        On by default, except for "datastore" and "db2" dialects

How do you handle database specific variations in the correspondence between Java types and SQL types?

The Inprise Container can automatically create tables for container-managed entities based on the entity's container-managed fields. To do this, we use the mapping from Java type to SQL type. However, the names of SQL types differs from one database to another.

For example, the canonical mapping, as per the JDBC specification is:

      Boolean.class              BIT
      Byte.class                 TINYINT
      Double.class               DOUBLE
      Float.class                REAL
      Integer.class              INTEGER
      Long.class                 BIGINT
      Short.class                SMALLINT
      String.class               VARCHAR
      java.math.BigDecimal.class NUMERIC
      byte[].class               VARBINARY
      java.sql.Date.class        DATE
      java.sql.Time.class        TIME
      java.sql.Timestamp.class   TIMESTAMP
      java.util.Date.class       TIMESTAMP
      java.io.Serializable.class VARBINARY

However, for JDataStore, this has to be modified as follows:

    Boolean.class              BOOLEAN
    Byte.class                 SMALLINT
    Float.class                FLOAT
    Long.class                 LONG
    byte[].class               OBJECT
    java.io.Serializable.class OBJECT
E.g., DataStore does not have a BIT type, but instead uses type BOOLEAN.

Similarly, for Oracle, all the various numeric types map to NUMBER. Also, Oracle does not support a VARCHAR without a specific size, so String maps to VARCHAR(100).

We have acknowledged this and the Container has builtin knowledge of vendor specific mappings. We currently support and tested against the following databases: Oracle, Sybase, Informix, InterBase, JdataStore, Microsoft SQL Server, DB2. The user need only specify the database dialect as part of the CMP datasource information in the Vendor specific XML DD.

We use the following table to do type mapping:

{ // DEFAULT dialect...
        "BIT",			// boolean
        "TINYINT",		// byte
        "CHAR(1)",		// char
        "DOUBLE",		// double
        "REAL",			// float
        "INTEGER",		// int
        "BIGINT",		// long
        "SMALLINT",		// short
        "VARCHAR",		// String
        "NUMERIC",		// java.math.BigDecimal
        "VARBINARY",		// byte[]
        "DATE",			// java.sql.Date
        "TIME",			// java.sql.Time
        "TIMESTAMP",		// java.sql.Timestamp
        "TIMESTAMP",		// java.util.Date
        "VARBINARY",		// java.io.Serializable
      };
    }
    { // JDATASTORE dialect...
        "BOOLEAN",		// boolean
        "SMALLINT",		// byte
        "CHAR(1)",		// char
        "DOUBLE",		// double
        "FLOAT",		// float
        "INTEGER",		// int
        "LONG",			// long
        "SMALLINT",		// short
        "VARCHAR",		// String
        "NUMERIC",		// java.math.BigDecimal
        "OBJECT",		// byte[]
        "DATE",			// java.sql.Date
        "TIME",			// java.sql.Time
        "TIMESTAMP",		// java.sql.Timestamp
        "TIMESTAMP",		// java.util.Date
        "OBJECT",		// java.io.Serializable
      };
    }
    { // ORACLE dialect...
        "NUMBER(1,0)",		// boolean
        "NUMBER(3,0)",		// byte
        "CHAR(1)",		// char
        "NUMBER",		// double
        "NUMBER",		// float
        "NUMBER(10,0)",		// int
        "NUMBER(19,0)",		// long
        "NUMBER(5,0)",		// short
        "VARCHAR2(2000)",	// String
        "NUMBER(38)",		// java.math.BigDecimal
        "RAW(255)",		// byte[]
        "DATE",			// java.sql.Date
        "DATE",			// java.sql.Time
        "DATE",			// java.sql.Timestamp
        "DATE",			// java.util.Date
        "RAW(255)",		// java.io.Serializable
      };
    }
    { // SYBASE/MSSQL dialects...
        "BIT",			// boolean
        "TINYINT",		// byte
        "CHAR(1)",		// char
        "FLOAT",		// double
        "REAL",			// float
        "INT",			// int
        "NUMERIC(19,0)",	// long
        "SMALLINT",		// short
        "TEXT",			// String
        "DECIMAL(28,28)",	// java.math.BigDecimal
        "IMAGE",		// byte[]
        "DATETIME",		// java.sql.Date
        "DATETIME",		// java.sql.Time
        "DATETIME",		// java.sql.Timestamp
        "DATETIME",		// java.util.Date
        "IMAGE",		// java.io.Serializable
      };
    }
    { // DB2 dialect...
        "SMALLINT",		// boolean
        "SMALLINT",		// byte
        "CHAR(1)",		// char
        "FLOAT",		// double
        "REAL",			// float
        "INTEGER",		// int
        "BIGINT",		// long
        "SMALLINT",		// short
        "VARCHAR(2000)",	// String
        "DECIMAL",		// java.math.BigDecimal
        "BLOB",			// byte[]
        "DATE",			// java.sql.Date
        "TIME",			// java.sql.Time
        "TIMESTAMP",		// java.sql.Timestamp
        "TIMESTAMP",		// java.util.Date
        "BLOB",			// java.io.Serializable
      };
    }
    { // INTERBASE dialect...
        "SMALLINT",		// boolean
        "SMALLINT",		// byte
        "CHAR(1)",		// char
        "DOUBLE PRECISION",	// double
        "FLOAT",		// float
        "INTEGER",		// int
        "NUMERIC(15,0)",	// long
        "SMALLINT",		// short
        "VARCHAR(2000)",	// String
        "NUMERIC(15,15)",	// java.math.BigDecimal
        "BLOB",			// byte[]
        "DATE",			// java.sql.Date
        "DATE",			// java.sql.Time
        "DATE",			// java.sql.Timestamp
        "DATE",			// java.util.Date
        "BLOB",			// java.io.Serializable
      };
    }
    { // INFORMIX dialect...
        "SMALLINT",		// boolean
        "SMALLINT",		// byte
        "CHAR(1)",		// char
        "FLOAT",		// double
        "SMALLFLOAT",		// float
        "INTEGER",		// int
        "DECIMAL(19,0)",	// long
        "SMALLINT",		// short
        "VARCHAR(2000)",	// String
        "DECIMAL(32)",		// java.math.BigDecimal
        "BYTE",			// byte[]
        "DATE",			// java.sql.Date
        "DATE",			// java.sql.Time
        "DATE",			// java.sql.Timestamp
        "DATE",			// java.util.Date
        "BYTE",			// java.io.Serializable
      };
    }

I ran into trouble using SQL keywords (like count or sum) for cmp fields or table names. Can the Container catch these?

It is unfortunate that every vendor adds their own keywords (e.g. Oracle: VARCHAR2) but we at least look for the SQL92 ones. As a workaround you can specify the name of the column associated with a CMP field explicitly, if you don't want to modify your code. For example, if you have a CMP field called "select", you can have this map to a column called "SLCT" via the following environment property:
<cmp-info>
            <database-map>
              <table>Data</table>
              <column-map>
                <field-name>select</field-name>
                <column-name>SLCT</column-name>
              </column-map>
            </database-map>
</cmp-info>

We have an integer field in a table that allows null values. If an entity bean is created for a row in the table and the integer value is null, an exception occurs.

If you have database values which may contain SQL nulls, then you must have Java data types which may contain Java nulls.

Typically, this means using an "Integer" instead of an "int", or a "Float" instead of a "float".