Object Database Use and Features

Databases provide persistence. For an object database this means that objects are stored in the database and remain available between runs of the application.
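
As a rough illustration of persistence, the sketch below uses Python's standard `shelve` module as a stand-in for an object database (it is a simple keyed object store, not a full ODBMS). The `Account` class, the key `"alice"`, and the file location are assumptions made up for the example.

```python
import os
import shelve
import tempfile


class Account:
    """An ordinary class; its instances are what gets persisted."""

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


# Hypothetical location for the database file.
db_path = os.path.join(tempfile.mkdtemp(), "accounts")

# First "run": store an object under a key, then close the database.
with shelve.open(db_path) as db:
    db["alice"] = Account("Alice", 100)

# Second "run": reopen the database and get the object back.
with shelve.open(db_path) as db:
    restored = db["alice"]

# Only the attributes were stored; deposit() comes from the Account class.
print(restored.owner, restored.deposit(50))
```

Here the store keeps the object's data and rebuilds the object from its class when it is loaded, so the class definition must be available to the program that reads it back.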

Features

Object databases may support the following capabilities:

* Support for the object-oriented language you want to use.
* Support for object-oriented concepts.
o Aggregation - Objects that are composed of other objects.
o Encapsulation - Data is stored together with its methods. Not all databases store the methods; some rely on the classes defined in the schema to reconstruct the object with its methods.
o Inheritance - Classes inherit attributes and behavior from parent classes.
o Polymorphism - Allows methods with the same name to behave differently depending on the object's class. An operation can be specified once and then shared with, or overridden by, other classes.
* Distributed Architecture - Objects are shared in a distributed environment, or the entire database may be replicated on multiple computers.
* Heterogeneous environment (cross-platform support) - The database may be able to run on various hardware platforms and operating systems.
* Transaction processing - Some databases provide transaction processing, which ensures that either the entire transaction is applied or none of it is. Transactions support concurrency and data recovery: a failure part-way through a transaction causes its changes to be rolled back.
* Concurrency - The database must keep data consistent when concurrent access is allowed. Concurrent access means more than one application or thread may be reading or updating the same data at the same time. This should not be confused with two-phase commit, which coordinates a single transaction across several databases or servers so that all of them commit or none do. Concurrency control may use data locking for reads or writes. Some methods of concurrency control include:
o Pessimistic control - When one or more processes are reading the data, updates to it cannot be made.
o Multiread - Updates are not blocked, but the data must still be as it was when the transaction began. In other words, if the data was read and then changed by another process before the transaction saves it, the transaction is not valid until the data is read again. This approach is commonly called optimistic concurrency control.
* Object relationships - Object relationships define associations between objects and whether a relationship can be traversed in one direction or in both. Two-way relationships may allow for garbage collection and are generally the better option.
* Database Garbage Collection - Whether the database removes objects that are no longer referenced by any other object in the database. This relies on bi-directional object relationships and relieves external programs of tracking the use of object pointers.
* Relationship cardinality - Supported relationships may include any combination of:
o One to one.
o One to many.
o Many to one.
o Many to many.

The database should support all these.
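
The relationship items above (two-way relationships, cardinality, and garbage collection) can be sketched in a few lines. This is a minimal in-memory illustration, not any product's mechanism: the `Author` and `Book` classes and the helper functions `link`, `delete_author`, and `unreferenced` are hypothetical names chosen for the example.

```python
class Author:
    def __init__(self, name):
        self.name = name
        self.books = set()      # forward side: author -> books


class Book:
    def __init__(self, title):
        self.title = title
        self.authors = set()    # inverse side: book -> authors


def link(author, book):
    """Create a many-to-many link and keep both directions in sync."""
    author.books.add(book)
    book.authors.add(author)


def delete_author(author):
    """Remove an author without leaving dangling references behind."""
    for book in list(author.books):
        book.authors.discard(author)
    author.books.clear()


def unreferenced(books):
    """Books no author points to; the inverse side answers this directly."""
    return [b for b in books if not b.authors]


ann, bob = Author("Ann"), Author("Bob")
compilers, networks = Book("Compilers"), Book("Networks")
link(ann, compilers)
link(bob, networks)

delete_author(ann)
garbage = unreferenced([compilers, networks])
print([b.title for b in garbage])   # prints ['Compilers']
```

Because each book knows which authors refer to it, the deletion can clean up both sides, and finding collectable objects is a simple check of the inverse side rather than a search through every author.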

* Transparent persistence - Data is manipulated directly in the object-oriented language. Often a persistence-capable base class or a persistence interface is used to implement persistence, and some vendors consider this to be transparency. If an interface is used, an intermediate interface of your own can insulate calls from the particular database, making it easier to change database vendors later.
* Database interface methods - These may include SQL, OQL, and a vendor application programming interface (API). See "Communications Support" below.
* Database Integrity - There are two types:
o Structural integrity - Database contents are consistent with the database schema. Referential integrity, which ensures objects do not contain references to deleted objects, requires bi-directional object relationships.
o Logical integrity - The logical properties of the data are correct. The data consistently has the correct values, and concurrent access does not cause incorrect values to be set.
* Object Versioning - A single object is represented by multiple versions. Two types are:
o Linear - Prior versions of the object are saved as the object is changed.
o Branch - Multiple users may update the object concurrently, each creating a separate branch of versions.
* Notification - Notification may be active or passive. A passive system can, at a minimum, determine whether an object has changed state. An active system informs an application when an object is modified.
* Indexing - Additional indexing may be provided to make data retrieval more efficient. Hashing and B-trees may be used.
* Security - Encryption of stored and/or transmitted data may be supported by some databases. Various products may also provide different authentication methods and levels of access to the database.
* Archiving and data recovery
* Fault tolerance - Features that provide for fault tolerance in the event of a hardware or software failure. Normally transaction processing provides software fault tolerance. Data replication to other servers on the network supports hardware fault tolerance.
* Data access - Data is normally accessed through an iterator, as though the stored objects were a collection. This way the objects do not all have to be loaded into memory before the desired object is obtained.
* Sorting - All objects of a given class, or of a parent class and its subclasses, may be obtained.
* Tools that can be used with the database.
* Amount of storage.

Method storage - The code that runs in objects and gives them behavior is stored in the database.
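
The multiread form of concurrency control in the feature list, combined with the all-or-nothing rule for transactions, can be sketched as a version check at commit time. This is a single-threaded toy under stated assumptions: the `Store` class, its `read` and `commit` methods, and `StaleObjectError` are hypothetical names, and a real database would additionally make the check-and-apply step atomic.

```python
class StaleObjectError(Exception):
    """Raised when an object changed after the transaction read it."""


class Store:
    """Toy in-memory store; each object carries a version counter."""

    def __init__(self):
        self._data = {}   # key -> (version, value)

    def read(self, key):
        return self._data.get(key, (0, None))

    def commit(self, writes):
        """Apply all writes or none. writes: key -> (version_read, new_value)."""
        # Validate every write before changing anything.
        for key, (version_read, _) in writes.items():
            current_version, _ = self.read(key)
            if current_version != version_read:
                raise StaleObjectError(key)
        # All checks passed: apply the writes and bump the versions.
        for key, (version_read, value) in writes.items():
            self._data[key] = (version_read + 1, value)


store = Store()
store.commit({"stock": (0, 10)})

# Two transactions read the same object at the same version.
version_a, stock_a = store.read("stock")
version_b, stock_b = store.read("stock")

# A saves first and succeeds.
store.commit({"stock": (version_a, stock_a - 1)})

# B's read is now stale, so its save is rejected and nothing is changed.
try:
    store.commit({"stock": (version_b, stock_b - 3), "orders": (0, ["B"])})
    outcome = "committed"
except StaleObjectError:
    outcome = "rejected"

print(outcome, store.read("stock"), store.read("orders"))
# prints: rejected (2, 9) (0, None)
```

Transaction B has to read the data again and retry, which matches the rule that the transaction is not valid until the data is re-read. Pessimistic control would instead block B's update while A held a lock.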

Considerations

* What programming languages does the database support?
* Object relationships - Are they bi-directional?
* Work Group Support - Sharing databases and locking.
* Schema Evolution - How do you tell the database about schema changes? This includes changes to the definition of a class such as its attributes or behavior, changes to inheritance, and adding, deleting, or renaming a class. Do classes need to be backward compatible?
* How does the database search using polymorphism? Can it return all car objects, of whatever subclass, that are made by a specific manufacturer?
* How are the database APIs used? Is the database transparent to the applications?
* Tools - Tools are important for product development and support. The database may support some tools or integrate with some. Tools to consider include development tools, testing tools, debugging tools, data modeling tools, and data maintenance tools.
* Object Models - Consider which object modeling approach will be used and whether the object modeling tools integrate with the database (or whether they need to).
* Does the database store object methods or rebuild the methods from classes when required? If it does store methods, they can be executed in database processes without being stored or recreated in application memory. Non-object-oriented programs may be able to access the output of the stored methods.
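
The polymorphism question above (all cars from one manufacturer, whatever their subclass) can be made concrete with a small sketch. It also shows iterator-style access, where objects are examined one at a time rather than loaded up front. The class hierarchy, the `extent` and `made_by` helpers, and the sample data are all assumptions for illustration; a real product would expose its own query facility.

```python
class Vehicle:
    def __init__(self, maker):
        self.maker = maker


class Car(Vehicle):
    pass


class SportsCar(Car):
    pass


class Truck(Vehicle):
    pass


def extent(objects, cls):
    """Lazily yield every stored object that is a cls or a subclass of cls."""
    for obj in objects:
        if isinstance(obj, cls):
            yield obj


def made_by(objects, cls, maker):
    """Polymorphic query: instances of cls (or subclasses) from one maker."""
    return (obj for obj in extent(objects, cls) if obj.maker == maker)


# Stand-in for the database contents (hypothetical data).
stored = [Car("Acme"), SportsCar("Acme"), Truck("Acme"), Car("Globex")]

acme_cars = made_by(stored, Car, "Acme")
first = next(acme_cars)          # only as much is scanned as is needed
names = [type(first).__name__] + [type(c).__name__ for c in acme_cars]
print(names)                     # prints ['Car', 'SportsCar']
```

The query for `Car` returns the `SportsCar` as well, but not the `Truck`, which is the behavior to check for when evaluating a database's polymorphic search.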

Communications Support

Object databases will use one or more of the following methods to exchange data between applications and the database.

* OQL - Object Query Language (OQL) is the standard language for object database communication. Some object databases support it and others do not.
* SQL - Structured Query Language (SQL) is the standard language for relational database communication. Some object databases support it and others do not. It is provided to help prospective customers migrate existing applications from an RDBMS to an ODBMS. Object databases that support SQL treat a row as an object and each column value in that row as an attribute of the object. A table is a collection of objects, and table joins and keys are used to express object relationships.
* Application Programming Interface (API) - Some vendors provide additional classes or programming interfaces that are used to access the database. Access consists of direct data manipulation in the object-oriented language.

The advantage of using a standard interface such as SQL or OQL is that the application is not tied to one specific database. The advantage of using an API is that access may be faster and possibly even transparent to the application; the application may not even know it is running methods or using data in a database. The API is a mixed bag, since it gives some performance advantages at the cost of some flexibility. This loss of flexibility may be mitigated by placing a standard interface between the application and the particular database's API.
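
One way to picture such a standard interface is an adapter that the application owns. The sketch below is illustrative only: `ObjectStore`, `VendorAClient` with its `put_object` and `get_object` calls, `VendorAStore`, and `register_user` are all invented names, and no real vendor API is being described.

```python
from abc import ABC, abstractmethod


class ObjectStore(ABC):
    """Application-owned interface; the only thing application code sees."""

    @abstractmethod
    def save(self, key, obj): ...

    @abstractmethod
    def load(self, key): ...


class VendorAClient:
    """Stand-in for one vendor's proprietary API (entirely made up)."""

    def __init__(self):
        self._objects = {}

    def put_object(self, oid, obj):
        self._objects[oid] = obj

    def get_object(self, oid):
        return self._objects[oid]


class VendorAStore(ObjectStore):
    """Adapter: translates the standard interface into vendor calls."""

    def __init__(self, client):
        self._client = client

    def save(self, key, obj):
        self._client.put_object(key, obj)

    def load(self, key):
        return self._client.get_object(key)


def register_user(store: ObjectStore, name):
    """Application code: depends on ObjectStore, not on any vendor."""
    store.save(name, {"name": name, "active": True})
    return store.load(name)


user = register_user(VendorAStore(VendorAClient()), "alice")
print(user)   # prints {'name': 'alice', 'active': True}
```

Changing vendors then means writing one new adapter class, while `register_user` and the rest of the application stay as they are.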