Showing posts with label Database Server (SQL). Show all posts
Showing posts with label Database Server (SQL). Show all posts

Database Terms

Database Terms

* Aggregation - Objects that are composed of other objects.
* Concurrency - Databases must ensure that data is checked when concurrent access is allowed. Concurrent access means more than one application or thread may be reading or updating the same data at the same time.
* Database Garbage Collection - Garbage collection is the process of destroying objects that are no longer referenced, and freeing the resources those objects used. In Java there is a background process that performs garbage collection. Requires bi-directional object relationships. Determines if the database performs garbage collection on objects that are no longer referenced by the database. This keeps external programs from having to track the use of object pointers.
* DBMS - Database management system.
* Distributed Architecture - Object are sharing in a distributed environment or the entire database may be replicated on multiple computers.
* DML - Data Manipulation Language separate from programming languages (for RDBMS) and used as a means of getting and storing data in the database.
* Encapsulation - Data with method storage. Not all databases support the methods but rely upon the classes defined in the schema to reconstruct the object with its methods.
* Fault tolerance - Features that provide for fault tolerence in the event of a hardware of software failure. Normally transaction processing provides software fault tolerance. Data replication to other servers on the network supports hardware fault tolerance.
* Heterogeneous environment - Cross Platform support - The database may be able to run on various builds of computers and with various operating systems.
* Inheritance - Objects inherit attributes from parent objects.
* JDBC - An application program interface (API). Calls are used to execute SQL operations.
* normalized - Elimination of redundancy in databases so that all columns depend on a primary key.
* Notification - Notification may be active or passive. A passive system can minimally determine if an object has changed state. An active system may provide for an application to be informed when an object is modified.
* Object relationships - Object relationships define association with other objects, and whether objects can detect each other in one direction or two directions. Two way object relationships may allow for garbage collection.
* ODBMS - Object Database management system.
* OQL - Object Query Language, is a data manipulation language for Object Databases although many object databases do not support it. They rely on object class extensions or interfaces for their support.
* Persistence - Databases provide persistance which for object databases means object can be stored between database runs.
* RDBMS - Relational database management system.
* Schema - The data structure of the database.
* SQL - Structured Query Language is a standard language for communication with a relational database management system (RDBMS). Structured Query Language, is a data manipulation language which is a standard for getting and storing data in an RDBMS.
* SQLJ - Supports structured query language (SQL) calls for Java. It consists of a language allowing SQL statements to be embedded in it, a translator, and a runtime model.
* Transaction processing - Some databases may have some form of transaction processing which may support concurrency. Transaction processing will ensure that the entire transaction is made or none of it is made. Transactions support concurrency and data recovery. A data failure will cause a rollback of data

ORDBMS Definition

ORDBMS Definition

An object relational database is also called an object relational database management system (ORDBMS). This system simply puts an object oriented front end on a relational database (RDBMS). When applications interface to this type of database, it will normally interface as though the data is stored as objects. However the system will convert the object information into data tables with rows and colums and handle the data the same as a relational database. Likewise, when the data is retrieved, it must be reassembled from simple data into complex objects.
Performance Constraints

Because the ORDBMS converts data between an object oriented format and RDBMS format, speed performance of the database is degraded substantially. This is due to the additional conversion work the database must do.
ORDBMS Benefits

The main benefit to this type of database lies in the fact that the software to convert the object data between a RDBMS format and object database format is provided. Therefore it is not necessary for programmers to write code to convert between the two formats and database access is easy from an object oriented computer language.

Object Oriented Database Standards

There are several object oriented standards and groups that oversee them.
Groups

* Object Management Group (OMG) - Develops standards to help make object applications to be portable and communicate between each other (interoperability). They have developed the Component Object Request Broker Architecture (CORBA) standard along with object and OODBMS interfaces.
* Object Database Management Group (ODMG) - Created to define standard interfaces for object databases. The interfaces should allow the databases and applications that use them be portable and communicate between each other.

Standards

* CASE Data Interchange Format (CDIF) - Defines standards for tools to use so tyey may be used with various applications such as database servers, application servers, and other tools.
* Portable Common Tool Environment (PCTE) Standard.
* PDES/STEP - Exchange format standard for product model data. Also called the object interface format.
* Object Query Language (OQL)
* Object Definition Language (ODL) - Extension of OMGs CORBA standard.

Object Database Use and Features

Databases provide persistence which for object databases means that objects can be stored between database runs.
Features

The following list of features are capabilities that object databases may support. Object database features include:

* Support of the object oriented language you want to use.
* Support of Object Oriented Concepts.
o Aggregation - Objects that are composed of other objects.
o Encapsulation - Data with method storage. Not all databases support the methods but rely upon the classes defined in the schema to reconstruct the object with its methods.
o Inheritance - Objects inherit attributes from parent objects.
o Polymorphism - Allows two methods to use the same name but have different behavior. Methods for one object can be defined, then the operation specification can be shared with other objects.
* Distributed Architecture - Object are sharing in a distributed environment or the entire database may be replicated on multiple computers.
* Heterogeneous environment - Cross Platform support - The database may be able to run on various builds of computers and with various operating systems.
* Transaction processing - Some databases may have some form of transaction processing which may support concurrency. Transaction processing will ensure that the entire transaction is made or none of it is made. Transactions support concurrency and data recovery. A data failure will cause a rollback of data.
* Concurrency - Databases must ensure that data is checked when concurrent access is allowed. Concurrent access means more than one application or thread may be reading or updating the same data at the same time. This may also be called two phase commit where two processes may work on the same object at the same time. This may use data locking for reads or writes. Some methods of concurrency control include:
o Pessimistic control - Whan one or more processes are reading, updated to the data cannot be made.
o Multiread - Updates are not blocked. Data must be consistant when the transaction was begun. In other words, if the read was done, and the data was changed by another process before the data is saved, the transaction is not valid until the data is read again.
* Object relationships - Object relationships define association with other objects, and whether objects can detect each other in one direction or two directions. Two way object relationships may allow for garbage collection. The best option is two way relationships.
* Database Garbage Collection - Requires bi-directional object relationships. Determines if the database performs garbage collection on objects that are no longer referenced by the database. This keeps external programs from having to track the use of object pointers.
* Relationship cardinality - Supported relationships may include any combination of:
o One to one.
o One to many.
o Many to one.
o Many to many.

The database should support all these.

* Transparent persistence - Consists of direct data manipulation using object oriented language. Many times a persistent capable class or persistent interface is used to implement persistence. This may be considered by some vendors to be transparency. If an interface is used, an intermediate interface may be used to help insulate calls from the particular database, thereby allowing the customer to more easily change database vendors later.
* Database interface methods - These may include SQL, OQL, and some application programming interface (API). See the section under "Communication Support", below.
* Database Integrity - There are two types:
o Structural database integrity ensures database contents are consistent with the database schema. Referential Integrity requires bi-directional object relationships to ensure objects do not contain references to deleted objects.
o Logical integrity - The logical properties of the data are correct. The data has the correct values consistently and concurrent access does not cause incorrect values to be set.
* Object Versioning - A single object represented by multiple versions. Two types are:
o Linear - Prior versions of the object are saved as the object is changed.
o Branch - Multiple users may update the object concurrently.
* Notification - Notification may be active or passive. A passive system can minimally determine if an object has changed state. An active system may provide for an application to be informed when an object is modified.
* Indexing - Additional indexing may be provided to enhance data retrieval efficiency. Hashing and b-trees may be used.
* Security - Data storage and/or transmission encryption may be supported by some databases. Also different authentication methods and levels for access to the database may be provided by various products.
* Archiving and data recovery
* Fault tolerance - Features that provide for fault tolerence in the event of a hardware of software failure. Normally transaction processing provides software fault tolerance. Data replication to other servers on the network supports hardware fault tolerance.
* Data access - Access is normally done using an iterator to access the data as though objects are collections. This way the objects are not required to be loaded into memory before the desired object is obtained.
* Sorting - All objects of a given class or parent class may be obtained.
* Tools that can be used with the database.
* Amount of storage.

Method storage- The code that runs in objects and gives them behavior is stored in the database.
Considerations

* What programming languages does the database support?
* Object relationships - Are they bi-directional?
* Work Group Support - Sharing databases and locking.
* Schema Evolution - How do you tell the database about schema changes? This includes changes to the definition of a class such as attributes or behavior, changes to inheritance, adding, deleting, or renaming a class. Do classes need to be backward compatible?
* How do databases search using polymorphism? Can it give all cars objects that are made by a specific manufacturer?
* How are the database APIs used? Is the database transparent to the applications?
* Tools - Tools are important for product development and support. The database may support some tools or integrate with some. Tools that should be a concern include development tools, testing tools, debugging tools, data modeling tools, and data maintenance tools.
* Object Models - The object modeling to be used and whether the object modeling tools integrate with the database (or whether they should) should be considered.
* Does the database store object methods or rebuild the methods from classes when required? If it does store methods, methods can be executed in database processes without storing the method or recreating the method in the application memory. Non object oriented programs may be able to access the output of the stored methods.

Communications Support

Object databases will use one or more of the following methods to exchange data between applications and the database.

* OQL - The standard language for object database communication is object query language (OQL). Some object databases support it and others do not.
* SQL - The standard language for relational database communication is structured query language (SQL). Some object databases support it and others do not. This is provided to help prospective customers migrate current applications from RDBMS to ODBMS. The object databases use SQL by considering a row an object, and each unit in a column to be an attribute of an object. The table is a collection of objects. The table joins and keys are used to create object relationships.
* Application Programming Interface (API) - Some vendors provide additional classes or programming interfaces that are used to access the database. It consists of direct data manipulation using object oriented language.

The advantage to using a standard interface such as SQL or OQL is that the application is not tied to one specific database. The advantage to using an API is that the access may be faster and possibly even transparent to the application. The application may not even know it is running methods or using data on a database. The API is a mixed bag since it gives some performance advantages and perhaps a little less flexibility. This loss of flexibility may be mitigated by providing a standard interface between the application and the particular database's API.

RDBMS Definition

RDBMS Definition

Relational databases store data in tables (relations) that are two dimensional. The tables have rows (records or objects) and columns (fields or attributes). Data items at an intersection of a row and a column are called a cell and consist of attribute values. Data stored is simple data such as integers, real numbers or string values. Multiple values may not be stored in one cell. Relational database tables are "normalized" so data is not repeated more often than necessary. All table columns depend on a primary key (a unique value in the column) to identify the column. Once the specific column is identified, data from one or more rows associated with that column may be obtained or changed.

Relational databases are sets of tables. One table file is not a relational database. A relational database server is not the same as a relational database. A relational database can be a file with sets of tables. The relational database server includes the ability to service requestesto get or change data from remote clients.

Relational database servers use Structured Query Language (SQL), as a data manipulation language to interface between itself and the clients. SQL is the standard for getting and storing data in an RDBMS. For information about SQL, see the "Beginner's SQL Guide".

Relational database servers provide:

* Data Management
* Transaction processing
* Data integrity - Provides for multiple access at the same time (concurrency) between multiple processes/users. This is done so data is not displayed nor saved in a fashion where one change is lost. Various locking mechanisms are used to support this.
* Data backup and recovery.
* Data security - Provides for user authentication, and levels of data access privileges.

Primary Keys

In relational databases, the data is not arranged in any particular order in tables. The data in tables requires keys for identification of rows. Each table has rows and columns. Sets of values in a row may describe a particular item such as customers. Consider the following table:

Object Oriented Databases

Object Oriented Databases

Object oriented databases are also called Object Database Management Systems (ODBMS). Object databases store objects rather than data such as integers, strings or real numbers. Objects are used in object oriented languages such as Smalltalk, C++, Java, and others. Objects basically consist of the following:

* Attributes - Attributes are data which defines the characteristics of an object. This data may be simple such as integers, strings, and real numbers or it may be a reference to a complex object.
* Methods - Methods define the behavior of an object and are what was formally called procedures or functions.

Therefore objects contain both executable code and data. There are other characteristics of objects such as whether methods or data can be accessed from outside the object. We don't consider this here, to keep the definition simple and to apply it to what an object database is. One other term worth mentioning is classes. Classes are used in object oriented programming to define the data and methods the object will contain. The class is like a template to the object. The class does not itself contain data or methods but defines the data and methods contained in the object. The class is used to create (instantiate) the object. Classes may be used in object databases to recreate parts of the object that may not actually be stored in the database. Methods may not be stored in the database and may be recreated by using a class.
Comparison to Relational Databases

Relational databases store data in tables that are two dimensional. The tables have rows and columns. Relational database tables are "normalized" so data is not repeated more often than necessary. All table columns depend on a primary key (a unique value in the column) to identify the column. Once the specific column is identified, data from one or more rows associated with that column may be obtained or changed.

To put objects into relational databases, they must be described in terms of simple string, integer, or real number data. For instance in the case of an airplane. The wing may be placed in one table with rows and columns describing its dimensions and characteristics. The fusalage may be in another table, the propeller in another table, tires, and so on.

Breaking complex information out into simple data takes time and is labor intensive. Code must be written to accomplish this task.
Object Persistence

With traditional databases, data manipulated by the application is transient and data in the database is persisted (Stored on a permanent storage device). In object databases, the application can manipulate both transient and persisted data.
When to Use Object Databases

Object databases should be used when there is complex data and/or complex data relationships. This includes a many to many object relationship. Object databases should not be used when there would be few join tables and there are large volumes of simple transactional data.

Object databases work well with:

* CAS Applications (CASE-computer aided software engineering, CAD-computer aided design, CAM-computer aided manufacture)
* Multimedia Applications
* Object projects that change over time.
* Commerce

Object Database Advantages over RDBMS

* Objects don't require assembly and disassembly saving coding time and execution time to assemble or disassemble objects.
* Reduced paging
* Easier navigation
* Better concurrency control - A hierarchy of objects may be locked.
* Data model is based on the real world.
* Works well for distributed architectures.
* Less code required when applications are object oriented.

Object Database Disadvantages compared to RDBMS

* Lower efficiency when data is simple and relationships are simple.
* Relational tables are simpler.
* Late binding may slow access speed.
* More user tools exist for RDBMS.
* Standards for RDBMS are more stable.
* Support for RDBMS is more certain and change is less likely to be required.

ODBMS Standards

* Object Data Management Group
* Object Database Standard ODM6.2.0
* Object Query Language
* OQL support of SQL92

How Data is Stored

Two basic methods are used to store objects by different database vendors.

* Each object has a unique ID and is defined as a subclass of a base class, using inheritance to determine attributes.
* Virtual memory mapping is used for object storage and management.

Data transfers are either done on a per object basis or on a per page (normally 4K) basis.

Server Types

Server Types

There are several kinds of databases which include:

* Flat file databases
* Relational databases
* Object databases
* Object relational databases

Flat files are simply files with a table of information which may be seperated by delimeters such as commas, colons, or semi-colons. Relational databases consist of several related tables of simple data. The tables are composed of rows and columns. Object databases store data in an object form rather than in tables. They store attributes and class information, but sometimes they also store and methods (behavior) in the database. Object relational databases are relational databases with data stored in tables, but they have a front end that converts objects to data and data to objects, making it seem to the application that objects are being stored.

Database servers include both a server program that serves remote clients and manages the database. They may use some means of standard communication between client and server to allow management of the data such as structured query language (SQL) for relational database servers.

The most popular databases today are Relational Database Management Systems (RDBMS). However, object database servers may someday overtake the relational database servers.

Relational Databases and Objects

If relational databases are used to store objects, the object must first be disassembled into parts, normalized, and placed in tables. This can take some time to do, and be a labor intensive process required for writing the code. To use the object, it must be reassembled.

Many relational databases are run on one single server and do not use a distributed architecture.

Object Database Servers

Object database servers may use an Object Query Language (OQL) as a standard language for communication. They may use an application programming interface (API) to allow the application to control the data or they may use both the API and OQL.

Current and Future Trends

Relational databases are still the most popular database in use today. There is good reason for this. They are easy to use and are normally efficient.

However as programming has changed, tools related to those changes must also change. Object oriented programming is becomming much more popular and as that occurs a more practical tool for long term storage of data is in demand. This tool must interface easily to the object oriented language in question. It must also be a standard tool so users are not tied to specific vendors amd should have a standard way of exchanging information between applications and the database. OQL was developed for this purpose, but it does not appear to be widely supported yet by object oriented database vendors.

Although object databases were first written many years ago, since they have not yet become popular, it appears that the market is not stable. There are several object oriented database vendors, and it is difficult to tell who will be in the market for the long haul. Therefore, I believe the purchase of an object oriented database is somewhat of a risk. This risk may be somewhat mitigated by the fact that programs can be written in object oriented language to isolate the programs from specific object database products.

If the benefits of the object oriented database are large enough for the particular application they are used for and the specific organization considering them, the risks are likely to be worth taking. However, if the benefits are marginal, it may be worth waiting another year or two for more market stability and uniformity