paged books have been written during the last four decades. This article cannot and does not intend to compete with any of these books, but rather tries to explain different database models in a non-academic style. I will discuss the following database models:
Flat Files

Simply put, one can imagine a flat file database as one single large table. A good way to visualize this is to think of a spreadsheet, which can have only one meaningful table structure at a time. Data access happens only sequentially; random access is not supported. Queries are usually slow, because the whole file has to be scanned to locate the data. Although data access can be sped up by sorting the data, doing so makes the data more vulnerable to sorting errors. In addition, you'll face the following problems.
As a result, one concludes that flat file databases are only suitable for small, self-evident amounts of data.

Hierarchical Database Model

Hierarchical databases (and network databases) were the predecessors of the relational database model. Today these models are hardly used in commercial applications. IBM's IMS (Information Management System) is the most prominent representative of the hierarchical model. It is mostly run on older mainframe systems.

This model can be described as a set of flat files that are linked together in a tree structure. It is typically diagrammed as an inverted tree. The original concept for this model represents the data as a hierarchical set of parent/child relations. Data is organized in a tree of one or more groups of fields, called segments. Every node of the tree is a segment. Each child segment can be related to only one parent segment, and a child segment can be accessed only via its parent segment. This means that a child which logically belongs to more than one parent has to be stored once under each of them, so many-to-many (m:n) relations result in data redundancy. To solve this problem, the data is stored in only one place and is referenced through links or physical pointers.

When a user accesses data, he starts at the root level and works his way down through the tree to the desired target data. That is why a user must be very familiar with the data structure of the whole database. But once he knows the structure, data retrieval can be very fast. Another advantage is built-in referential integrity, which is automatically enforced. However, because links between the segments are hard-coded into the database, this model is inflexible to changes in the data structure. Any change requires substantial programming effort, which in most cases means substantial changes not only in the database, but also in the application.

Network Database Model

The network database model is an improvement to the hierarchical model.
In fact, it was developed to address some of the weaknesses of the hierarchical model. It was formally standardized as the CODASYL DBTG (Conference on Data Systems Languages, Data Base Task Group) model in 1971 and is based on mathematical set theory. At its core, the basic modeling construct is the set. A set consists of an owner record, the set name, and the member record. A member can play this role in more than one set at the same time; therefore it can have multiple parents. Likewise, an owner type can be owner or member in one or more other sets. This means the network model allows multiple paths between segments. This is a valuable improvement on relationships, but it can make a database structure very complex.

How does data access happen? By working through the relevant set structures. A user need not work his way down through root tables, but can retrieve data by starting at any node and working through the related sets. This provides fast data access and allows the creation of more complex queries than could be created with the hierarchical model. But, once again, the disadvantage is that the user must be familiar with the physical data structure. Also, it is not easy to change the database structure without affecting the application, because if you change a set structure you need to change all references to that structure within the application. Although an improvement to the hierarchical model, this model was not believed to be the end of the line. Today the network database model is obsolete for practical purposes.

Relational Database Model

The theory behind the relational database model will not be discussed in this article; only the differences to the other models will be pointed out. I will discuss the relational model along with some set theory and relational algebra basics in a different set of articles in the near future, if they let me :).
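As a baseline for that comparison, here is a minimal Python sketch of the navigational access style of the two predecessor models described above. It is only an illustration, not how IMS or a CODASYL system is actually implemented; the names `Segment`, `Record`, `find_from_root`, and `connect`, as well as the sample data, are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Segment:
    """Hierarchical model: every segment has exactly one parent (None = root)."""
    name: str
    parent: "Segment | None" = None
    children: "list[Segment]" = field(default_factory=list)

    def add_child(self, name: str) -> "Segment":
        child = Segment(name, parent=self)
        self.children.append(child)
        return child


def find_from_root(root: Segment, path: list[str]) -> Segment:
    """Access starts at the root and follows a path the caller must know."""
    node = root
    for name in path:
        node = next(c for c in node.children if c.name == name)
    return node


@dataclass(eq=False)
class Record:
    """Network model: a record may be a member of several named sets at once."""
    name: str
    owns: "dict[str, list[Record]]" = field(default_factory=dict)  # set name -> members
    member_of: "dict[str, Record]" = field(default_factory=dict)   # set name -> owner


def connect(set_name: str, owner: Record, member: Record) -> None:
    """Insert `member` into the set `set_name` owned by `owner`."""
    owner.owns.setdefault(set_name, []).append(member)
    member.member_of[set_name] = owner


# Hierarchical: the employee is reachable only via company -> department.
company = Segment("Company")
sales = company.add_child("Sales")
sales.add_child("Alice")
print(find_from_root(company, ["Sales", "Alice"]).name)          # Alice

# Network: the same employee is a member of two sets with different owners.
dept, project, alice = Record("Sales"), Record("Apollo"), Record("Alice")
connect("works_in", dept, alice)
connect("assigned_to", project, alice)

# Start at any record and walk the sets: from Alice to both of her owners ...
print(sorted(owner.name for owner in alice.member_of.values()))  # ['Apollo', 'Sales']
# ... or from the project down to its members.
print([m.name for m in project.owns["assigned_to"]])             # ['Alice']
```

In both cases the application code has to know the path or the set names, which is exactly the coupling between application and data structure that the text criticizes.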
In the relational model, the logical design is separated from the physical. Queries against a Relational Database Management System (RDBMS) are based solely on these logical relations. Executing a query doesn't require predefined paths such as pointers. Changes to the database structure are fairly simple to implement. The core concept of this model is a two-dimensional table consisting of rows and columns. Because the data is organized in tables, the structure can be changed without changing the accessing application. This differs from its predecessors, where the application usually had to be changed when the data structure changed.

The relational database model knows no hierarchies within its tables. Each table can be accessed directly and can potentially be linked to any other table. There are no hard-coded, predefined paths in the data. The primary key / foreign key construct of relational databases is based on logical, not physical, links.

Another advantage of the relational model is that it is based on a solid foundation of theory. The inventor, E. F. Codd, a mathematician by profession, defined what a relational database is and what a system needs in order to call itself a relational database 1), 2). This model is firmly based on the mathematical theories of sets and first-order predicate logic. Even the name is derived from the term relation, which is commonly used in set theory. The name is not derived from the ability to establish relations among the tables of a relational database.

Object-Relational Model

Also called the post-relational model or extended relational model, this model addresses several weaknesses of the relational model, the most significant of which is the inability to handle BLOBs. BLOBs, LOBs or Binary Large Objects are complex data types like time series, geospatial data, video files, audio files, emails, or directory structures.
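To make this a bit more tangible, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for a relational DBMS (an assumption for illustration; the article does not name a product). The tables `customer` and `attachment` and their columns are hypothetical. The sketch shows two things: the link between the tables is purely logical (a matching key value, checked by the database), and a BLOB column comes back as an opaque byte string that the query language cannot look inside. How and where a given product physically stores those bytes differs and is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE attachment (
        attachment_id INTEGER PRIMARY KEY,
        customer_id   INTEGER NOT NULL REFERENCES customer(customer_id),
        payload       BLOB NOT NULL
    );
""")

conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
conn.execute(
    "INSERT INTO attachment VALUES (10, 1, ?)",
    (b"\x00\x01fake-audio-bytes",),  # made-up bytes standing in for a media file
)

# The join names no access path; it only states which key values must match.
name, size, payload = conn.execute("""
    SELECT c.name, length(a.payload), a.payload
    FROM customer AS c
    JOIN attachment AS a ON a.customer_id = c.customer_id
    WHERE c.name = 'Alice'
""").fetchone()
print(name, size, type(payload).__name__)

# The logical link is still checked: an unknown customer_id is rejected.
try:
    conn.execute("INSERT INTO attachment VALUES (11, 99, ?)", (b"x",))
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print("foreign key violation rejected:", fk_rejected)
```

The atomic columns (`name`, `customer_id`) can be filtered and joined freely, while the database can do little more with `payload` than report its length and hand the bytes back.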
An object-relational database system encapsulates methods with data structures and can therefore execute analytical or complex data manipulation operations.

In its simplest definition, data is a chain of 0s and 1s that are ordered in a certain manner. Traditional DBMSs were developed for, and are therefore optimized for, accessing small data elements like numbers or short strings. These data are atomic; that is, they cannot be cut down further into smaller pieces. In contrast, BLOBs are large, non-atomic data. They can have several parts and subparts. That is why they are difficult to represent in an RDBMS. Many relational database systems do offer support for storing BLOBs. But in fact, they store these data outside the database and reference them via pointers. These pointers allow the DBMS to search for BLOBs, but the manipulation itself happens through conventional I/O operations.

Object-Oriented Model

According to Rao (1994), "The object-oriented database (OODB) paradigm is the combination of object-oriented programming language (OOPL) systems and persistent systems. The power of the OODB comes from the seamless treatment of both persistent data, as found in databases, and transient data, as found in executing programs."

To a certain degree, one might think that the object-oriented model is a step forward into the past, because its design resembles that of hierarchical databases. In general, anything that can be manipulated is called an object. As in OO programming, objects inherit characteristics of their class and can have custom properties and methods. The hierarchical structure of classes and subclasses replaces the relational concept of atomic data types. As with OO programming, the object-oriented approach tries to bring OO characteristics like classes, inheritance, and encapsulation to database systems, making a database in fact a data store.
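The idea of treating persistent and transient data alike can be illustrated with a small Python sketch. This is a toy, not an object database: `ToyObjectStore`, `Instrument`, and `Bond` are hypothetical names, and an in-memory dictionary of JSON strings stands in for real storage. What it tries to show is that an object comes back from the store as an instance of its original class, with its inherited and overridden methods intact.

```python
import json


class Instrument:
    def __init__(self, name, notional):
        self.name = name
        self.notional = notional

    def risk_weight(self):
        return 1.0

    def exposure(self):
        return self.notional * self.risk_weight()


class Bond(Instrument):
    """Subclass: inherits exposure() but overrides the risk weight."""

    def risk_weight(self):
        return 0.5


class ToyObjectStore:
    """Hypothetical store that keeps each object's class name next to its state."""

    classes = {"Instrument": Instrument, "Bond": Bond}

    def __init__(self):
        self._data = {}  # key -> JSON string, standing in for disk storage

    def save(self, key, obj):
        record = {"class": type(obj).__name__, "state": vars(obj)}
        self._data[key] = json.dumps(record)

    def load(self, key):
        record = json.loads(self._data[key])
        cls = self.classes[record["class"]]
        obj = cls.__new__(cls)               # rebuild without calling __init__
        obj.__dict__.update(record["state"])
        return obj


store = ToyObjectStore()
store.save("pos-1", Bond("Sample bond", 1000))

restored = store.load("pos-1")
print(type(restored).__name__, restored.exposure())  # Bond 500.0
```

Note that the application code itself defines the classes and methods; the store only keeps state and class identity, which is one way to read the remark that the database becomes a data store.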
The developer of such a system is responsible for implementing methods and properties to handle the data in the database from within his object-oriented application. There is no longer a strict distinction between application and database. This approach makes object DBMSs an interesting and sometimes superior alternative to RDBMSs when complex relationships between data are essential. An example of such an application might be current portfolio risk management systems. However, these systems lack the common, solid base of theory that Codd provided for relational databases. There is a model proposed by the Object Management Group (OMG) that could be viewed as a de facto standard, but the OMG can only advise and is not a standards body like ANSI.

Other Models

For the sake of "completeness", what now follows is a list of other models without further detailed explanation:
Conclusion

Well, no real conclusion. I hope you are with me so far and have seen that there (still) is a world outside the relational model,