Home
IDS 410
Logical database design is the process of transforming the conceptual data model (Ch3 & 4) into a logical data model. Although there are other data models, the relational data model has been emphasized for two reasons:
The objective of logical database design is to translate the conceptual design (which represents an organization's requirements for data) into a logical database design that can be implemented on a chosen database management system.
The relational data model was first introduced in 1970 by E. F. Codd, then of IBM (Codd, 1970).
The relational data model represents data in the form of tables, and consists of the following three components:
Relation: A named, two-dimensional table of data. Each relation (or table) consists of a set of named columns and an arbitrary number of unnamed rows.
An attribute is a named column in a relation. Each row of a relation corresponds to a record that contains data (attribute) values for a single entity.
It is important to note that the term relation refers to structure and not content. In other words, any particular collection of rows and columns (records and attributes) is an instance of the relation (table) that holds it. We can express the structure of a relation by a shorthand notation, thus:
EMPLOYEE(Emp_ID,Name,Dept,Salary)
Primary Key: An attribute (or combination of attributes) that uniquely identifies each row in a relation. We designate a primary key by underlining the attribute name.
(The concept of primary key is related to the term "identifier" defined in Ch3.) The same attributes (or attributes) indicated as an entity's identifier in an E-R diagram may be the same attributes that compose the primary key in the relation representing that entity. There are, however, exceptions:
These situations are explored later.
Composite Key: A primary key that consists of more than one attribute.
Foreign Key: An attribute (possibly composite) in a relation of a database that serves as the primary key of another relation in that same database.
We have defined relations as two-dimensional tables of data. However, not all tables are relations. Relations have several properties that distinguish them from nonrelational tables:
The relational model includes several types of constraints, or business rules, whose purpose is to facilitate maintaining the accuracy and integrity of data in the database. The major types of integrity constraints are:
All the values that appear in a column of a relation must be taken from the same domain.
A domain is a set of values that may be assigned to an attribute. A domain specification usually consists of the following components: domain name, meaning, data type, size (or length), and allowable values or allowable range (if applicable).
The entity integrity rule is designed to assure that every relation has a primary key, and that the data values for that primary key are all valid. In particular, it guarantees that every primary key attribute is non-null. The entity integrity rule states that no primary key attribute (or component of a primary key attribute) may be null.
Null: A value that may be assigned to an attribute when no other value applies or when the applicable value is unknown. (In reality, a null is not a value but rather the absence of a value.)
Referential Integrity Constraint: Either each foreign key value must match a primary key value in another relationship or the foreign key value must be null. This rule maintains consistency among the rows of two (related) relations.
How do we know if a foreign key is allowed to be null?
Let's consider two relations, CUSTOMER and ORDER. We know that a customer can place an order and that an order must have been place by one and only one customer. We also know that the primary key of the CUSTOMER relation will be a foreign key in the ORDER relation.
Now, since every order has to have been placed by a customer, then the CUSTOMER foreign key in ORDER cannot be null. If the minimum cardinality had been zero, then the foreign key could be null.
Whether a foreign key can be null must be specified as the property of the foreign key attribute when the database is defined.
But what happens to order data if we choose to delete a customer who has submitted order? Three choices are possible:
We will see how these choices are implement in SQL later.
There are various techniques for defining and enforcing action assertions (or rules). These are discussed later.
Well-Structured Relation: A relation that contains minimal redundancy and allows users to insert, modify, and delete the rows in a table without errors or inconsistencies.
Redundancies in a table may result in errors or inconsistencies (called anomalies) when a user attempts to update the data in the table.
Anomaly: An error or inconsistency that may result when a user attempts to update a table that contains redundant data. There are three types of anomalies: insertion anomaly, deletion anomaly, and modification anomaly.
When a regular entity type has a composite attribute, only the simple component attribute of the composite attribute are included in the new relation. For example, the Customer_Address attribute of the regular entity type CUSTOMER would be mapped (decomposed) to its simple components: Street, City, State and Zip in the corresponding relation.
When a regular attribute entity type contains a multivalued attribute, two new relations (rather than one) are created. The first relation contains all of the attributes of the entity type except the multivalued attribute. The second relation contains two attributes that form the primary key of the second relation. The first of these attributes is the primary key from the first relation, which becomes a foreign key in the second relation. The second is the multivalued attribute. The name of the second attribute should capture the meaning of the multivalued attribute.
(The fact that the second relation contains no nonkey attributes, or descriptors, provide an opportunity to suggest to users that new attributes be added to the relation.)
We recall that weak entities do not have complete identifiers, but must have an attribute called a partial identifier that permits distinguishing the various occurrences of the weak entity for each owner entity instance.
The procedure for representing relationships depends on both the degree of the relationship (unary, binary, ternary) and the cardinalities of the relationship.
Binary one-to-one relationships can be viewed as a special case of one-to-many relationships; there are two steps required in mapping such as a relationship
In a 1:1 relationship, the association in one direction is nearly always optional one, while the association in the other direction is mandatory one. We should include the foreign key of the entity type that has the mandatory participation in the 1:1 relationship in the relation on the optional side of the relationship . This approach will avoid the need to store null values in the foreign key attribute. Any attribute associated with the relationship itself is also included in the same relation as the foreign key. (See example below.)
Let's consider, for example, a 1:1 relationship between NURSE (mandatory side) and WARD (optional side) called In_charge. A Date_Assigned attribute will be neither an attribute of NURSE nor of WARD but of the relationship In_charge. So, we include Nurse_ID as a foreign key attribute in the WARD relation. We also include Date_Assigned in the WARD relation. This also underscores this fact that a non-M:N relationship could have an attribute.
Unary relationships are also called recursive relationships.
Recursive Foreign Key: A foreign key in a relation that references the primary key values of that same relation.
We recall from Ch3 that it is recommended that we convert a ternary relationship to an associative entity in order to represent participation constraints more accurately.
To map an associative entity type that links three regular entity types, we create a new associative relation. The default primary key consists of the three primary key attributes for the participating entity types (in some cases additional attributes are required to form a unique primary key). These attributes then serve as foreign keys that reference the individual primary keys of the participating entity types.
The foregoing apply to n-nary relationships.
The relational data model (relational schema) does not yet directly support supertype/subtype relationships. There are however various strategies that database designers can use to represent those relationships with the relational data model (Chouinard, 1989). We will use the following strategy to map supertype/subtype relationships.
Normalization is a formal process for deciding which attributes should be grouped together in a relation. So far, we have used "common sense" to group attributes into entity types during conceptual data modeling, and we subsequently mapped the E-R diagrams developed in the conceptual phase into relations (relational schema) in the logical phase. Before proceeding with physical design however, we need a tool to validate and improve our logical design, so that it satisfies certain constraints that avoid unnecessary duplication of data.
Normalization: The process of decomposing relations with anomalies to produce smaller, well-structured relations.
Normalization can be accomplished and understood in stages, each of which corresponds to a normal form.
Normal Form: A state of a relation that results from applying simple rules regarding functional dependencies (or relationships between attributes) to that relation.
There are six normal forms:
Normalization is based on the analysis of functional dependencies.
Functional Dependency: A constraint between two attributes or two sets of attributes. For any relationship R, attribute B is functionally dependent on attribute A if, for every valid instance of A, that value of A uniquely determines the value of B (Dutka and Hanson, 1989).
The functional dependency of B on A is represented by an arrow (A --> B). A attribute may be functionally dependent on two (or more) attributes.
Examples:
Determinant: The attribute on the left-side of the arrow in a functional dependency.
Candidate Key: An attribute, or combination of attributes, that uniquely identifies a row in a relation.
A candidate key must have the following properties:
The relationship between determinants and candidate keys can be summarized thus:
First Normal Form (1NF): A relation that contains no multivalued attributes.
Second Normal Form (2NF): A relation in first normal form in which every nonkey attribute is fully functionally dependent on the primary key.
A relation that is in first normal form will be in second normal form if and only if any one of the following conditions applies:
Partial Functional Dependency: A functional dependency in which one or more nonkey attributes are functionally dependent on part (but not all) of the primary key.
Third Normal Form (3NF): A relation that is in second normal form and has no transitive dependencies present.
Transitive Dependency: A functional dependency between two (or more) nonkey attributes.
As a part of the logical design process, normalized relations may have been created from a number of separate E-R diagram and (possibly) other user views. Some of the relations may be redundant; that is, they may refer to the same entities. If so, we should merge those relations (also called view integration). An understanding of how to merge relations is important for three reasons:
Synonyms: Two (or more) attributes having different names but the same meaning, as when they describe the same characteristics of an entity.
When merging relations containing synonyms, we should obtain agreement (if possible) from users on a single, standardized name for the attribute and eliminate any other synonyms.
Often, there is a need to allow some database users to refer to the same data by different names.
Alias: An alternative name used to refer to an attribute.
Homonyms: An attribute that may have more than one meaning.
When two 3NF relations are merged to form a single relation, transitive dependencies may result. We recall that a transitive dependency implies that a relationship in not in 3NF. We create a 3NF relation by removing the transitive dependency.
These relationships may be hidden in user views or relations. The task of the analyst is to correctly identify these relationships within different user views.
Charles E. Oyibo
IDS :: CBA
:: UIC