Home
IDS 410
E. F. Codd, A Relational Model of Data for Large Shared Data Banks. 1970.
An SQL-based relational database application involves a user interface, a set of tables in the database, and a relational database management system (RDBMS) with an SQL capability.
A relational DBMS (RDBMS) is a data management system that implements a relational data model, one where data are stored in a collection of tables, and the data relationships are represented by common values in related tables, not links.
The original purposes of the SQL standard were:
The benefits of using a standardized relational language include the following:
Catalog: A set of schemas that, when put together, constitute a description of a database.
Schema: That structure which contains descriptions of objects created by a user, such as base tables, views, and constraints, as part of the database.
SQL commands can be classified into three types:
Data Definition Language (DDL): Those commands used to define a database, including creating, altering, and dropping tables, and establishing constraints. [Used in the physical design and maintenance phases of the database development process]
Data Manipulation Language (DML): Those commands used to maintain and query a database, including updating, inserting, modifying, and querying data. [Used in the implementation phase of the database development process]
DML commands may be issued interactively, so that a result is returned immediately following execuation of the statement, or they may be included within programs written in a 3GL language such as C or COBOL. Embedding SQL commands may provide the programmer with more control over timing of report generation, interface appearance, error handling, and database security. This chapter focuses on basic DML commands in interactive (rather than embedded) format.
Data Control Language (DCL): Commands used to control a database, including administering (granting or revoking) priviledges and the committing (saving) of data. [Used in the implementation and maintenance phases of the database development process]
Sample Oracle8 Data Types
The basis syntax to create a database is:
CREATE SCHEMA database_name;
AUTHORIZATION owner_userid;
| CREATE SCHEMA |
|
| CREATE TABLE |
|
| CREATE VIEW |
|
| DROP SCHEMA |
|
| ALTER SCHEMA |
|
Other CREATE commands:
| CREATE CHARACTER SET | |
| CREATE COLLATION | |
| CREATE TRANSLATION | |
| CREATE ASSERTION | |
| CREATE DOMAIN |
See steps to follow when preparing to create a table:
General syntax of the create table statement used in data definition language:
CREATE TABLE tablename
({column definition [table constraint]}.,..
[ON COMMIT {DELETE | PRESERVE} ROWS]);
where column definition ::=
column_name
{domain name | datatype [(size)]}
[column_constaint_clause...]
[default value]
[collate clause]
and table constraint ::=
[CONSTRAINT constraint_name]
Constraint_type [constraint_attributes]
The purpose of a view is to simplify query commands; but a view may also provide valuable data security and significantly enhance programming productivity.
Base table: A table in the relational data model containing the inserted raw data. Base tables correspond to the relations that are identified in the database's conceptual schems.
Dynamic view: A virtual table that is created dynamically upon request by a user. A dynamic view is not a temporary table; rather, its definition is stored in the system catelog and the contents of the view are materialized as a result of an SQL query that uses the view.
Materialized view: Copies or replicas of data based on SQL queries created in the same manner as dynamic views. A materialized view exists as a table and must be synchronized with its associated base tables (refreshed) at appropriate intervals or events.
Consider an invoice that requires data from four tables: CUSTOMER_T, ORDER_T, ORDER_LINE_T, and PRODUCT_T. A (dynamic) view allows us to predefine this association into a single virtual table as part of the database so that if we need an invoice (a view), we do not have to reconstruct the joining of tables. Hence, we can define a view, INVOICE_V, by specifying an SQL query using the SELECT ... FROM ... WHERE construct:
Determine the data elements necessary to create an invoice for a customer. Save this query as a view named INVOICE_V.
CREATE VIEW INVOICE_V AS
SELECT CUSTOMER_T.CUSTOMER_ID, CUSTOMER_ADDRESS,
ORDER_T.ORDER_ID
PRODUCT_T.PRODUCT_ID, ORDERED_QUANTITY,
// and other columns
FROM CUSTOMER_T, ORDER_T, ORDER_LINE_T, PRODUCT_T
WHERE CUSTOMER_T.CUSTOMER_ID = ORDER_T.CUSTOMER_ID
AND ORDER_T.ORDER_ID = ORDER_LINE_T.ORDER_ID
AND PRODUCT_T.PRODUCT_ID = ORDER_LINE_T.PRODUCT_ID
Now we can use our newly-created view INVOICE_V in a query:
Generate an invoice for order number 1004.
SELECT CUSTOMER_ID, CUSTOMER_ADDRESS, PRODUCT_ID, QUANTITY,
// other columns as required
FROM INVOICE_V
WHERE ORDER_ID = 1004;
A view may join multiple tables or views together and may contain derived (or virtual) columns:
What is the total value of orders placed for each furniture product?
CREATE VIEW ORDER_TOTALS_V AS
SELECT PRODUCT_ID PRODUCT, SUM(UNIT_PRICE*QUANTITY) TOTAL
FROM INVOICE_V
GROUPBY PRODUCT_ID;
Idea to Ponder: "A view should be created for every single base table, even if that view is identical to the base table." Discuss the advantages and shortcomings of this position.See p. 270.
List all furniture products that have ever had a UNIT_PRICE over $300.
CREATE VIEW EXPENSIVE_STUFF_V AS
SELECT PRODUCT_ID, PRODUCT_NAME, UNIT_PRICE
FROM PRODUCT_T
WHERE UNIT_PRICE > 300
WITH CHECK OPTION;
We use the WITH CHECK OPTION clause to prevent a update that will remove a row from the view. (The view is designed to hold items that have had a unit price over $300 at least at one time. If we were to reduce the price of an item in the product table without the WITH CHECK OPTION constraint, that item will be removed from that view--obviously, not what we want.)
In Oracle8, the text of all views are stored in DBA_VIEWS. Users with system pridileges can find access information in DBA_VIEWS:
What information is available about the view named EXPENSIVE_STUFF?
SELECT * FROM DBA_VIEWS WHERE VIEW_NAME = 'EXPENSIVE_STUFF_V';
Materialized Views, like the dynamic views we have been discussing, can also be constructed in different ways for various purposes. Tables may be replicated in whole or in part, joined to create the materialized view, and updated on a predetermined time interval or triggered when the table needs to be accessed. The difference, to summarize, is that dynamic views are virtual views; they do not represent actual tables. Materialized views, on the other hand, are actual tables that are based on base tables.
Referential Integrity: An integrity constraint specifying that the value (or existence) of an attribute in one relation depends on the value (or existence) of a primary key in the same or another relation (or be NULL). [See Ch5 Definition.]
The SQL command REFERENCES prevents a foreign key value from being added if it is not already a valid value in the referenced primary key column. Problem: If the value of the primary key changes after we refer to it using a foreign key elsewhere, the connection between the two tables is lost.
Solutions:
Were a customer record to be deleted (rather than updated, as above), ON DELETE RESTRICT, CASCADE, or SET NULL are also available. In addition, there is an ON DELETE SET DEFAULT clause.
ALTER TABLE CUSTOMER_T ADD (TYPE VARCHAR(2));
The ALTER TABLE command may include keywords such as ADD, DROP, or ALTER and allow changing the column's names, datatype, lenght, and constraints.
To remove a table from a database, the owner of the table may use the DROP TABLE command. Views are dropped using the DROP VIEW COMMAND.
Drop the CUSTOMER_T table from its database schema:
DROP TABLE CUSTOMER_T;
Four data manipulation language commands are used in SQL:
While the INSERT, UPDATE, and DELETE commands allow modification of the data in the tables, it is the SELECT command that allows one to query the data contained in the tables.
If entering a value for every column in the table, we use a command like the following. (Note that the data values are ordered in the same order as the columns in the table.
Command: To insert a row of data into a table where value will be inserted for every attribute:
INSERT INTO CUSTOMER_T VALUES
(001, 'Contemporary Casuals', '13 S. Himes St.', 'Chicago', 'IL', 60565);
When data will not be entered into every column in the table, either enter the value NULL for the empty fields or specify those columns to which data are to be added.
To insert a row of data into a table where some attributes will be left null:
INSERT INTO PRODUCT_T (PRODUCT_ID, PRODUCT_NAME, PRODUCT_FINISH, UNIT_PRICE, ON_HAND) VALUES (1, 'End Table', 'Cherry', 175, 8);
In general, the INSERT command places a new row in a table based on values supplied in the statement, copies of one or more rows derived from other database data into the table, or extracts data from one table and inserts them into another.
Populating a table CA_CUSTOMER_T by using a subset of another table CUSTOMER_T that has the same structure:
INSERT INTO CA_CUSTOMER_T
SELECT * FROM CUSTOMER_T
WHERE STATE = 'CA';
In Oracle: SQL * Loader.
To delete rows that meet a certain criterion from the customer table:
DELETE FROM CUSTOMER_T WHERE STATE = 'HI';
To delete all rows from customer table:
DELETE FROM CUSTOMER_T;
SQL will actually eliminate the records selected by a DELETE command. Therefore, always execute a SELECT command first to display the records that would be deleted and visually verify that only the desired rows are included.
To modify the unit price and quantity on hand of a record in the PRODUCT table to 775 and 4, respectively:
UPDATE PRODUCT_T
SET UNIT_PRICE = 775, On_hand = 4
WHERE PRODUCT_ID = 7;
The SET command can also change a value to null; the syntax is SET columnname = NULL.
The internal schema of a relational database can be controlled for processing and storage efficiency. Some techniques used to tune the operational performance of the relational database internal model include:
Objective: To provide rapid random and sequential access to base-table data.
Command: To create an alphabetical index on customer name in the CUSTOMER table.
CREATE INDEX NAME_IDX ON CUSTOMER_T (CUSTOMER_NAME);
We can define an index as unique using the syntax: CREATE UNIQUE INDEX ...
Command: To remove the index on the customer name in the CUSTOMER table.
DROP INDEX NAME_INX;
While the UPDATE, INSERT, and DELETE commands allow modification of the data in the tables, the select command, with its various clauses, allow us to query the data contained in the tables and ask many different questions or ad hoc queries.
Although the basic construction of an SQL command is simple, SQL is a powerful tool that enables one to specify complex data analysis processes.
Note: Because the basic syntax is relatively easy to learn, it is also easy to write SELECT queries that are syntactically correct but do not answer the exact question that is intended. There are a few recommendations here: (1) Before running a query against a large production database, always test queries carefully on a small test set of data to be sure that they are returning the correct results. (2) In addition to checking the query results manually, it is often possible to parse queries into smaller parts, examine the results of these simpler queries, and then recombine them.
Most SQL data retrieval statements include the following three clauses:
The first two are required; the third is necessary when:
QUERY: Which product have a standard price of less than $275?
SELECT PRODUCT_NAME, STANDARD_PRICE FROM PRODUCT_V WHERE STANDARD_PRICE < 275;
Every SELECT statement returns a result table when it executes. Two special key words can be used along with the list of columns to display: DISTINCT and *.
NOTE: The order of the SELECT statement clauses is important. Also, if there is any ambiguity in the SQL command (e.g. if there is a CUSTOMER_ID in both the CUSTOMER_T and ORDER_T tables), we must indicate exactly from which table (or view) the requested data is to come (e.g. CUSTOMER_T.CUSTOMER_ID). Futher yet, when using data created by someone else, we must specify the owner of the table by adding the owner's user ID (e.g. OWNER_ID.CUSTOMER_T.CUSTOMER_ID.)
Query: What is the address of the customer named Home Furnishings? Use an alias, NAME, for customer name.
SELECT CUST.CUSTOMER_NAME AS NAME, CUST.CUSTOMER_ADDRESS
FROM ownerid.CUSTOMER_V CUST
WHERE NAME = 'Home Furnishings';
When using the SELECT statement, the columns can be rearranged so that they will be arranged differently in the result tbale than they were in the original table.
Query: List the unit price, product name, and product ID for all products in the PRODUCT table.
SELECT UNIT_PRICE, PRODUCT_NAME, PRODUCT_ID FROM PRODUCT_T;
Query: What is the total value for each product in inventory?
SELECT PRODUCT_NAME, UNIT_PRICE, ON_HAND, UNIT_PRICE * ON_HAND AS VALUE FROM PRODUCT_T;
Functions in SQL include: COUNT, MIN, MAX, SUM, and AVG.
Query: How many different items were ordered on number 1004?
SELECT COUNT (*)
FROM ORDER_LINE_V
WHERE ORDER_ID = 1004;
Consider the following query:
How many different items were ordered on number
1004? Also, show the Order_ID in the query result.
SELECT ORDER_ID, COUNT (*)
FROM ORDER_LINE_T
WHERE ORDER_ID = 1004;
The above query will generate an error message. The problem is that ORDER_ID
returns a value, 1004, for each of the two rows selected, while COUNT returns
one aggregate value, 2, for the set of rows with ID = 1004.
Query: Alphabetically, what is the first product name in the product table?
SELECT MIN (PRODUCT_NAME) FROM PRODUCT_T;
Query: Alphabetically, what is the last product name in the product table?
SELECT MAX (PRODUCT_NAME) FROM PRODUCT_T;
Wildcards may be used in the WHERE clause where the exact description of the desired item is unknown.
Query: Find all types of tables carried by Pine Valley Furniture
SELECT PRODUCT_ID, PRODUCT_NAME
FROM PRODUCT_T
WHERE PRODUCT_NAME LIKE '%Table';
Query: Find any products with any number of drawers
SELECT PRODUCT_ID, PRODUCT_NAME
FROM PRODUCT_T
WHERE PRODUCT_NAME LIKE '_-Drawer';
Comparison operators: =, >, >=, <, <=, <>, and !=
Query: Which order have been placed since 10/24/98?
SELECT ORDER_ID, ORDER_DATE
FROM ORDER_V
WHERE ORDER_DATE > '24-OCT-98';
Query: What furniture does Pine Valley carry that isn't made of cherry?
SELECT PRODUCT_DESCRIPTION, PRODUCT_FINISH
FROM PRODUCT_T
WHERE PRODUCT_FINISH != 'Cherry';
Boolean operators:
If multiple Boolean operators are used in an SQL statement, NOT is evaluated first, then AND, then OR.
Query: List product name, finish, and unit price for all desks and all tables that cost more than $300 in the product table.
SELECT PRODUCT_NAME, PRODUCT_FINISH, UNIT_PRICE
FROM PRODUCT_T
WHERE (PRODUCT_NAME LIKE ‘%Desk’ OR PRODUCT_NAME
LIKE ‘%Table’) AND UNIT_PRICE > 300;
Notice the parenthesis after WHERE and before AND. We place the parentheses there because we want to override the order of operation (AND before OR) and evaluate the OR expression before the AND expression, otherwise we would get an incorrect result. Convince yourself of this fact.
The comparison operators < and > are used to establish a range of values. The keywords BETWEEN and NOT BETWEEN can also be used.
Query: Which products in the PRODUCT_T have a unit price between $200 and $300?
SELECT PRODUCT_NAME, UNIT_PRICE
FROM PRODUCT_T
WHERE UNIT_PRICE BETWEEN 200 AND 300;
Or, alternatively:
SELECT PRODUCT_NAME, UNIT_PRICE
FROM PRODUCT_T
WHERE UNIT_PRICE > 199 AND UNIT_PRICE < 301;
Query: What are the distinct order numbers included in the ORDER_LINE table?
SELECT DISTINCT ORDER_ID FROM ORDER_LINE_T;
DISTINCT and its counterpart, ALL, can only be used once in a SELECT statement. It comes after SELECT and before any columns or expressions listed. If a SELECT statement projects more than one column (as with the sample below), only rows that are identical for every column will be eliminated.
Query: What are the distinct order numbers included in the ORDER_LINE table?
SELECT DISTINCT ORDER_ID, ORDERED_QTY FROM ORDER_LINE_T;
To match a list of values, consider using IN.
Query: List all customers who live in warmer states
SELECT CUSTOMER_NAME, CITY, STATE
FROM CUSTOMER_T
WHERE STATE IN ('FL', 'TX', 'CA', 'HI');
Now we extend the previous example by adding an ORDER BY clause
Query: List customer, city, and state for all customers in the customer table whose address is Florida, Texas, California, or Hawaii. List the customers alphabetically by state, and alphabetically by customer within each state.
SELECT CUSTOMER_NAME, CITY, STATE
FROM CUSTOMER_T
WHERE STATE IN ('FL', 'TX', 'CA', 'HI')
ORDER BY STATE, CUSTOMER_NAME;
In the result for this SQL statement, states are alphabetized first, then customer names. If sorting from high to low, we use the DESC keyword at the column used to sort. Furthermore, SQL-92 stipilates the null values (for the column used to sort) be placed first or last, before or after columns that have values. Where the NULLs will be placed will depend on the SQL implementation.
GROUP BY is particularly useful when pair with aggregage functions like SUM or COUNT. GROUP BY divides a table into subsets (by groups); then an aggregate function can be used to provide summary information for that group. The single value retruned by our previous aggregate function examples is called a scalar aggregate. When aggregate functions are used in a GROUP BY clause and several values are returned, they are called vector aggregates.
Scalar aggregate: A single value returned from an SQL query that includes an aggregate function.
Vector aggregate: Multiple values returned from an SQL query that includes an aggregate function.
Query: Count the number of customers with addresses in each state to which we ship.
SELECT STATE, COUNT (STATE)
FROM CUSTOMER_T
GROUP BY STATE;
It is also possible to nest groups within groups; the same logic is used as when sorting multiple items.
Query: Count the number of customers with addresses in each city to which we ship. List the city by state.
SELECT STATE, CITY, COUNT (CITY)
FROM CUSTOMER_T
GROUP BY STATE, CITY;
Note that when a GROUP BY clause is included, the columns allowed to be specified in the SELECT clause are limited. Only those columns with a single value for each group can be included. In the query above, each group consists of a city and its state. The SELECT statement includes both the 'city' and 'state' column. This works because each combination of city and state is one value... In general, the column referenced in the SELECT statement must be referenced in the GROUP BY clause, unless the column is an argument for an aggregate function included in the SELECT clause.
The HAVING clause acts like a WHERE clause, but it identifies groups that meet a criterion, rather than rows. Thus, one usually sees a HAVING clause following a GROUP BY clause.
Query: Find only states with more than one customer.
SELECT STATE, COUNT (STATE)
FROM CUSTOMER_T
GROUP BY STATE
HAVING COUNT (STATE) > 1;
This query returns a result with all states with greater than one customer.
NOTE that using WHERE in the previous example would not work because WHERE doesn't allow aggregates; further WHERE qualified ros, whereas HAVING qualifies groups.
To include more than one condition in the HAVING clause, we use AND, OR, and NOT just as in the WHERE clause. In summary, here is one last command that includes all of the six clauses; remember that they must be used in this (strict) order.
Query: List the product finish, average unit price for each finish, and number of products on hand for the following finishes: Cherry, Ash, Maple, and White, where average unit price is less than 750 and the quantity on hand is more than 2.
SELECT PRODUCT_FINISH, AVG (UNIT_PRICE), SUM (ON_HAND)
FROM PRODUCT_T
WHERE PRODUCT_FINISH IN ('Cherry', 'Ash', 'Maple', 'White')
GROUP BY PRODUCT_FINISH
HAVING AVG (UNIT_PRICE) < 750 AND SUM (ON_HAND) > 2
ORDER BY PRODUCT_FINISH;
Charles E. Oyibo
IDS :: CBA
:: UIC