Advance Database Design

Prof Madya Nazib Nordin

Question 1

What indexing techniques are supported in SQL Server?

An index in SQL is built like a tree, consisting of a number of nodes. SQL can find the certain row without browsing through all the rows therefore; the time spent searching for rows can be reduced.

SQL allows B-tree indexes to be defined for one or more columns within the table, which support fast access to the data through the column values. The SELECT command, which retrieves the data, makes such decision based on the available indexes. A B-tree is useful as index structures for data held on external files on disk Its provided direct access to the records through their key values. B-tree is more complex structure compared to indexed sequential file but it does not suffer from the overflow problems.

eg:

Consider a B-tree of degree 5, which is a balanced five-way tree where each node can hold four data values and have five branches.

The values are:

10 50 6 29 64 11 13 18 19 72 88 17 77 102 98

The first value, 10 is placed in a new node, which has room for the three other values. The 50 is placed in the same node, so are the 6 and 29 but in ordered as shown below:

When a 64 value is to be added, there is no room in the node. Because a 29 value is the median, the node is split into two separate leaf nodes with a new parent node holding the value, 29.

 

 

 

 

 

 

The following values, 11 and 13 is to be placed in the left node.

 

 

 

 

When an 18 value is to be added, there is no room in the left node. Median value, 11 are choosing to be the parent mode. The value 29 and its adjacent pointer along to make new room for the 11. The leaf node splits into two to provide the first two branches.

 

 

 

 

The items following are added to the B-tree in a similar way. The final result is the tree:

 

 

 

 

 

If two further item, 26 and 100 are to be added, 26 should be placed in the second leaf node. Since there is no room, the node splits, adding a new value into the root.

 

 

 

 

The value 100 against splits its leaf, since there is no room there for it. The median value 98 is moved up to be added to the parent node. Because there is no room there, the parent node splits to give a new root node:

 

 

 

 

 

 

 

The operations on a B-tree such as deletion or addition new item may affect several levels of the tree.

Question 2

What alternatives for data entries are supported in SQL Server?

Once a table has been created, before any useful work can be done with it data must be entered into it. Data Entry in SQL Server includes :

(i) Inserting Rows in a Table

A quick method of data entry is the INSERT...VALUES command in SQL Server. This command has the form: INSERT INTO tablename VALUES datalist

The datalist is a series of data items, separated by commas and enclosed in parentheses, that together make up one row of the table specified by tablename. An example of INSERT...VALUES using the MODEL table could be:

SQL> INSERT INTO MODEL VALUES 2 ('M0020','Vauxhall','Belmont',4,2.8);

Generally, this method is error prone and time consuming and is usually avoided. However, you can use a substitution variable in a command file or SELECT statement to represent values to be provided at run time. A substitution variable is prefixed by a single ampersand (&), for example:

SQL> INSERT INTO MODEL VALUES 2 '&MODELNO','&MAKE','&MODELNAME',&DOORS,&LITRES);

Enter value for MODELNO :M0020

Enter value for MAKE :Vauxhall

Enter value for MODELNAME:Belmont

Enter value for DOORS :4

Enter value for LITRES :2.8

old 2: ('&MODELNO','&MAKE','&MODELNAME',&DOORS,&LITRES)

new 2: ('M0020','Vauxhall','Belmont',4,2.8)

1 record created.

Press / or R to run the same command again, and SQL will repeat the prompts for you to enter data for the next record.

You can also use INSERT to copy selected data values from one table to another.

For example,

insert into tablename1

select colname, colname...

from tablename2

where condition

In order to ensure that your data is saved to the database, enter the command SET AUTOCOMMIT ON at the SQL> prompt.

(ii) Creating Sequence Numbers

The sequence generator generates sequence numbers. These are useful to generate unique primary keys for your data automatically, and to coordinate keys across multiple rows or tables. Use the SQL command CREATE SEQUENCE to create a sequence.

The following statement creates a sequence used to generate customer numbers for the custno column in the custome_dbf table:

Create sequence cust_sequence

increment by 1

start with 1

nomaxvalue

nocycle

cache 10;

Notice that several parameters can be specified to control the function of sequences. You can use these parameters to indicate the starting point of the sequence, the minimum and maximum values, and the interval between sequence values. The nocycle option indicates that the sequence cannot generate more values after reaching its maximum or minimum value.

The cache option pre-allocates a set of sequence numbers and keeps them in memory so they can be accessed faster. When the last of the sequence numbers has been used, another set of numbers is read into the cache.

You can change any of the parameters that define how corresponding sequence numbers are generated; however, you cannot alter a sequence to change its starting number. To do this, you have to drop the sequence and recreate it.

An example of altering a sequence is:

alter sequence cust_sequence

increment by 10

maxvalue 10000;

To use a sequence once it has been defined, you use the NEXTVAL and CURRVAL pseudo-columns; each new sequence number is generated by a reference to the sequence's pseudo-column NEXTVAL; the current sequence number can be referenced using the pseudo-column CURRVAL.

To generate and use a sequence number, reference seq_name.NEXTVAL. For example:

insert into custome_dbf (custno, custname,...)

values (cust_sequence.nextval,'K Smith',....)

You can use CURRVAL to insert a row into the corresponding buys table. If you have generated the above number in the custome_dbf table, the value in CURRVAL will be the same. Therefore, you can refer to CURRVAL to insert the same value into buys:

insert into buys (engineno, regno, custno,....)

values (12222,'K514RHD',cust_sequence.currval,...)

CURRVAL can only be used if seq_name.nextval has been referenced in the current user session. CURRVAL can be referenced as many times as necessary. The next sequence number is not generated until NEXTVAL is referenced.

CURRVAL and NEXTVAL can be used in the following places:

To drop a sequence:

drop sequence cust_sequence;

(iii) Updating a Table

To change the data in a specified table:

update tablename

set column=expr [column=expr]...

[where condition]

For example, to increase the sellprice of a particular car in the BUYS table:

update buys

set sellprice = sellprice+100

where regno='F69BFC';

(iv) Deleting Rows from a Table

To delete all rows from a table:

delete from tablename;

To delete selected rows from a table:

delete from tablename

where condition;

Note: To enter, edit and delete records quickly, it is much easier to use Forms to create for each table or set of tables a simple form that you can use for data entry etc. For more information, refer to the Computing Services document ORACLE-UG-6, Oracle Forms 4.5: Guide to Designing and Running Forms.

Question 3

Are clustered index supported in SQL Server?

SQL Server supports clustered and nonclusted types of indexes. These are used to traverse relational tables with minimal reads, thereby increasing efficiency and response time. Maintenance of indexes is minimal in the SQL Server and is managed by the database server software.

A table can have only one clustered index. In a clustered index, the physical order of the data records is identical to the sort order specified by the clustered index.

Clustered index is used to dictate the physical ordering of the table's rows. As new rows inserted, they are forced into clustered index order, which is sometime result in page spawning, or splitting in order to insert the row into the correct location.

A table may have many unclusetered indexes. This type of index allows table data to be accessed in different orders without actually changing the physical order of the record in the table.

Unclustered index is used to provide access to the table's data in an alternate order. Its allow users to look at the table's data in more than one way. Sybase support up to 249 nonclustered indexes per table.

DDL to create the tables, clustered and nonclustered indexes

CREATE TABLE student_grade

(student_id int,

student_name char(25),

course char(20,

grade tinyint,

semester char(15)

)

CREATE clustered index student_id

ON student_grade (student_id)

CREATE nonclustered index student_course

ON student_grade (course, term, student_name)

CREATE nonclustered index student_name

ON student_grade (student_name)

Question 4

What file organization are supported?

File organization is important to more effective searching data. It allows direct access to records in the data file. SQL has two methods to accessing row in a table: Sequencial and Direct access.

(i) Sequential

Efficient if only one row sought but if the table has many rows, this method is very time- consuming and inefficient. Sequential search is the easiest but least efficient search techniques in file organization. This techniques is used:

  1. When records are stored without any consideration given to order.
  2. When the storage medium lack any type of direct access facility.

A sequential file is the simplest organization for a file. It is a sequence of records. The records may or may not be kept in sorted order in the sequence. In SQL Server all files are in sequential files and consists of records of the same type.

(ii) Direct access

Read only the rows, which exhibit the requirement characteristics. To do this, an index is necessary

References

Korsh, James F. (1986) Data Structures, Algorithms and Program Style. PWS Computer Prentice, Boston.

URL : http://escher.cs.ucdavis.edu/ECS165A/lectures/lect09.html