Wednesday, February 13, 2008

Few Q/A

How do you implement one-to-one, one-to-many and many-to-many relationships while designing tables?

One-to-One relationship can be implemented as a single table and rarely as two tables with primary and foreign key relationships.
One-to-Many relationships are implemented by splitting the data into two tables with primary key and foreign key relationships.
Many-to-Many relationships are implemented using a junction table with the keys from both the tables forming the composite primary key of the junction table.

What's the difference between a primary key and a unique key?

Both primary key and unique enforce uniqueness of the column on which they are defined. But by default primary key creates a clustered index on the column, where are unique creates a nonclustered index by default. Another major difference is that, primary key doesn't allow NULLs, but unique key allows one NULL only.


Define candidate key, alternate key, composite key.

A candidate key is one that can identify each row of a table uniquely. Generally a candidate key becomes the primary key of the table. If the table has more than one candidate key, one of them will become the primary key, and the rest are called alternate keys.

A key formed by combining at least two or more columns is called composite key.


What are defaults? Is there a column to which a default can't be bound?

A default is a value that will be used by a column, if no value is supplied to that column while inserting data. IDENTITY columns and timestamp columns can't have defaults bound to them.


What is a transaction and what are ACID properties?

A transaction is a logical unit of work in which, all the steps must be performed or none. ACID stands for Atomicity, Consistency, Isolation, Durability. These are the properties of a transaction.

Atomicity
A transaction must be an atomic unit of work; either all of its data modifications are performed, or none of them is performed.

Consistency
When completed, a transaction must leave all data in a consistent state. In a relational database, all rules must be applied to the transaction's modifications to maintain all data integrity. All internal data structures, such as B-tree indexes or doubly-linked lists, must be correct at the end of the transaction.

Isolation
Modifications made by concurrent transactions must be isolated from the modifications made by any other concurrent transactions. A transaction either recognizes data in the state it was in before another concurrent transaction modified it, or it recognizes the data after the second transaction has completed, but it does not recognize an intermediate state. This is referred to as serializability because it results in the ability to reload the starting data and replay a series of transactions to end up with the data in the same state it was in after the original transactions were performed.

Durability
After a transaction has completed, its effects are permanently in place in the system. The modifications persist even in the event of a system failure.



What's the difference between DELETE TABLE and TRUNCATE TABLE commands?

DELETE TABLE is a logged operation, so the deletion of each row gets logged in the transaction log, which makes it slow. TRUNCATE TABLE also deletes all the rows in a table, but it won't log the deletion of each row, instead it logs the deallocation of the data pages of the table, which makes it faster. Of course, TRUNCATE TABLE can be rolled back.


What is an index? What are the types of indexes? How many clustered indexes can be created on a table? I create a separate index on each column of a table. what are the advantages and disadvantages of this approach?

Indexes in SQL Server are similar to the indexes in books. They help SQL Server retrieve the data quicker.

Indexes are of two types. Clustered indexes and non-clustered indexes. When you craete a clustered index on a table, all the rows in the table are stored in the order of the clustered index key. So, there can be only one clustered index per table. Non-clustered indexes have their own storage separate from the table data storage. Non-clustered indexes are stored as B-tree structures (so do clustered indexes), with the leaf level nodes having the index key and it's row locater. The row located could be the RID or the Clustered index key, depending up on the absence or presence of clustered index on the table.

If you create an index on each column of a table, it improves the query performance, as the query optimizer can choose from all the existing indexes to come up with an efficient execution plan. At the same t ime, data modification operations (such as INSERT, UPDATE, DELETE) will become slow, as every time data changes in the table, all the indexes need to be updated. Another disadvantage is that, indexes need disk space, the more indexes you have, more disk space is used.


What are the steps you will take to improve performance of a poor performing query?

This is a very open ended question and there could be a lot of reasons behind the poor performance of a query. But some general issues that you could talk about would be: No indexes, table scans, missing or out of date statistics, blocking, excess recompilations of stored procedures, procedures and triggers without SET NOCOUNT ON, poorly written query with unnecessarily complicated joins, too much normalization, excess usage of cursors and temporary tables.

Some of the tools/ways that help you troubleshooting performance problems are: SET SHOWPLAN_ALL ON, SET SHOWPLAN_TEXT ON, SET STATISTICS IO ON, SQL Server Profiler, Windows NT /2000 Performance monitor, Graphical execution plan in Query Analyzer.


What is a deadlock and what is a live lock? How will you go about resolving deadlocks?

Deadlock is a situation when two processes, each having a lock on one piece of data, attempt to acquire a lock on the other's piece. Each process would wait indefinitely for the other to release the lock, unless one of the user processes is terminated. SQL Server detects deadlocks and terminates one user's process.

A livelock is one, where a request for an exclusive lock is repeatedly denied because a series of overlapping shared locks keeps interfering. SQL Server detects the situation after four denials and refuses further shared locks. A livelock also occurs when read transactions monopolize a table or page, forcing a write transaction to wait indefinitely.


What are statistics, under what circumstances they go out of date, how do you update them?

Statistics determine the selectivity of the indexes. If an indexed column has unique values then the selectivity of that index is more, as opposed to an index with non-unique values. Query optimizer uses these indexes in determining whether to choose an index or not while executing a query.

Some situations under which you should update statistics:
1) If there is significant change in the key values in the index
2) If a large amount of data in an indexed column has been added, changed, or removed (that is, if the distribution of key values has changed), or the table has been truncated using the TRUNCATE TABLE statement and then repopulated
3) Database is upgraded from a previous version



• What are the temporary tables? What are temporary variables? Where they are stored? What is the purpose of Temporary tables? Can we use a temporary table that is created in one stored procedure in another stored procedure? What is the scope of temp tables?


Temporary tables are used to hold temporary data for the connection that creates the table. These tables get created in the tempdb system database.

There is one more interesting difference from a permanent – a foreign key cannot be defined on a temporary table.

No we cannot use a temp table created in another SP.

The temp table exists only for the connection that creates it. No other connection can see the same temp table.

• What are the indexes? Types of indexes? What is the difference between the clustered and non-clustered index? When they are used?

An Index stores the location of the row being looked up. It speeds up data retrieval as the engine does not have to perform a complete ‘table scan’ to find the row being queried.

A clustered index physically sorts the table upon the member columns. A non-clustered index stores the key value and the row index where to find that row.

A nice example to explain this difference is a book index that stores terms with page numbers and behaves like non-clustered index; while a phone book stores names and numbers – if the number is stored with the name – behaves like a clustered index.


• What are the advantages of Stored Procedures? When are they used?

1. Stored procedure execution plans can be reused, staying cached in SQL Server's memory, reducing server overhead.
2. Reduce network traffic
3. Provide better security to your data. If you use stored procedures exclusively, you can remove direct SELECT, INSERT, UPDATE, and DELETE rights from the tables and force developers to use stored procedures as the method for data access.
4. Encapsulate business logic or decision making processes

• What is Query Plan? Can you see a query plan? Where it is stored?

A query plan details how the SQL Query Optimizer will run the query. The optimizer plans which portion of the query to execute first and which index to use. Depending on the way the tables are set up, the nature of the query, the indexes available, etc. SQL Server will determine the optimal technique - be it to scan through the table from start to finish, to use one or more specific indexes or various other ways.

SET SHOWPLAN_TEXT ON – TSQL, Include Actual Execution Plan - SSMS

The query plan is stored in the SQL Server memory’s Procedure Cache.

• What is pre-compilation?

Whenever a stored procedure is executed for the first time, its execution plan is compiled and stored in the SQL Server’s procedure cache. For any subsequent run of the stored procedure, it doesn’t need to compile it again as it is already pre-compiled. This is unlike any embedded statement in the application that needs to be compiled each time it is fired.

• What is the SQL server Database structure? What are four types of databases?
• What is the use of tempdb database?


Stores temporary data (intermediate results of large sort operations and server’s internal operations) and database objects (temporary tables, stored procedures)