DBCC SHOWCONTIG(MyTable1)
GO
DBCC SHOWCONTIG scanning 'MyTable1' table...
Table: 'MyTable1' (1556968673); index ID: 1, database ID: 16
TABLE level scan performed.
- Pages Scanned................................: 18986
- Extents Scanned..............................: 2443
- Extent Switches..............................: 9238
- Avg. Pages per Extent........................: 7.8
- Scan Density [Best Count:Actual Count].......: 25.70% [2374:9239]
- Logical Scan Fragmentation ..................: 44.58%
- Extent Scan Fragmentation ...................: 87.07%
- Avg. Bytes Free per Page.....................: 1658.7
- Avg. Page Density (full).....................: 79.51%
DBCC execution completed. If DBCC printed error messages,
contact your system administrator.
* 18,986 pages were examined to produce the report; those pages were spread across 2,443 extents.
* While examining the pages for fragmentation, the server had to switch extent locations 9,238 times.
Scan Density restates this as the ratio of the ideal number of extent changes (if everything were perfectly contiguous) to the number actually observed. In an ideal environment, the density would be close to 100%.
* Logical Scan Fragmentation and Extent Scan Fragmentation indicate how well the index is laid out when a clustered index is present (and should be ignored for tables that do not have a clustered index).
In both cases, a number close to 0 is preferable.
* There is an average of 1,659 bytes free per page; in other words, each page is 79.51% full. The closer that number gets to 100, the faster the database can read records, since more rows fit on a single page. However, this must be balanced against the cost of writing to the table: a page split occurs if a row must be inserted into a page that is already full, and that overhead can be tremendous.
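The improved numbers below came from defragmenting the index. A minimal sketch of the two commands involved (MyTable1 is the table from the output above; 0 means the current database, 1 is the clustered index, and the fill factor of 90 is only an illustrative value):
-- Reorganize the clustered index in place (stays online)
DBCC INDEXDEFRAG (0, 'MyTable1', 1)
GO
-- Or rebuild all indexes on the table with a 90% fill factor
DBCC DBREINDEX ('MyTable1', '', 90)
GO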
After DBCC INDEXDEFRAG/DBREINDEX
DBCC SHOWCONTIG scanning 'MyTable1' table...
Table: 'MyTable1' (1556968673); index ID: 1, database ID: 16
TABLE level scan performed.
- Pages Scanned................................: 15492
- Extents Scanned..............................: 1945
- Extent Switches..............................: 2363
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 81.94% [1937:2364]
- Logical Scan Fragmentation ..................: 15.43%
- Extent Scan Fragmentation ...................: 20.15%
- Avg. Bytes Free per Page.....................: 159.8
- Avg. Page Density (full).....................: 98.03%
DBCC execution completed. If DBCC printed error messages,
contact your system administrator.
Saturday, October 27, 2007
Page it
We are most familiar with the data row. The row size is determined by the definition of the table that holds it (e.g., a table of addresses requires more data per row than a table of class names). In SQL Server, a table may define a row as storing as little as 4 bytes to as much as 8,060 bytes. This limit is set by the size of the data page, which stores up to 8,192 bytes (8 KB); the remaining 132 bytes are used by SQL Server to track other information under the covers. Although SQL Server is designed around 8 KB pages, the smallest unit of space it allocates to a table or index is an extent: eight contiguous pages, or 64 KB.
To store the data in sorted order, as in a phone book, SQL Server uses something called a clustered index. When a typical database is created, clustered indexes exist on nearly all tables. However, just because the data is in sorted order within a page does not mean it stays that way within an extent. The reason is what happens when there is no more room on a page to insert a row: SQL Server moves approximately half of the rows on that page to another page, which is called a page split. (Page splits generally do not occur with clustered indexes on IDENTITY-based columns, although hotspotting may.) In some cases, it may move that data to another extent altogether, possibly even allocating a new extent to do so. So, while we start off with names A through H on one page and names I through Z on the next, through usage we may find that names A through C now sit on one page in one extent, D through E in another extent, and S through Z back on the fifth page of the first extent, and so on. It is because of page splits that there are times when we may prefer tables with no clustered index at all. However, these are usually scratch tables that are highly volatile; in those situations we want faster writes at the cost of slower reads.
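One common way to reduce page splits on a clustered index is to leave free space on each page with a fill factor. A minimal sketch, assuming a hypothetical Customers table (the 80% value is only an example, not a recommendation):
-- Leave roughly 20% free space per page so new rows fit without splitting
CREATE CLUSTERED INDEX IX_Customers_LastName
ON Customers (LastName)
WITH FILLFACTOR = 80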
The row count question
select rowcnt from sysindexes where id = object_id('tablename') and indid = 1 --assuming a clustered index exists
select rowcnt from sysindexes where id = object_id('tablename') and indid = 0 --assuming table is a heap
Run following to fix errors with non-clustered indexes:
dbcc updateusage(0,'tablename') with count_rows
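Since sysindexes.rowcnt is only as accurate as the last usage update, it can be worth sanity-checking it against a real count on smaller tables. A quick sketch ('tablename' is still a placeholder):
-- Cached count vs. actual scan-based count
select rowcnt from sysindexes where id = object_id('tablename') and indid in (0, 1)
select count(*) from tablename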
Query plan?
A query plan details how the SQL Server query optimizer will run the query. The optimizer decides which portion of the query to execute first and which indexes to use. Depending on how the tables are set up, the nature of the query, the indexes available, and so on, SQL Server determines the optimal technique: scanning the table from start to finish, using one or more specific indexes, or various other approaches.
SET SHOWPLAN_TEXT ON (T-SQL); Include Actual Execution Plan (SSMS)
The query plan is stored in the SQL Server memory’s Procedure Cache.
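A minimal sketch of the T-SQL route, assuming the pubs sample database (SET SHOWPLAN_TEXT must be the only statement in its batch, and the query itself is not executed - only its plan text is returned):
SET SHOWPLAN_TEXT ON
GO
SELECT au_fname, au_lname FROM authors WHERE au_lname = 'Carson'
GO
SET SHOWPLAN_TEXT OFF
GO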
Much ado about Stored procedures
1. Stored procedure execution plans can be reused, staying cached in SQL Server's memory, reducing server overhead.
2. Reduce network traffic
3. Provide better security to your data. If you use stored procedures exclusively, you can remove direct SELECT, INSERT, UPDATE, and DELETE rights from the tables and force developers to use stored procedures as the method for data access.
4. Encapsulate business logic or decision making processes
Whenever a stored procedure is executed for the first time, its execution plan is compiled and stored in the SQL Server procedure cache. Subsequent runs of the procedure can reuse that cached plan instead of compiling again. This is unlike an ad hoc statement embedded in the application, which is often compiled each time it is fired.
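To illustrate points 3 and 4 above, here is a minimal sketch of a procedure that wraps data access; the table, procedure, and role names are hypothetical:
-- Callers get EXECUTE on the procedure instead of SELECT on the table
CREATE PROCEDURE usp_GetOrdersByCustomer
    @CustomerID int
AS
    SELECT OrderID, OrderDate, TotalDue
    FROM Orders
    WHERE CustomerID = @CustomerID
GO
-- Grant execute on the procedure rather than rights on the underlying table
GRANT EXECUTE ON usp_GetOrdersByCustomer TO app_role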
Indexes, anyone?
An Index stores the location of the row being looked up. It speeds up data retrieval as the engine does not have to perform a complete ‘table scan’ to find the row being queried.
A clustered index physically sorts the table on its member columns. A non-clustered index stores the key value along with a row locator that points to where the row actually lives.
A nice example of the difference: a book index, which stores terms with page numbers, behaves like a non-clustered index, while a phone book, where the number is stored right alongside the name, behaves like a clustered index.
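A minimal sketch of the two kinds of index on a hypothetical Employee table (all names are examples):
-- Clustered index: the table rows are physically ordered by EmployeeID
CREATE CLUSTERED INDEX IX_Employee_EmployeeID ON Employee (EmployeeID)
-- Non-clustered index: stores LastName plus a locator back to the row
CREATE NONCLUSTERED INDEX IX_Employee_LastName ON Employee (LastName)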
Little on Temporary tables
Temporary tables are used to hold temporary data for the connection that creates the table. These tables get created in the tempdb system database.
There is one more interesting difference from a permanent table: a foreign key cannot be defined on a temporary table.
The temp table exists only for the connection that creates it. No other connection can see the same temp table.
Difference between temporary tables and table variables:
1. Table variables are transaction neutral: their changes are not logged as part of the transaction, so a rollback does not undo them (as the example below shows)
2. Table variables exist only in the same scope as variables. Unlike temporary tables, they are not visible to inner stored procedures or to exec (string) statements. Also, they cannot be used in an insert/exec statement.
create table #T (s varchar(128))
declare @T table (s varchar(128))
insert into #T select 'old value #'
insert into @T select 'old value @'
begin transaction
update #T set s='new value #'
update @T set s='new value @'
rollback transaction
select * from #T
select * from @T
s
---------------
old value #
s
---------------
new value @
Wednesday, September 19, 2007
Count Rows
select rowcnt from sysindexes where id = object_id('tablename') and indid = 1 --assuming a clustered index exists
select rowcnt from sysindexes where id = object_id('tablename') and indid = 0 --assuming table is a heap
Run following to fix errors with non-clustered indexes:
dbcc updateusage(0,'t') with count_rows
n-th largest/smallest value in a table
SELECT *
FROM Employee E1
WHERE (N-1) = (SELECT COUNT(DISTINCT(E2.Salary))
FROM Employee E2
WHERE E2.Salary > E1.Salary )
Change > to < for the smallest value.
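For example, with N = 2 the query returns the employee(s) with the second-highest salary, because exactly one distinct salary is greater than theirs (Employee and Salary as in the snippet above):
-- Second-highest salary: exactly 1 distinct salary is greater
SELECT *
FROM Employee E1
WHERE 1 = (SELECT COUNT(DISTINCT(E2.Salary))
           FROM Employee E2
           WHERE E2.Salary > E1.Salary)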
SQL Memory
SQL Server memory is primarily used to store data (buffer cache) and query plans (procedure cache).
Thursday, August 23, 2007
Avg Index Fragmentation - SQL 2005
select a.index_id, name, avg_fragmentation_in_percent
from
sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'),OBJECT_ID('Production.Product'),NULL,NULL,NULL) as a
INNER JOIN sys.indexes as b
on a.object_id = b.object_id
and a.index_id = b.index_id
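A common follow-up, going by the usual guideline rather than a hard rule, is to reorganize moderately fragmented indexes and rebuild heavily fragmented ones; a sketch against the same AdventureWorks table:
-- Fragmentation roughly 5-30%: reorganize (lightweight, always online)
ALTER INDEX ALL ON Production.Product REORGANIZE
-- Fragmentation above roughly 30%: rebuild
ALTER INDEX ALL ON Production.Product REBUILD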
Isolation level
Transactions specify an isolation level that defines the degree to which one transaction must be isolated from resource or data modifications made by other transactions.
A transaction always gets an exclusive lock on any data it modifies, and holds that lock until the transaction completes, regardless of the isolation level set for that transaction.
For read operations, transaction isolation levels primarily define the level of protection from the effects of modifications made by other transactions.
READ UNCOMMITTED
Specifies that statements can read rows that have been modified by other transactions but not yet committed.
Transactions running at the READ UNCOMMITTED level do not issue shared locks to prevent other transactions from modifying data read by the current transaction. READ UNCOMMITTED transactions are also not blocked by exclusive locks that would prevent the current transaction from reading rows that have been modified but not committed by other transactions. When this option is set, it is possible to read uncommitted modifications, which are called dirty reads. Values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the isolation levels.
READ COMMITTED
Specifies that statements cannot read data that has been modified but not committed by other transactions. This prevents dirty reads. Data can be changed by other transactions between individual statements within the current transaction, resulting in nonrepeatable reads or phantom data. This option is the SQL Server default.
REPEATABLE READ
Specifies that statements cannot read data that has been modified but not yet committed by other transactions and that no other transactions can modify data that has been read by the current transaction until the current transaction completes.
Shared locks are placed on all data read by each statement in the transaction and are held until the transaction completes. This prevents other transactions from modifying any rows that have been read by the current transaction. Other transactions can insert new rows that match the search conditions of statements issued by the current transaction. If the current transaction then retries the statement it will retrieve the new rows, which results in phantom reads. Because shared locks are held to the end of a transaction instead of being released at the end of each statement, concurrency is lower than the default READ COMMITTED isolation level. Use this option only when necessary.
SNAPSHOT
Specifies that data read by any statement in a transaction will be the transactionally consistent version of the data that existed at the start of the transaction. The transaction can only recognize data modifications that were committed before the start of the transaction. Data modifications made by other transactions after the start of the current transaction are not visible to statements executing in the current transaction. The effect is as if the statements in a transaction get a snapshot of the committed data as it existed at the start of the transaction.
Except when a database is being recovered, SNAPSHOT transactions do not request locks when reading data. SNAPSHOT transactions reading data do not block other transactions from writing data. Transactions writing data do not block SNAPSHOT transactions from reading data.
SERIALIZABLE
Specifies that:
Statements cannot read data that has been modified but not yet committed by other transactions.
No other transactions can modify data that has been read by the current transaction until the current transaction completes.
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.
Range locks are placed in the range of key values that match the search conditions of each statement executed in a transaction. This blocks other transactions from updating or inserting any rows that would qualify for any of the statements executed by the current transaction. This means that if any of the statements in a transaction are executed a second time, they will read the same set of rows. The range locks are held until the transaction completes. This is the most restrictive of the isolation levels because it locks entire ranges of keys and holds the locks until the transaction completes. Because concurrency is lower, use this option only when necessary. This option has the same effect as setting HOLDLOCK on all tables in all SELECT statements in a transaction.
Concurrency control theory has two classifications for the methods of instituting concurrency control:
Pessimistic concurrency control
A system of locks prevents users from modifying data in a way that affects other users. After a user performs an action that causes a lock to be applied, other users cannot perform actions that would conflict with the lock until the owner releases it. This is called pessimistic control because it is mainly used in environments where there is high contention for data, where the cost of protecting data with locks is less than the cost of rolling back transactions if concurrency conflicts occur.
Optimistic concurrency control
In optimistic concurrency control, users do not lock data when they read it. When a user updates data, the system checks to see if another user changed the data after it was read. If another user updated the data, an error is raised. Typically, the user receiving the error rolls back the transaction and starts over. This is called optimistic because it is mainly used in environments where there is low contention for data, and where the cost of occasionally rolling back a transaction outweighs the costs of locking data when read.
If a data storage system has no concurrency control, users could see the following side effects:
Lost updates.
Lost updates occur when two or more transactions select the same row and then update the row based on the value originally selected. Each transaction is unaware of the other transactions. The last update overwrites updates made by the other transactions, which results in lost data.
Uncommitted dependency (dirty read).
Uncommitted dependency occurs when a second transaction selects a row that is being updated by another transaction. The second transaction is reading data that has not been committed yet and may be changed by the transaction updating the row.
Inconsistent analysis (nonrepeatable read).
Inconsistent analysis occurs when a second transaction accesses the same row several times and reads different data each time. Inconsistent analysis is similar to uncommitted dependency in that another transaction is changing the data that a second transaction is reading. However, in inconsistent analysis, the data read by the second transaction was committed by the transaction that made the change. Also, inconsistent analysis involves multiple reads (two or more) of the same row, and each time the information is changed by another transaction; thus, the term nonrepeatable read.
Phantom reads.
Phantom reads occur when an insert or delete action is performed against a row that belongs to a range of rows being read by a transaction. The transaction's first read of the range of rows shows a row that no longer exists in the second or succeeding read as a result of a deletion by a different transaction. Similarly, the transaction's second or succeeding read shows a row that did not exist in the original read as the result of an insertion by a different transaction.
The following table shows the concurrency side effects allowed by the different isolation levels.
Isolation level     Dirty read   Nonrepeatable read   Phantom
Read uncommitted    Yes          Yes                  Yes
Read committed      No           Yes                  Yes
Repeatable read     No           No                   Yes
Snapshot            No           No                   No
Serializable        No           No                   No
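The isolation level is set per session. A minimal sketch, again assuming the pubs sample database (SNAPSHOT additionally requires the database option ALLOW_SNAPSHOT_ISOLATION to be ON):
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRANSACTION
    -- Shared locks taken here are held until the transaction ends
    SELECT au_fname, au_lname FROM authors WHERE state = 'CA'
COMMIT TRANSACTION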
Query Plan Joins
SQL Server employs three types of join operations:
Nested loops joins
If one join input is small (fewer than 10 rows) and the other join input is fairly large and indexed on its join columns, an index nested loops join is the fastest join operation because it requires the least I/O and the fewest comparisons.
Also called nested iteration, this join uses one input as the outer input table (shown as the top input in the graphical execution plan) and one as the inner (bottom) input table. The outer loop consumes the outer input table row by row. The inner loop, executed for each outer row, searches for matching rows in the inner input table.
It is effective if the outer input is small and the inner input is preindexed and large.
Merge joins
If the two join inputs are not small but are sorted on their join column (for example, if they were obtained by scanning sorted indexes), a merge join is the fastest join operation. If both join inputs are large and the two inputs are of similar sizes, a merge join with prior sorting and a hash join offer similar performance. However, hash join operations are often much faster if the two input sizes differ significantly from each other.
The merge join requires both inputs to be sorted on the merge columns, which are defined by the equality (ON) clauses of the join predicate. The query optimizer typically scans an index, if one exists on the proper set of columns, or it places a sort operator below the merge join.
Because each input is sorted, the Merge Join operator gets a row from each input and compares them. For example, for inner join operations, the rows are returned if they are equal. If they are not equal, the lower-value row is discarded and another row is obtained from that input. This process repeats until all rows have been processed.
Merge join can be an expensive choice if sort operations are required. However, if the data volume is large and the desired data can be obtained presorted from existing B-tree indexes, merge join is often the fastest available join algorithm.
Hash joins
Hash joins can efficiently process large, unsorted, nonindexed inputs. They are useful for intermediate results in complex queries because:
Intermediate results are not indexed (unless explicitly saved to disk and then indexed) and often are not suitably sorted for the next operation in the query plan.
Query optimizers estimate only intermediate result sizes. Because estimates can be very inaccurate for complex queries, algorithms to process intermediate results not only must be efficient, but also must degrade gracefully if an intermediate result turns out to be much larger than anticipated.
The hash join has two inputs: the build input and probe input. The query optimizer assigns these roles so that the smaller of the two inputs is the build input.
In-Memory Hash Join
The hash join first scans or computes the entire build input and then builds a hash table in memory. Each row is inserted into a hash bucket depending on the hash value computed for the hash key. If the entire build input is smaller than the available memory, all rows can be inserted into the hash table. This build phase is followed by the probe phase. The entire probe input is scanned or computed one row at a time, and for each probe row, the hash key's value is computed, the corresponding hash bucket is scanned, and the matches are produced.
Grace Hash Join
If the build input does not fit in memory, a hash join proceeds in several steps. This is known as a grace hash join. Each step has a build phase and probe phase. Initially, the entire build and probe inputs are consumed and partitioned (using a hash function on the hash keys) into multiple files. Using the hash function on the hash keys guarantees that any two joining records must be in the same pair of files. Therefore, the task of joining two large inputs has been reduced to multiple, but smaller, instances of the same tasks. The hash join is then applied to each pair of partitioned files.
Recursive Hash Join
If the build input is so large that inputs for a standard external merge would require multiple merge levels, multiple partitioning steps and multiple partitioning levels are required. If only some of the partitions are large, additional partitioning steps are used for only those specific partitions. In order to make all partitioning steps as fast as possible, large, asynchronous I/O operations are used so that a single thread can keep multiple disk drives busy.
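To see which physical join the optimizer picks, or to force one for comparison, a query-level hint can be added. A minimal sketch using the pubs tables from the next post (forcing a join type is for experimentation, not a general recommendation):
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors AS a
INNER JOIN publishers AS p ON a.city = p.city
OPTION (HASH JOIN) -- or LOOP JOIN / MERGE JOIN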
Wednesday, August 22, 2007
Join Story
There are three types of joins: inner, outer, and cross.
In addition, there are three types of outer joins: left, right, and full.
Inner Joins
In an inner join, records from two tables are combined and added to a query's results only if the values of the joined fields meet certain specified criteria. If you use an inner join to combine the authors and publishers tables based on their city and state columns, the result would be a list of all authors who live in the same city as a publisher:
USE pubs
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors AS a INNER JOIN publishers AS p
ON a.city = p.city
AND a.state = p.state
ORDER BY a.au_lname ASC, a.au_fname ASC
au_fname au_lname pub_name
Abraham Bennet Algodata Infosystems
Cheryl Carson Algodata Infosystems
Outer Joins
An outer join returns all rows from the joined tables whether or not there's a matching row between them. The ON clause supplements the data rather than filtering it. The three types of outer joins, left, right, and full, indicate the main data's source.
Left Outer Joins
When you use a left outer join to combine two tables, all the rows from the left-hand table are included in the results. So, for the authors and publishers tables, the result will include a list of all authors along with a publisher's name column. If a publisher exists in the author's city, it's listed. Otherwise, the field in the publisher's column is set to NULL:
USE pubs
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a LEFT OUTER JOIN publishers p
ON a.city = p.city
ORDER BY p.pub_name ASC, a.au_lname ASC, a.au_fname ASC
au_fname au_lname pub_name
Reginald Blotchet-Halls NULL
Michel DeFrance NULL
Innes del Castillo NULL
Ann Dull NULL
Marjorie Green NULL
Morningstar Greene NULL
Burt Gringlesby NULL
Sheryl Hunter NULL
Livia Karsen NULL
Charlene Locksley NULL
Stearns MacFeather NULL
Heather McBadden NULL
Michael O'Leary NULL
Sylvia Panteley NULL
Albert Ringer NULL
Anne Ringer NULL
Meander Smith NULL
Dean Straight NULL
Dirk Stringer NULL
Johnson White NULL
Akiko Yokomoto NULL
Abraham Bennet Algodata Infosystems
Cheryl Carson Algodata Infosystems
Right Outer Joins
A right outer join is conceptually the same as a left outer join except that all the rows from the right-hand table are included in the results. They can be included more than once, if more than one author exists in the publisher's city. If no author lives in the publisher's city, the author name fields are set to NULL:
USE pubs
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors AS a RIGHT OUTER JOIN publishers AS p
ON a.city = p.city
ORDER BY p.pub_name ASC, a.au_lname ASC, a.au_fname ASC
au_fname au_lname pub_name
Abraham Bennet Algodata Infosystems
Cheryl Carson Algodata Infosystems
NULL NULL Binnet & Hardley
NULL NULL Five Lakes Publishing
NULL NULL GGG&G
NULL NULL Lucerne Publishing
NULL NULL New Moon Books
NULL NULL Ramona Publishers
NULL NULL Scootney Books
Full Outer Join
As you might have gathered, a full outer join retrieves all the rows from both joined tables. It returns all of the paired rows where the join condition is true, plus the unpaired rows from each table concatenated with NULL rows from the other table. You usually won't want to use one of these.
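For completeness, a full outer join against the same pubs tables would look like this; expect both the NULL-publisher author rows and the NULL-author publisher rows from the two previous results in a single result set:
USE pubs
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors AS a FULL OUTER JOIN publishers AS p
ON a.city = p.city
ORDER BY p.pub_name ASC, a.au_lname ASC, a.au_fname ASC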
Cross Join
A cross join returns not the sum but the product of two tables. Each row in the left-hand table is matched up with each row in the right-hand table. It's the set of all possible row combinations, without any filtering, as shown here:
USE pubs
SELECT au_fname, au_lname, pub_name
FROM authors CROSS JOIN publishers
ORDER BY au_lname DESC
The resultset contains 184 rows (authors has 23 rows, and publishers has 8; therefore, 23 × 8 = 184).
Top 11 are:
au_fname au_lname pub_name
Akiko Yokomoto New Moon Books
Akiko Yokomoto Binnet & Hardley
Akiko Yokomoto Algodata Infosystems
Akiko Yokomoto Five Lakes Publishing
Akiko Yokomoto Ramona Publishers
Akiko Yokomoto GGG&G
Akiko Yokomoto Scootney Books
Akiko Yokomoto Lucerne Publishing
Johnson White New Moon Books
Johnson White Binnet & Hardley
Johnson White Algodata Infosystems
However, if you add a WHERE clause (like WHERE authors.city = publishers.city), a cross join functions as an inner join—it uses the condition to filter all possible row combinations down to the ones you want:
USE pubs
SELECT au_fname, au_lname, pub_name
FROM authors CROSS JOIN publishers
WHERE authors.city = publishers.city
ORDER BY au_lname DESC
au_fname au_lname pub_name
Cheryl Carson Algodata Infosystems
Abraham Bennet Algodata Infosystems
Thursday, May 24, 2007
Experimenting with the 'sp_' thing...
Here is what I tried...
Created two SPs in the master db: one with the 'sp_' prefix, the other with a plain name:
create proc sp_a as select 'a1'
create proc b as select 'b1'
Both compiled in the master db of a SQL 2005 server. Needless to say, both work fine when called from master.
I tried calling them from a user db.
exec sp_a
exec b
It could only execute sp_a, and threw "Could not find stored procedure 'b'".
The upshot: only SPs prefixed with 'sp_' are looked up in master.
Now a big surprise.... I executed
drop proc sp_a
from the user db and it WORKED!
Obviously it couldn't locate 'b' for dropping, as it was looking in the current db only.
Next, I created the two procs only in the user db; both worked within the user db, but both failed from the master db.
Then I ran in the master db
create proc sp_a as select 'a1'
create proc b as select 'b1'
and at the user db
create proc sp_a as select 'a2'
create proc b as select 'b2'
First I executed both in the master db; the result:
a1
b1
Then in the user db:
a2
b2
Well, it does look for the 'sp_'-prefixed proc in the calling db first.
Now the funny part: I could actually run the statement drop proc sp_a twice from the user db; the first drop removes the one in the user db, and the second removes the one in master.
To sum up, procs with the 'sp_' prefix are looked up in the current db first and, if not found there, then in the master db.