Saturday, October 27, 2007

Analysing defragmentation

DBCC SHOWCONTIG(MyTable1)
GO

DBCC SHOWCONTIG scanning 'MyTable1' table...
Table: 'MyTable1' (1556968673); index ID: 1, database ID: 16
TABLE level scan performed.
- Pages Scanned................................: 18986
- Extents Scanned..............................: 2443
- Extent Switches..............................: 9238
- Avg. Pages per Extent........................: 7.8
- Scan Density [Best Count:Actual Count].......: 25.70% [2374:9239]
- Logical Scan Fragmentation ..................: 44.58%
- Extent Scan Fragmentation ...................: 87.07%
- Avg. Bytes Free per Page.....................: 1658.7
- Avg. Page Density (full).....................: 79.51%
DBCC execution completed. If DBCC printed error messages,
contact your system administrator.


* There were 18,986 pages examined to create the report. Those pages existed within 2,443 extents.

* While examining the pages for fragmentation, the server had to switch extent locations 9,238 times.

The Scan Density restates this by indicating what percentage of all pages within all extents were contiguous. In an ideal environment, the density displayed would be close to 100%.

* The Logical Scan Fragmentation and Extent Scan Fragmentation are indications of how well the indexes are stored within the system when a clustered index is present (and should be ignored for tables that do not have a clustered index).

In both cases, a number close to 0 is preferable.

* There is an average of 1,659 bytes free per page; put another way, each page is 79.51% full. The closer that number gets to 100%, the faster the database can read in records, since more records fit on a single page. However, this must be balanced against the cost of writing to the table: if a write is required on a page that is already full, a page split occurs, and that overhead can be tremendous.
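
One way to strike that balance is to leave free space in each page when the index is built, via a fill factor. A minimal sketch, assuming a made-up index name and key column:

-- Rebuild the existing clustered index (name and column assumed here),
-- leaving roughly 10% of each leaf page empty so that future inserts
-- are less likely to force page splits.
CREATE CLUSTERED INDEX IX_MyTable1_Id
    ON MyTable1 (Id)
    WITH FILLFACTOR = 90, DROP_EXISTING
GO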


After DBCC INDEXDEFRAG/DBREINDEX
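
For reference, a hedged sketch of the two maintenance commands named above (the database name and index ID are assumptions for illustration):

-- DBCC INDEXDEFRAG defragments the leaf level of an index in place;
-- the parameters are database, table, and index (name or ID).
DBCC INDEXDEFRAG (MyDatabase, MyTable1, 1)
GO

-- DBCC DBREINDEX rebuilds the index(es) completely; an empty index
-- name ('') rebuilds every index on the table, and 90 is the fill factor.
DBCC DBREINDEX ('MyTable1', '', 90)
GO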


DBCC SHOWCONTIG scanning 'MyTable1' table...
Table: 'MyTable1' (1556968673); index ID: 1, database ID: 16
TABLE level scan performed.
- Pages Scanned................................: 15492
- Extents Scanned..............................: 1945
- Extent Switches..............................: 2363
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 81.94% [1937:2364]
- Logical Scan Fragmentation ..................: 15.43%
- Extent Scan Fragmentation ...................: 20.15%
- Avg. Bytes Free per Page.....................: 159.8
- Avg. Page Density (full).....................: 98.03%
DBCC execution completed. If DBCC printed error messages,
contact your system administrator.

Page it

We are most familiar with the data row. The row size is set by the definition of the table that holds it (e.g. a table of addresses requires more data per row than a table of class names). In SQL Server, a table may define a row as storing as little as 4 bytes to as much as 8,060 bytes. This limit is set by the size of the data page, which stores up to 8,192 bytes (8 KB). The remaining 132 bytes are used by SQL Server to track other information under the covers. Although SQL Server is designed around 8 KB pages, the smallest unit of space that SQL Server can allocate is 64 KB, i.e. eight contiguous pages. This is called an extent.
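
As a quick illustration of that 8,060-byte limit (the table is made up), SQL Server warns when a table definition could exceed it, and on SQL Server 2000 an insert that actually exceeds it fails; later versions can push variable-length data off the page:

-- Maximum possible row size is 2 x 4100 = 8200 bytes plus overhead, so
-- CREATE TABLE succeeds but raises a warning about the 8060-byte limit.
CREATE TABLE WideRow
(
    Col1 varchar(4100),
    Col2 varchar(4100)
)
GO

-- Fine: the actual row is small enough to fit on a page.
INSERT INTO WideRow VALUES (REPLICATE('a', 100), REPLICATE('b', 100))

-- Fails on SQL Server 2000: the actual row exceeds 8060 bytes.
INSERT INTO WideRow VALUES (REPLICATE('a', 4100), REPLICATE('b', 4100))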

To store the data in a sorted order, as in a phone book, SQL Server uses something called a clustered index. When a typical database is created, clustered indexes exist on nearly all tables. However, just because the data exists in sorted order within a page does not mean that it exists as such within an extent. The reason is that when there is no more room on a given page to insert a row, SQL Server moves approximately half of the page's rows to another page; this is called a page split. (Page splits will not occur with clustered indexes on IDENTITY-based columns, but hotspotting may.) In some cases it may move that data to another extent altogether, possibly even allocating a new extent to do so. So, while we may start off with names beginning with A through H on one page and names beginning with I through Z on the next, through usage we may find that names A through C are now on one page in one extent, D through E in another extent, and S through Z back on the fifth page of the first extent, and so on. It is because of the page split that there are times when we may prefer tables with no clustered index at all. However, these tables are usually scratch tables which are highly volatile; in those situations we want the quicker write times at the cost of slower reads.
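
A rough way to watch this happen (the names and column sizes are made up, and in practice many more rows are needed to fill pages): load a clustered table in key order, then insert keys that must land in the middle of already-full pages, and compare DBCC SHOWCONTIG before and after.

CREATE TABLE Names
(
    LastName varchar(50) NOT NULL,
    Filler   char(500)   NOT NULL DEFAULT ''
)
CREATE CLUSTERED INDEX IX_Names ON Names (LastName)
GO

-- Load rows whose keys are spread far apart (repeat until pages fill up).
INSERT INTO Names (LastName) VALUES ('Adams')
INSERT INTO Names (LastName) VALUES ('Moore')
INSERT INTO Names (LastName) VALUES ('Young')

DBCC SHOWCONTIG (Names)   -- baseline

-- Keys that must be slotted between existing rows on full pages
-- force page splits and drive the scan density down.
INSERT INTO Names (LastName) VALUES ('Baker')
INSERT INTO Names (LastName) VALUES ('Nolan')

DBCC SHOWCONTIG (Names)   -- after the splits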

The row count question

select rowcnt from sysindexes where id = object_id('tablename') and indid = 1 --assuming there exists a clustered index

select rowcnt from sysindexes where id = object_id('tablename') and indid = 0 --assuming the table is a heap

The row counts stored in sysindexes can drift out of date; run the following to correct them:
dbcc updateusage(0,'tablename') with count_rows
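
The difference between the two approaches, sketched against the (hypothetical) MyTable1 from earlier: select count(*) is always exact but reads the data, while the sysindexes value is an instant metadata lookup that may lag behind.

-- Exact, but scans the table (or its smallest index):
select count(*) from MyTable1

-- Instant, read from metadata, but only as accurate as sysindexes:
select rowcnt
from sysindexes
where id = object_id('MyTable1')
  and indid in (0, 1)   -- 0 = heap, 1 = clustered index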

Query plan?

A query plan details how the SQL Server query optimizer will run the query. The optimizer plans which portion of the query to execute first and which index to use. Depending on the way the tables are set up, the nature of the query, the indexes available, and so on, SQL Server determines the optimal technique: scan the table from start to finish, use one or more specific indexes, or take some other approach.

To view the plan: SET SHOWPLAN_TEXT ON in T-SQL, or "Include Actual Execution Plan" in SSMS.
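
A minimal sketch of the T-SQL route (the query and column are just examples); note that SET SHOWPLAN_TEXT must be the only statement in its batch:

SET SHOWPLAN_TEXT ON
GO

-- Not actually executed: SQL Server returns the estimated plan as text.
SELECT * FROM MyTable1 WHERE Id = 42
GO

SET SHOWPLAN_TEXT OFF
GO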

The query plan is stored in SQL Server's memory, in the procedure cache.
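
On SQL Server 2000-era systems the cache can be inspected through the syscacheobjects system table; a rough sketch:

-- What is sitting in the procedure cache, and how often each plan was reused.
select cacheobjtype, objtype, usecounts, sql
from master.dbo.syscacheobjects
order by usecounts desc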

Much ado about Stored procedures

1. Stored procedure execution plans can be reused, staying cached in SQL Server's memory, reducing server overhead.
2. Reduce network traffic
3. Provide better security to your data. If you use stored procedures exclusively, you can remove direct SELECT, INSERT, UPDATE, and DELETE rights from the tables and force developers to use stored procedures as the method for data access.
4. Encapsulate business logic or decision making processes


Whenever a stored procedure is executed for the first time, its execution plan is compiled and stored in SQL Server's procedure cache. Subsequent runs of the procedure do not need to compile it again, since the plan is already cached. This is unlike an embedded statement in the application, which has to be compiled each time it is fired.
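
A small, made-up sketch of the security and encapsulation points above (the procedure, table, and role names are hypothetical):

-- Callers get EXECUTE on the procedure only; they need no
-- SELECT permission on the underlying table.
CREATE PROCEDURE GetCustomerOrders
    @CustomerId int
AS
    SELECT OrderId, OrderDate, Total
    FROM Orders
    WHERE CustomerId = @CustomerId
GO

GRANT EXECUTE ON GetCustomerOrders TO SomeAppRole
GO

-- The first run compiles and caches the plan; later runs reuse it.
EXEC GetCustomerOrders @CustomerId = 42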

Indexes, anyone?

An index stores the location of the row being looked up. It speeds up data retrieval, as the engine does not have to perform a complete 'table scan' to find the row being queried.

A clustered index physically sorts the table upon the member columns. A non-clustered index stores the key value along with a pointer to the row where the full data can be found.

A nice example to explain this difference: a book index stores terms with page numbers and behaves like a non-clustered index, while a phone book, where the data (the number) is stored right alongside the sorted key (the name), behaves like a clustered index.
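
A minimal sketch of creating both kinds (the table and column names are made up):

-- Only one clustered index per table: the data rows themselves
-- are stored in this key's order.
CREATE CLUSTERED INDEX IX_Customers_CustomerId
    ON Customers (CustomerId)

-- A non-clustered index is a separate structure holding the key
-- plus a pointer back to the actual row.
CREATE NONCLUSTERED INDEX IX_Customers_LastName
    ON Customers (LastName)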

Little on Temporary tables

Temporary tables are used to hold temporary data for the connection that creates the table. These tables get created in the tempdb system database.

There is one more interesting difference from a permanent table: a foreign key cannot be defined on a temporary table.

The temp table exists only for the connection that creates it. No other connection can see the same temp table.


Differences between temporary tables and table variables:


1. Table variables are transaction neutral, i.e., changes to a table variable are not rolled back when the surrounding transaction is rolled back (the example below demonstrates this).
2. Table variables exist only in the same scope as other variables. Unlike temporary tables, they are not visible to inner (nested) stored procedures or to exec (string) statements, and they cannot be used as the target of an insert ... exec statement (see the sketch at the end of this post).


create table #T (s varchar(128))
declare @T table (s varchar(128))

insert into #T select 'old value #'
insert into @T select 'old value @'
begin transaction
update #T set s='new value #'
update @T set s='new value @'
rollback transaction

-- The rollback undoes the update to the temp table,
-- but the table variable keeps its new value.
select * from #T
select * from @T

s
---------------
old value #

s
---------------
new value @
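
And a rough sketch of the scope difference from point 2 (the procedure name is made up): a temp table created by the caller is visible inside a nested procedure, while a table variable never is.

create procedure ReadFromTemp
as
    -- #Scratch is created by the caller, yet is visible here.
    select * from #Scratch
go

create table #Scratch (s varchar(128))
insert into #Scratch select 'visible to the nested proc'

exec ReadFromTemp   -- returns the row inserted above

-- A table variable declared by the caller cannot be referenced inside
-- the procedure at all; the equivalent code would not even compile.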