A proper index can
improve the performance and a bad index can choke the performance.
One of the most important routes to high performance in a SQL
Server database is the index. Indexes speed up the querying process by
providing swift access to rows in the data tables.
Indexes are created on columns in tables or views. The index
provides a fast way to look up data based on the values within those columns.
Create indexes on most columns in a table or a view. The
exceptions are primarily those columns configured with large object (LOB) data
types, such as image, text, and varchar(max).
Note: A table that has a clustered index is referred to as a clustered
table. A table that has no clustered index is referred to as a heap.
Clustered Indexes
A clustered index
stores the actual data rows at the leaf level of the index.
Nonclustered Indexes
Unlike a clustered
indexed, the leaf nodes of a nonclustered index contain only the values from
the indexed columns and row locators that point to the actual data rows, rather
than contain the data rows themselves. This means that the query engine must
take an additional step in order to locate the actual data.
A proper index can
improve the performance and a bad index can choke the performance.
Observation of Clustered Index over Nonclustered Index
We have one query which is not using any index. On the same
table there is already nonclustered index created, which is also not being
used. Now when we created clustered index on the same table, our query suddenly
started to use nonclustered index which was so far it is not using. The
interesting part of this is that query is using nonclustered index when
clustered index is created on the same.
/* */
CREATE TABLE
[dbo].[Tab_Cls_NonCls](
[ID] [int] NOT
NULL,
[First] [nchar](10) NULL,
[Second] [nchar](10) NULL
) ON [PRIMARY]
GO
/* Create Sample Table */
INSERT INTO
[dbo].[Tab_Cls_NonCls]
([ID],[First],[Second])
SELECT 1,'First1','Second1'
UNION ALL
SELECT 2,'First2','Second2'
UNION ALL
SELECT 3,'First3','Second3'
UNION ALL
SELECT 4,'First4','Second4'
UNION ALL
SELECT 5,'First5','Second5'
GO
/* Create Nonclustered Index
over Table */
CREATE NONCLUSTERED
INDEX [IX_Tab_Cls_NonCls_NonClustered]
ON [dbo].[Tab_Cls_NonCls]
(
[First] ASC,
[Second] ASC
) ON
[PRIMARY]
GO
/* Run following two queries
together and observe the
result in by Enabling Actual
Execution Plan (CTRL + M)
1st Query will use Table
Scan
2nd Query will use Index
Seek
*/
SELECT ID
FROM [Tab_Cls_NonCls]
WHERE First = 'First1' AND Second = 'Second1'
SELECT Second
FROM [Tab_Cls_NonCls]
WHERE First = 'First1' AND Second = 'Second1'
GO
It is clear from query
that index applies to columns on which it is created. In our case as in WHERE
condition we have same columns which are used in Index.
Query 1 – Does not use any
index
Query 2 – Does
nonclustered index seek
/* Create Clustered Index
over Table */
CREATE CLUSTERED
INDEX [IX_Tab_Cls_NonCls_Clustered]
ON [dbo].[Tab_Cls_NonCls]
(
[ID] ASC
) ON
[PRIMARY]
GO
--Once again run above two
same query and see the execution plan.
/* Run following two queries
together and observe the
result in 1st Query will use
Index Seek
2nd Query will use Index
Seek
*/
SELECT ID
FROM [Tab_Cls_NonCls]
WHERE First = 'First1' AND Second = 'Second1'
SELECT Second
FROM [Tab_Cls_NonCls]
WHERE First = 'First1' AND Second = 'Second1'
GO
Query 2 – Does nonclustered index seek
Let us go over our steps of whole exercise.
Step 1: We have one table and one nonclustered index.
Step 2: We ran Query 1 which does not use nonclustered index.
Step 3: We created clustered index over table.
Step 4: We ran Query 1 which now use nonclustered index.
Step 1: We have one table and one nonclustered index.
Step 2: We ran Query 1 which does not use nonclustered index.
Step 3: We created clustered index over table.
Step 4: We ran Query 1 which now use nonclustered index.
The question is why this has happened? If Query can use
nonclustered index why did it has to wait for clustered index to be created?
The
reason for this is that every nonclustered index refers to clustered index
internally. When clustered index is created on table it reorganizes the table
in the physical order of the clustered index. When there is no clustered index
created on table at that time all nonclustered index points to data in the
table to retrieve the data, however once clustered index is created all the
nonclustered indexes are reorganized and they point to clustered index. This
effect is creating index seek operation on nonclustered index. In our example column
on which clustered index is created is in SELECT clause and WHERE clause
contains columns which are used in nonclustered index, which is creating the
effect which we have observed.
Create Index on frequently used columns in T-SQL Code. Columns
used in WHERE, ORDER BY and GROUP BY are good candidate for Indexes. Create
Index on columns which are used in JOIN Condition.
CREATE NONCLUSTERED
INDEX idx_EID ON
EMP(ID)
CREATE CLUSTERED
INDEX idx_DID ON
EMP(DID)
Remove any un-necessary Indexes. As Index occupies hard
drive space as well as it decreases performance of all the insert, updates,
deletes to the table.
--
Unused Index Script
--
Original Author: Pinal Dave (C) 2011
SELECT TOP 25
o.name AS ObjectName
, i.name AS IndexName
, i.index_id
AS IndexID
, dm_ius.user_seeks
AS UserSeek
, dm_ius.user_scans
AS UserScans
, dm_ius.user_lookups
AS UserLookups
, dm_ius.user_updates
AS UserUpdates
, p.TableRows
, 'DROP INDEX '
+ QUOTENAME(i.name)
+ ' ON ' + QUOTENAME(s.name) + '.' + QUOTENAME(OBJECT_NAME(dm_ius.OBJECT_ID)) AS 'drop statement'
FROM sys.dm_db_index_usage_stats
dm_ius
INNER JOIN sys.indexes i ON i.index_id = dm_ius.index_id AND dm_ius.OBJECT_ID = i.OBJECT_ID
INNER JOIN sys.objects o ON dm_ius.OBJECT_ID = o.OBJECT_ID
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
INNER JOIN (SELECT SUM(p.rows) TableRows, p.index_id, p.OBJECT_ID
FROM sys.partitions p GROUP BY p.index_id, p.OBJECT_ID) p
ON p.index_id
= dm_ius.index_id
AND dm_ius.OBJECT_ID = p.OBJECT_ID
WHERE OBJECTPROPERTY(dm_ius.OBJECT_ID,'IsUserTable') = 1
AND dm_ius.database_id
= DB_ID()
AND i.type_desc
= 'nonclustered'
AND i.is_primary_key
= 0
AND i.is_unique_constraint
= 0
ORDER BY (dm_ius.user_seeks + dm_ius.user_scans + dm_ius.user_lookups) ASC
SELECT object_name(i.object_id) as tableName, i.name as indexName
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats s ON
i.object_id = s.object_id AND i.index_id = s.index_id AND s.database_id = db_id()
WHERE objectproperty(i.object_id,'IsUserTable') = 1 and i.index_id> 0
AND s.object_id IS NULL
AND i.is_Primary_Key
= 0
AND i.is_unique_constraint
= 0
AND i.is_unique
= 0
DECLARE @dbid INT
SELECT @dbid =
DB_ID(DB_NAME())
SELECT OBJECTNAME =
OBJECT_NAME(I.OBJECT_ID),
INDEXNAME = I.NAME,
I.INDEX_ID
FROM SYS.INDEXES I
JOIN SYS.OBJECTS O
ON I.OBJECT_ID = O.OBJECT_ID
WHERE OBJECTPROPERTY(O.OBJECT_ID,'IsUserTable') = 1
AND I.INDEX_ID
NOT IN (
SELECT S.INDEX_ID
FROM SYS.DM_DB_INDEX_USAGE_STATS
S
WHERE S.OBJECT_ID = I.OBJECT_ID
AND I.INDEX_ID
= S.INDEX_ID
AND DATABASE_ID =
@dbid)
AND I.INDEX_ID
> 0
ORDER BY
OBJECTNAME,
I.INDEX_ID,INDEXNAME ASC
Smaller Index Key gives better performance than Index
key which covers large data or many columns
Index on Integer Columns performs better than varchar
columns.
Clustered Index must exist before creating Non-Clustered
Index.
Clustered Index must be created on Single Column which
is not changing and narrow in size. Best candidate is primary key.
Non-clustered Indexes increases performance of the query
that returns fewer rows and rows has wide selectivity spectrum.
Each table must have one Clustered Index.
If column have low selectivity avoid creating Index on
that column as it slow down the rows modification and system will not get
benefit from Index.
Multiple Columns Index or Covered Index should be
ordered as Most Selective column on left and gradually decreasing selectivity
as they go right.
Use SORT_IN_TEMPDB option when table is created if tempdb is on
different disk. This will increase the performance to create Index.
SORT_IN_TEMPDB Option For Indexes:
The SORT_IN_TEMPDB option directs the index creation processing to the tempdb database, instead of the database in use.
Create or rebuild an index, by setting the SORT_IN_TEMPDB option to ON you can direct the SQL Server Database Engine to use tempdb to store the intermediate sort results that are used to build the index.
Note: Make sure a dedicated disk is allotted to the tempdb database and it has sufficient disk space. When using the SORT_IN_TEMPDB option, DBA’s usually calculate the disk space requirement while doing capacity planning.
CREATE NONCLUSTERED INDEX AddIndexNmHere ON TableNm (ColumnName) WITH (SORT_IN_TEMPDB = ON)
The Database Engine first scans the data pages of the base table to retrieve key values and builds an index leaf row for each data row.
When SORT_IN_TEMPDB is set to OFF, the
default, the sort runs are stored in the destination filegroup.
If a sort operation is not required or if
the sort can be performed in memory, the SORT_IN_TEMPDB option is ignored.
If SORT_IN_TEMPDB is set to ON, there must be sufficient free
space in tempdb to
store the sort runs, and sufficient free space in the destination filegroup to
store the final index structure. The sort runs contain the leaf rows of the
index.
If SORT_IN_TEMPDB is set to OFF, the free space in the
destination filegroup must be large enough to store the final index structure.
The continuity of the index extends may be improved if more free space is
available.
Rebuild Index frequently using ALTER INDEX and
De-fragment Index to keep performance optimal for Indexes.
Filtered Index
Filtered Index is a new feature in SQL SERVER 2008. Filtered Index is used to
index a portion of rows in a table that means it applies filter on INDEX which
improves query performance, reduce index maintenance costs, and reduce index
storage costs compared with full-table indexes.
When
we see an Index created with some WHERE clause then that is actually a FILTERED
INDEX.
Points to remember when creating Filtered Index:
·
They
can be created only as Nonclustered Index.
·
They
can be used on Views only if they are persisted views.
·
They
cannot be created on full-text Indexes.
·
Update Query execute more faster if where clause have filtered
index
CREATE NONCLUSTERED
INDEX idx_NAME ON
COUNTRY (NAME) WHERE COUNTRY_ID='INDIA'
-- Unused Index Script
-- Original Author: Pinal Dave (C) 2011
SELECT TOP 25
o.name AS ObjectName , i.name AS IndexName, i.index_id AS IndexID,
dm_ius.user_seeks AS UserSeek,dm_ius.user_scans AS UserScans, dm_ius.user_lookups AS UserLookups,
dm_ius.user_updates AS UserUpdates, p.TableRows, 'DROP INDEX ' + QUOTENAME(i.name)
+ ' ON ' + QUOTENAME(s.name) + '.' + QUOTENAME(OBJECT_NAME(dm_ius.OBJECT_ID)) AS 'drop statement'
FROM sys.dm_db_index_usage_stats dm_ius
INNER JOIN sys.indexes i ON i.index_id = dm_ius.index_id AND dm_ius.OBJECT_ID = i.OBJECT_ID
INNER JOIN sys.objects o ON dm_ius.OBJECT_ID = o.OBJECT_ID
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
INNER JOIN (SELECT SUM(p.rows) TableRows, p.index_id, p.OBJECT_ID
FROM sys.partitions p GROUP BY p.index_id, p.OBJECT_ID) p
ON p.index_id = dm_ius.index_id AND dm_ius.OBJECT_ID = p.OBJECT_ID
WHERE OBJECTPROPERTY(dm_ius.OBJECT_ID,'IsUserTable') = 1
AND dm_ius.database_id = DB_ID()
AND i.type_desc = 'nonclustered'
AND i.is_primary_key = 0
AND i.is_unique_constraint
= 0
ORDER BY (dm_ius.user_seeks + dm_ius.user_scans + dm_ius.user_lookups) ASC
GO
Index Types
In addition to an
index being clustered or nonclustered, it can be configured in other ways:
- Composite index: An index that contains more than one column. In both SQL Server 2005 and 2008, you can include up to 16 columns in an index, as long as the index doesn’t exceed the 900-byte limit. Both clustered and nonclustered indexes can be composite indexes.
- Unique Index: An index that ensures the uniqueness of each value in the indexed column. If the index is a composite, the uniqueness is enforced across the columns as a whole, not on the individual columns. For example, if you were to create an index on the FirstName and LastName columns in a table, the names together must be unique, but the individual names can be duplicated.
A
unique index is automatically created when you define a primary key or unique
constraint:
- Primary key: When you define a primary key constraint on one or more columns, SQL Server automatically creates a unique, clustered index if a clustered index does not already exist on the table or view. However, you can override the default behavior and define a unique, nonclustered index on the primary key.
- Unique: When you define a unique constraint, SQL Server automatically creates a unique, nonclustered index. You can specify that a unique clustered index be created if a clustered index does not already exist on the table.
- Covering index: A type of index that includes all the columns that are needed to process a particular query. For example, your query might retrieve the FirstName and LastName columns from a table, based on a value in the ContactID column. You can create a covering index that includes all three columns.
Create Indexes with Included Columns
Describes: Add included (or nonkey) columns to extend the
functionality of nonclustered indexes. By including nonkey columns, you can
create nonclustered indexes that cover more queries.
This is because the nonkey columns have the following benefits:
They can be data types not allowed as index
key columns.
They
are not considered by the Database Engine when calculating the number of index
key columns or index key size.
Limitations and Restrictions
Nonkey columns can
only be defined on nonclustered indexes.
All data types except text,
ntext, and image can be used as nonkey columns.
Nonkey columns cannot be dropped from a
table unless that table’s index is dropped first.
Nonkey columns cannot be changed, except to
do the following:
Change the nullability of the column from
NOT NULL to NULL.
Increase the length of varchar,
nvarchar, or varbinary columns.
Permissions
Requires ALTER permission on the table or
view. User must be a member of the sysadmin fixed server role or
the db_ddladmin and db_owner fixed database roles.
-- Creates a nonclustered
index on the Person.Address table with four included (nonkey) columns.
-- index key column is
PostalCode and the nonkey columns are
-- AddressLine1,
AddressLine2, City, and StateProvinceID.
CREATE NONCLUSTERED
INDEX IX_Address_PostalCode
ON Person.Address (PostalCode)
INCLUDE (AddressLine1,
AddressLine2, City,
StateProvinceID);
Columns appearing in the WHERE, FROM, GROUP BY and ORDER BY should appear in the "main part" of the index.
Columns not otherwise included that appear in the SELECT or HAVING go in the UNCLUDE clause.
Technically - since the "main part" is what the B-tree is based on - by keeping the main part as skinny as you can, you get better perf, while avoiding the bookmark lookup with the INCLUDE stuff.
SELECT OBJECT_NAME(ind.OBJECT_ID) AS TableName,
ind.name
AS IndexName,
indexstats.index_type_desc AS IndexType,
indexstats.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) indexstats
INNER JOIN sys.indexes ind
ON ind.object_id =
indexstats.object_id
AND ind.index_id
= indexstats.index_id
WHERE indexstats.avg_fragmentation_in_percent
> 30
ORDER BY
indexstats.avg_fragmentation_in_percent DESCFragmentation
Storing data non-contiguously on disk is known as fragmentation.
Internal
Fragmentation:
When records are stored non-contiguously inside the page, then
it is called internal fragmentation.
In other words, internal fragmentation is said to occur if there
is unused space between records in a page.
This fragmentation occurs through the process of data
modifications (INSERT, UPDATE, and DELETE statements) that are made against the
table and therefore, to the indexes defined on the table.
This unused space causes poor cache utilization and more I/O,
which ultimately leads to poor query performance.
External
Fragmentation:
When on disk, the physical storage of pages and extents is not
contiguous.
When the extents of a table are not physically stored
contiguously on disk, switching from one extent to another causes higher disk
rotations, and this is called Extent Fragmentation.
How to detect
Fragmentation: We can get both types of fragmentation using the DMV: sys.dm_db_index_physical_stats.
SELECT OBJECT_NAME(OBJECT_ID), index_id,index_type_desc,index_level,
avg_fragmentation_in_percent,avg_page_space_used_in_percent,page_count FROM sys.dm_db_index_physical_stats
(DB_ID(N'Adhoc_48hrsRpt'), NULL, NULL, NULL , 'SAMPLED')
ORDER BY
avg_fragmentation_in_percent DESC
Along with other information, there are two important columns
that for detecting fragmentation, which are as follows:
- avg_fragmentation_in_percent: This is a percentage value that represents external fragmentation. For a clustered table and leaf level of index pages, this is Logical fragmentation, while for heap, this is Extent fragmentation. The lower this value, the better it is. If this value is higher than 10%, some corrective action should be taken.
- avg_page_space_used_in_percent: This is an average percentage use of pages that represents to internal fragmentation. Higher the value, the better it is. If this value is lower than 75%, some corrective action should be taken.
Reducing fragmentation:
- Reducing Fragmentation in a Heap: To reduce the fragmentation of a heap, create a clustered index on the table. Creating the clustered index, rearrange the records in an order, and then place the pages contiguously on disk.
- Reducing Fragmentation in an Index: There are three choices for reducing fragmentation, and we can choose one according to the percentage of fragmentation:
- If avg_fragmentation_in_percent > 5% and < 30%, then use ALTER INDEX REORGANIZE: This statement is replacement for DBCC INDEXDEFRAG to reorder the leaf level pages of the index in a logical order. As this is an online operation, the index is available while the statement is running.
- If avg_fragmentation_in_percent > 30%, then use ALTER INDEX REBUILD: This is replacement for DBCC DBREINDEX to rebuild the index online or offline. In such case, we can also use the drop and re-create index method.
- (Update: Please note this option is strongly NOT recommended)Drop and re-create the clustered index: Re-creating a clustered index redistributes the data and results in full data pages. The level of fullness can be configured by using the FILLFACTOR option in CREATE INDEX.
-- Index fragmentation in a
database - QUICK SYNTAX
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID('user_DB'),
NULL,NULL,NULL,NULL);
USE user_DB;
-- SQL Server 2005/2008
Index Fragmentation Report
-- SQL Server DMV - Dynamic
Management View
SELECT schema_name(t.schema_id) AS [Schema],
object_name(ps.object_id) AS [Table],
i.name AS [Index],
ps.Index_type_desc AS IndexType,
convert(TINYINT,ps.avg_fragmentation_in_percent) AS [AvgFrag%],
convert(TINYINT,ps.avg_page_space_used_in_percent) AS [AvgSpaceUsed%],
ps.record_count AS RecordCnt,
ps.fragment_count AS FragmentCnt
FROM
sys.dm_db_index_physical_stats(db_id(db_name()),
NULL,NULL,NULL,'DETAILED') ps -- Faster option:
SAMPLED
INNER JOIN
sys.indexes i
ON ps.object_id = i.object_id
AND ps.index_id
= i.index_id
INNER JOIN
sys.tables t
ON ps.object_id = t.object_id
WHERE
t.is_ms_shipped =
0
ORDER BY
[Schema],
[Table],
[Index]
-- SQL Server 2005 and on
fragmentation queries based on sys.dm_db_index_physical_stats
USE user_DB;
GO
-- DETAILED fragmentation
info on table
SELECT DB=DB_NAME(database_id), [Object]=OBJECT_NAME (OBJECT_ID),
* FROM sys.dm_db_index_physical_stats
(DB_ID(N'AdventureWorks2008'),
OBJECT_ID(N'Sales.SalesOrderHeaderSalesReason'), NULL, NULL , 'DETAILED');
GO
-- Fragmentation info on
table
SELECT DB=DB_NAME(database_id), [Object]=OBJECT_NAME (OBJECT_ID),
* FROM sys.dm_db_index_physical_stats
(DB_ID(N'AdventureWorks2008'),
OBJECT_ID(N'Sales.SalesOrderHeaderSalesReason'), NULL, NULL , NULL);
GO
-- All fragmentation info
SELECT DB=DB_NAME(database_id), [Object]=OBJECT_NAME (OBJECT_ID),
* FROM sys.dm_db_index_physical_stats
(DB_ID(N'AdventureWorks2008'),
NULL, NULL,
NULL , NULL);
GO
-- SQL Server 2000/8 index
fragmentation for all indexes of a table
use user_DB
dbcc showcontig('dbo.drop_ADH_TRACE_OCT_16')
with all_indexes
------------
-- SQL Server 2000/8 version
- DBCC SHOWCONTIG for index fragmentation
------------
USE user_DB;
SELECT
'DBCC SHOWCONTIG (' + convert(VARCHAR,o.id) + ',' +
convert(VARCHAR,i.indid) + ') -- ' + object_name(o.id) + -- table name
'.' + i.name -- index name
FROM
sysobjects o
JOIN sysindexes
i
ON (o.id = i.id)
WHERE
o.xtype =
'U'
AND i.rows > 0
ORDER BY object_name(o.id),
i.indid
GO
Reference
defragmentation explained
ReplyDelete