Filtered Index in SQL Server 2008 for Quey optimization
SQL Server 2008 introduces an enhancement to the way the indexes are created which helps to get better performance from the same queries. This new is ‘Filtered Index’. This is actually an Optimized Non-Clustered index which is well suited to cover the subset of data with in a table.
Every organization has a specific data usage pattern and if we have an idea of the data usage pattern then we can create the filtered indexes to cover the subsets of data that can actually yield faster results from our same query / stored proc.
Let’s say the sample database “AdventureWorks” is used by a company that sells spare parts and they have different departments that sell specific products and lets say the racing bikes being the hottest in the market, it drives more sales for the racing bike spare parts. This means that more transactions are being done related to a specific product(s) so most of the times the data is filtered, is based on a specific Product Id. Moreover if there are millions of Rows in the table, it becomes really necessary to choose the indexes correctly and carefully because of the storage space involved to store them.
For example the queries (like the following one) are executed based on a selection criteria very frequently then the filtered indexes will help you a lot.
SELECT * FROM [AdventureWorks].[Purchasing].[PurchaseOrderDetail]
Where ProductID =709 OR ProductID = 879 OR ProductID = 877
Let see generally how indexes speed up the queries. During the SELECT, UPDATE, DELETE operations the SQL optimizer will look for indexes and determine the most efficient method to complete an operation. In this process SQL Optimizer determines to use an index or to go for a table scan. The SQL Optimizer does this operation to reduce the query execution time and improve the performance.
Index seeks are preferred for the selective queries like the one above and the index seeks happen only when the optimizer finds the index on the table useful, otherwise it will scan all the records. Generally, searching an index to spot a matching record is faster than a table scan because it is ordered and less number of columns means small set of data which in turn means less hops to find the match. Unique indexes are faster and table partitions speeds it up more. Well ….discussing the anatomy of indexes and strategy for indexing the tables is a huge subject and there are books available for that so …let me get back to the topic where I left it.
So why a filtered index ? A filtered index can be used as an index for a subset of data rather than the entire set of data. The filtered index takes less space to store the index because it stores only index information related to a subset of data unlike an usual non-clustered index. In no way this is a substitute for a regular Clustered and Non-Clustered indexes. This is an additional enhancement that can help you in tuning specific highly selective queries.
What are the advantages of Filtered Indexes :
Improved query performance and plan quality
A well-designed filtered index improves query performance and execution plan quality because it is smaller than a full-table non-clustered index and has filtered statistics. The filtered statistics are more accurate than full-table statistics because they cover only the rows in the filtered index.
Reduced index maintenance costs
An index is maintained only when data manipulation language (DML) statements affect the data in the index. A filtered index reduces index maintenance costs compared with a full-table non-clustered index because it is smaller and is only maintained when the data in the index is affected. It is possible to have a large number of filtered indexes, especially when they contain data that is affected infrequently. Similarly, if a filtered index contains only the frequently affected data, the smaller size of the index reduces the cost of updating the statistics.
Reduced index storage costs
Creating a filtered index can reduce disk storage for non-clustered indexes when a full-table index is not necessary. You can replace a full-table non-clustered index with multiple filtered indexes without significantly increasing the storage requirements.
You can also determine the space required for an index. Please visit : http://msdn.microsoft.com/en-us/library/ms191163.aspx
How to create a Filtered Index ?
Let’s see an example :
--Drop if the index already exists
IF EXISTS (SELECT name FROM sys.indexes
WHERE name = N'FNC_PurchaseOrderDetail_Bikes'
AND object_id = OBJECT_ID ('Purchasing.PurchaseOrderDetail'))
DROP INDEX FNC_PurchaseOrderDetail_Bikes
--Create a new index
CREATE NONCLUSTERED INDEX FNC_PurchaseOrderDetail_Bikes
WHERE ProductID IN(709, 879, 877)
So simple .. just add a WHERE clause to the index. You need to choose the WHERE clause carefully.
When to Use Filtered Indexes
Filtered indexes are useful when columns contain well-defined subsets of data that queries reference in SELECT statements. Examples are:
· Sparse columns that contain only a few non-NULL values.
· Heterogeneous columns that contain categories of data.
· Columns that contain ranges of values such as dollar amounts, time, and dates.
· Table partitions that are defined by simple comparison logic for column values.
Reduced maintenance costs for filtered indexes are most noticeable when the number of rows in the index is small compared with a full-table index. If the filtered index includes most of the rows in the table, it could cost more to maintain than a full-table index. In this case, you should use a full-table index instead of a filtered index.
Filtered indexes are defined on one table and only support simple comparison operators. If you need a filter expression that references multiple tables or has complex logic, you should create a view.