Reading about indexes I recently came across the term "selectivity". Some places said that more selective fields should come first in the index, others say no or that it depends.
I have the following table tbMovimentoConta
abbreviation ( mcon
) with 5 million records, and the following columns that will be used in where
- mcon_dataExclusion (99% of records have the value
null
) because the data can not be deleted withdelete from
- con_codigo_fk (there are 5 distinct values (1,2,3,4,5) divided to 5 million records)
- mcon_data (1825 different dates for the 5 million records)
I have the following query that is generated by the Entity Framework:
exec sp_executesql N'SELECT
[Project1].[C2] AS [C1],
[Project1].[mcon_codigo_pk] AS [mcon_codigo_pk],
[Project1].[mcon_data] AS [mcon_data],
[Project1].[mcon_obs] AS [mcon_obs],
[Project1].[mcon_valor] AS [mcon_valor],
[Project1].[mcon_tipo] AS [mcon_tipo]
FROM ( SELECT
convert (datetime2, convert(varchar(255), [Extent1].[mcon_data], 102) , 102) AS [C1],
[Extent1].[mcon_codigo_pk] AS [mcon_codigo_pk],
[Extent1].[mcon_data] AS [mcon_data],
[Extent1].[mcon_valor] AS [mcon_valor],
[Extent1].[mcon_tipo] AS [mcon_tipo],
[Extent1].[mcon_obs] AS [mcon_obs],
1 AS [C2]
FROM [dbo].[tbMovimentoConta] AS [Extent1]
WHERE ([Extent1].[mcon_data] >= @p__linq__0) AND ([Extent1].[mcon_data] <= @p__linq__1) AND ([Extent1].[con_codigo_fk] = @p__linq__2) AND ([Extent1].[mcon_dataExclusao] IS NULL)
) AS [Project1]
ORDER BY [Project1].[C1] ASC, [Project1].[mcon_codigo_pk] ASC
OFFSET 764225 ROWS FETCH NEXT 11 ROWS ONLY ',N'@p__linq__0 datetime2(7),@p__linq__1 datetime2(7),@p__linq__2 bigint',@p__linq__0='2012-01-01 00:00:00',@p__linq__1='2013-01-01 23:59:59',@p__linq__2=1
I tested with two index settings
Table of contents 1
CREATE NONCLUSTERED INDEX [ix_consulta_movimento] ON [dbo].[tbMovimentoConta]
(
[mcon_dataExclusao] ASC,
[con_codigo_fk] ASC,
[mcon_data] ASC
)
INCLUDE (
[mcon_dataInclusao],
[mcon_codigo_pk],
[mcon_valor],
[mcon_tipo],
[mcon_obs]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
Contents 2
CREATE NONCLUSTERED INDEX [ix_consulta_movimento] ON [dbo].[tbMovimentoConta]
(
[mcon_data] ASC,
[con_codigo_fk] ASC,
[mcon_dataExclusao] ASC
)
INCLUDE (
[mcon_dataInclusao],
[mcon_codigo_pk],
[mcon_valor],
[mcon_tipo],
[mcon_obs]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
In the tests I did, index 1 was faster, but I thought 2 would be faster. Could someone explain to me why index 1 is faster, or if there is something wrong with it, so 2 would have to be faster.
Editing
Doing more testing here, it looks like index 2 is even faster. Anyway I'm doubtful. I need to have a clear understanding of how to sort these columns in the index.