Filters or sorting: which should come first when creating a database index?

7

Given, for example, the following Query:

SELECT 
  ClienteId,
  Nome,
  DataNascimento,
  Cidade,
  Estado,
  DataCadastro
FROM
  Cliente
WHERE
  Estado = :Estado AND
  Cidade = :Cidade
ORDER BY
  DataCadastro, DataNascimento DESC

For the best index usage and performance, should my index put the ordering columns first and then the filter columns?

CREATE INDEX TESTE1 ON Cliente (DataCadastro ASC, DataNascimento DESC, Estado, Cidade)

Or the filter columns first, followed by the ordering columns?

CREATE INDEX TESTE2 ON Cliente (Estado ASC, Cidade ASC, DataCadastro ASC, DataNascimento DESC)
    
asked by anonymous 28.10.2014 / 12:16

2 answers

4

I treat index creation as something that requires trial and error.
However, there are some basic rules to follow:

1 - Indexes do not improve performance on small tables.
2 - Too many indexes can slow down INSERT, UPDATE and DELETE.
3 - Indexes should contain only a few columns.
4 - The columns used in WHERE conditions (including BETWEEN) or that participate in a JOIN should be placed first. The remaining columns should be ordered by how selective (distinct) their values are.

With this in mind, the answer to your question is:

CREATE INDEX TESTE2 ON Cliente (Estado ASC, Cidade ASC, DataCadastro ASC, DataNascimento DESC)
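One way to check this choice is to ask the engine for its plan. The sketch below is only an illustration: it uses an in-memory SQLite database as a stand-in for the real DBMS (an assumption, since the question does not name one), and the column types and the sample parameter values are made up. It creates TESTE2 and reports whether SQLite still needs a separate sort step for the ORDER BY.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for the real DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Cliente (
        ClienteId      INTEGER PRIMARY KEY,
        Nome           TEXT,
        DataNascimento TEXT,
        Cidade         TEXT,
        Estado         TEXT,
        DataCadastro   TEXT
    )
""")
# Filter columns first, then the ORDER BY columns in the same directions.
conn.execute("""
    CREATE INDEX TESTE2
    ON Cliente (Estado ASC, Cidade ASC, DataCadastro ASC, DataNascimento DESC)
""")

query = """
    SELECT ClienteId, Nome, DataNascimento, Cidade, Estado, DataCadastro
    FROM Cliente
    WHERE Estado = :Estado AND Cidade = :Cidade
    ORDER BY DataCadastro, DataNascimento DESC
"""
params = {"Estado": "SP", "Cidade": "Campinas"}  # sample values, assumed

# The last field of each plan row holds the human-readable description.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, params)]
for step in plan:
    print(step)

# SQLite reports a "USE TEMP B-TREE FOR ORDER BY" step when it has to sort.
needs_sort = any("TEMP B-TREE" in step for step in plan)
print("separate sort step:", needs_sort)
```

Other engines expose the same information through their own tools (execution plans, EXPLAIN, and so on), and their planners may decide differently, so the result should be confirmed on the actual DBMS with realistic data.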

However, if the ORDER BY DataCadastro, DataNascimento DESC clause is used very often and matters more than access by primary key, you should consider creating a CLUSTERED INDEX on these columns.

In this case two indexes would be created:

CREATE INDEX index1 ON Cliente (Estado ASC, Cidade ASC)
CREATE CLUSTERED INDEX index2 ON Cliente (DataCadastro ASC, DataNascimento DESC)

The data would then be stored sorted by DataCadastro ASC, DataNascimento DESC instead of by primary key.

UPDATE
The exchange of comments between jean and me led me to research the subject further.

Using a CLUSTERED INDEX on a field that is not "always growing" may result in slower INSERTs. However, a sort generally costs more than an INSERT, so the cost/benefit ratio should be evaluated.

In the case of the question, we can expect the index to be "always growing" if DataCadastro is of type DateTime and is filled in with the moment of registration.

If you want to become an expert in creating indexes, follow this link. See also this explanation of how a CLUSTERED INDEX works.

    
28.10.2014 / 15:34
2

The order of creation of the indexes does not change their use.

As for having a single index covering multiple columns, I suggest you create two instead, one for the filters and one for the sorting, since the DBMS engine applies filters and sorts in distinct phases of processing.

Which indexes to create...

It depends on how you will use the table: its size, how often you read the columns and run queries, how many inserts you perform, and so on.

I advise you to proceed by trial and error and to study the tools your DBMS offers, such as profilers, wizards, query plans, etc.

Too many indexes get in the way, and so do the wrong indexes. Read a lot, because this subject is too complex to fit into one answer. If you want a more detailed analysis, create a more specific question, but then you will need to post more details, such as the object creation scripts, the most frequent queries, their query plans, etc.
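The trial-and-error loop can be partly automated by creating each candidate index on a scratch copy of the table and printing the resulting plan. The sketch below assumes SQLite as a stand-in engine; `idx_filtro` is a hypothetical filter-only index, TESTE1 comes from the question, and the column types and parameter values are invented for illustration.

```python
import sqlite3

TABLE_DDL = """
    CREATE TABLE Cliente (
        ClienteId INTEGER PRIMARY KEY, Nome TEXT, DataNascimento TEXT,
        Cidade TEXT, Estado TEXT, DataCadastro TEXT
    )
"""
QUERY = """
    SELECT ClienteId, Nome, DataNascimento, Cidade, Estado, DataCadastro
    FROM Cliente
    WHERE Estado = :Estado AND Cidade = :Cidade
    ORDER BY DataCadastro, DataNascimento DESC
"""
PARAMS = {"Estado": "SP", "Cidade": "Campinas"}  # sample values, assumed

# Candidate indexes to try, one at a time.
CANDIDATES = {
    "idx_filtro": "CREATE INDEX idx_filtro ON Cliente (Estado, Cidade)",
    "TESTE1": ("CREATE INDEX TESTE1 ON Cliente "
               "(DataCadastro ASC, DataNascimento DESC, Estado, Cidade)"),
}

def plan_for(create_index_sql):
    """Build a fresh scratch database with one index and return the plan steps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(TABLE_DDL)
    conn.execute(create_index_sql)
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, PARAMS).fetchall()
    conn.close()
    # The last field of each plan row is the human-readable description.
    return [row[-1] for row in rows]

plans = {name: plan_for(ddl) for name, ddl in CANDIDATES.items()}
for name, steps in plans.items():
    print(name)
    for step in steps:
        print("   ", step)
```

In SQLite, a plan that contains a `USE TEMP B-TREE FOR ORDER BY` step means the rows are sorted after being fetched. Plans on an empty table are only a first hint; the same comparison should be repeated on the real DBMS with representative data and statistics.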

As for the clustered indexes mentioned by @ramaral, it is bad practice to use them with "random" fields.

* random: by this I mean non-sequential values, such as a birth date or a GUID. The problem is that, since the table is physically sorted by these column(s), inserting a new value in the middle forces part of the table to be reorganized (causing page splits, etc.). This creates considerable overhead, and a good portion of the table may be locked during the insert operation (because of the page splits). So only create clustered indexes on sequential fields (note: PKs are usually clustered indexes).
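The effect of sequential versus "random" keys can be illustrated with a toy model. The sketch below is not how any particular DBMS stores its pages; it assumes fixed-capacity sorted pages, treats an insert at the very end of the table as a cheap append onto a new page, and counts a split whenever a key has to go into a page that is already full. The function name and the page capacity are made up for the example.

```python
import bisect
import random

def count_page_splits(keys, page_capacity=8):
    """Insert keys into sorted fixed-size pages and count mid-table page splits."""
    pages = [[]]  # each page is a sorted list of keys
    splits = 0
    for key in keys:
        # Find the page whose key range should hold the new key.
        first_keys = [page[0] for page in pages[1:]]
        idx = bisect.bisect_right(first_keys, key)
        page = pages[idx]
        # True when the key goes after everything already stored.
        appending = idx == len(pages) - 1 and (not page or key >= page[-1])
        if len(page) < page_capacity:
            bisect.insort(page, key)
        elif appending:
            # The table grows at the end: start a new page, nothing is moved.
            pages.append([key])
        else:
            # The page is full and the key belongs inside it: split it in two.
            bisect.insort(page, key)
            mid = len(page) // 2
            pages[idx:idx + 1] = [page[:mid], page[mid:]]
            splits += 1
    return splits

n = 1000
sequential = list(range(n))          # e.g. an identity column or a timestamp
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)   # e.g. GUIDs or birth dates

print("sequential keys:", count_page_splits(sequential), "splits")
print("random keys:    ", count_page_splits(shuffled), "splits")
```

In this model the sequential keys never trigger a split, because every insert lands at the end of the last page, while the shuffled keys repeatedly land in pages that are already full.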

    
28.10.2014 / 16:39