For this particular query you will use the index if there is a suitable index for it. This index should include the columns status
, account
and data
. Preferably in this order. There is no reason why MySQL does not use the index automatically. Anyway to make sure create and see what happens. Not always what we deduce is correct.
So the index will sort status
incrementally, so it can find fast rows with 1
. Then it can quickly find the rows that have the value 12
to account
, among the rows already selected that satisfy the previous condition . And finally, it will select all dates above the value indicated in query , again between the lines already selected previously in the previous conditions. Obviously this selection will not consider other data ranges that do not contain the previous values.
Note that there must be a single selection range to fully index the query . Therefore the order can be important to determine whether or not to use the index. So I said that in this specific query the index with the indicated composition would be used. Another query may not be able to use the index or may partially use it to select a range through it and the rest of the selection has to be sweeping the database line by line between the lines already selected in the index.
Normally the key is to have indexes that meet the main queries and avoid queries that can not benefit from indexes.
The indexes work well when you can mount sub-tracks with them. In a very complex query , which makes it difficult to get these sub-strips, there is no index that solves it. Then the ideal is to try to at least greatly reduce the lines that were obtained through the index and leave little to the final selection.
To try to ease understanding, think about how this search works. When you have something sorted according to what you want to search it is easy to find through a binary search that has a logarithmic complexity O (logN), that is, it divide and conquer . Even with billions of lines, in just over 30 steps he finds what you want. When there is no order, the way is to look one by one, linear complexity O (N). If the query looks at several columns it will try to use the first method and make a very quick selection. When it is no longer possible, just between the lines you have already managed to select fast it will apply the individually determined condition line by line. To understand a little more about Big Read this answer .
When the optimizer does not use an existing index, it is unable to figure out a way to "understand" its query properly for the index. This may be the fault of the way the index is mounted or it may be because the optimizer is not so good. Almost always the optimizer does its job well.
It may actually be that the index is not used even though it has an appropriate key. There are small data volumes that direct fetch in the data has a lower cost. The optimizer can identify that even though it can select sub-stripes, subsequent analysis can be so costly that dropping the query to the index can yield a better result. But in these cases the optimizer has relinquished its use in a "conscious" way for its benefit. You can not think bad. Although there are cases he misses.
But be careful not to create too many indexes . Updates to them can slow the software down.
What the AP did in editing the question is exactly what needs to be done. Try and see what actually happens. With a simple key using only the most relevant column it achieved a huge gain. It used the status
column. I do not know if it tried, but I could experiment only with the column account
, only with the column data
, and make the possible combinations between the three columns and see how each result is. As the gain seems to be fundamental, it's worth the little effort.