Quite often the query planner decides that a sequential read from disk is "cheaper" for the system than walking the index and doing random reads.
Without using the index, the DB can sweep the records (even out of order), keeping only the ones that matter (and, if needed, sorting them for the output):
SELECT * FROM x WHERE n BETWEEN 3 AND 10
0 > 1 > 8 > 5 > 128 > 2 > 882 > 9 > 7 > 11 > 92 > 6 > 4 > 10 > 3 ...
        |   |                   |   |             |   |   |    |
Result (sorted): 3 4 5 6 7 8 9 10 ...
In the case above, the query "looks" at the records one by one, compares each against the range, and accumulates the ones that matter.
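If you want to check which path the planner actually chose, MySQL's EXPLAIN shows it. Below is a minimal sketch, assuming a hypothetical table x with an integer column n and an index idx_n on it (the column and index names are illustrative assumptions, not something defined above):

-- Hypothetical table, only for illustration
CREATE TABLE x (
    id INT PRIMARY KEY AUTO_INCREMENT,
    n  INT,
    KEY idx_n (n)
);

-- Without a LIMIT, the plan may come back with type = ALL,
-- i.e. the sequential full-table scan described above,
-- even though the index idx_n exists.
EXPLAIN SELECT * FROM x WHERE n BETWEEN 3 AND 10;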
When you use a low limit (for example, 3 or 4), it pays to use the index, despite the lookup overhead and the random reads:
SELECT * FROM x WHERE n BETWEEN 3 AND 10 LIMIT 4
INDEX:  0, 1, 2, 3 (start), 4, 5, 6 (stop), 7, 8, 9, 10, 11, 92 ...
LOOKUP:         ->         -> <- ->
DB:     0 > 1 > 8 > 5 > 128 > 2 > 882 > 9 > 7 > 11 > 92 > 6 > 4 > 10 > 3 ...
In the case above, since only a few rows are needed, it is worth starting with the index: take the first entry, find where that row sits on disk, read that piece of the data file (note that if the query needed only the value of the indexed column, it would not even have to touch the data file), move to the next index entry, locate its row in the data file, read it, go back to the index again for the next one, read that row ... and so on.
Note that in the first case there is none of this "back and forth" through the index, because with that many rows it does not pay to read a thousand separate pieces of the data file, since reading them sequentially is usually faster.
In the second case, little data is involved, so the "back and forth" pays off: even with random reads, far less data is read than a full sweep of the DB would require.
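The same check works for the limited query. Still assuming the hypothetical table and index from the sketch above, a small LIMIT makes the planner more likely to pick the index:

-- With a low LIMIT, the plan tends to show type = range and key = idx_n:
-- walk the index between 3 and 10 and fetch only the few matching rows.
EXPLAIN SELECT * FROM x WHERE n BETWEEN 3 AND 10 LIMIT 4;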
I'm assuming the most common databases, running on ordinary HDs. There are DBs specialized in in-memory access or in SSDs. In fact, the query planner makes these decisions based on a set of statistics, which may not give the best result in every situation, but in most cases the gain justifies the approach. What matters is understanding that someone deliberately programmed it to work this way, and in all likelihood ran an absurd amount of tests to fine-tune it.
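Those statistics are kept per table and can get out of date; in MySQL you can refresh them yourself. A small sketch, again with the hypothetical table x:

-- Recomputes and stores the key distribution that the planner
-- consults when choosing between index access and a full scan.
ANALYZE TABLE x;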
From the MySQL manual
See, for example, these links:
link (en)
link (en)
This passage in particular is worth highlighting:
You are comparing indexed columns with constant values and MySQL has calculated (based on the index tree) that the constants cover too large a part of the table and that a table scan would be faster.
And this:
You are using a key with low cardinality (many rows match the key value) through another column. In this case, MySQL assumes that by using the key it will probably do many key lookups and that a table scan would be faster.
Although these are not the exact LIMIT scenario, the same logic applies.
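If you suspect the planner chose badly for a specific query, you can compare both plans yourself. A sketch using the hypothetical index idx_n from earlier; FORCE INDEX only pushes the planner toward that index, it does not guarantee the query becomes faster:

-- Plan chosen by the planner on its own
EXPLAIN SELECT * FROM x WHERE n BETWEEN 3 AND 10;

-- Same query forced through the index, so the two plans
-- (and their actual run times) can be compared side by side.
EXPLAIN SELECT * FROM x FORCE INDEX (idx_n) WHERE n BETWEEN 3 AND 10;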