What is an index for? I know it improves performance, but what does the database do behind the scenes to achieve that? When is it recommended to use one? And where should I use an index?
Editing the answer to include a slight metaphor.
TL;DR: Imagine that the database is a postman who has just arrived in a city he does not know, and that the data are the recipients of the letters he has to deliver. A simpler index would be like the postal code system (in Brazil, CEP), while more robust indexes would be like complete addresses. Unlike a real-life postman, our SQL Server postman can find his target without the address; he will just need more time, because in that case he checks every residence one by one.
More elaborate answer:
Each record in a database system has an address in storage (that is, on the hard disk or SSD).
When you run a query that has search conditions (WHERE), by default SQL Server will load all the records into memory. It then analyzes them one by one, checking each against the search conditions, to decide whether the record is part of the result or not.
As a table can be several times larger than the available RAM, this can be very slow.
This is where indexing comes in. Basically, you choose one or more columns to be indexed. Then, two things can / will happen:
Imagine that you have a table with one million records and an auto-incrementing integer primary key. The index could stay in RAM and contain the ID and address of one in every 100 records. Now let's run a query with something like WHERE ID = X, for any X.
If you run the query without an index on ID, the database will bring up to one million records into memory and check one by one which has the desired ID.
If you run the query with an index on ID, the database will read the index first. Two things can happen:
The database finds the ID X right in the index. It then has the address of the record on the disk / SSD, and goes straight to it without wasting any time.
The database does not find the ID in the index. In this case, it goes to the nearest ID in the index and navigates forward or backward (as appropriate) until it finds the record. This part of the search looks like the normal search the database would do without the index, but since the database already knows the approximate address, the search is much faster.
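The two cases above can be sketched in a few lines of Python. This is only an illustration of the idea, not how SQL Server is actually implemented: the table is scaled down to 100,000 records to keep it light, the list position stands in for the disk address, the names (`TABLE`, `full_scan`, `indexed_lookup`) are made up for the example, and the lookup always scans forward from the nearest lower index entry instead of choosing between forward and backward.

```python
from bisect import bisect_right

# Hypothetical table, stored in ID order. The list position plays the
# role of the record's address on disk.
TABLE = [(record_id, f"row-{record_id}") for record_id in range(1, 100_001)]

# Sparse index: the ID and address of one in every 100 records.
STEP = 100
INDEX_ADDRESSES = list(range(0, len(TABLE), STEP))
INDEX_IDS = [TABLE[address][0] for address in INDEX_ADDRESSES]


def full_scan(target_id):
    """Check records one by one; return (record, number of records read)."""
    for reads, record in enumerate(TABLE, start=1):
        if record[0] == target_id:
            return record, reads
    return None, len(TABLE)


def indexed_lookup(target_id):
    """Jump to the nearest indexed ID at or below the target, then scan forward."""
    slot = bisect_right(INDEX_IDS, target_id) - 1
    if slot < 0:
        return None, 0  # target is smaller than every ID in the table
    start = INDEX_ADDRESSES[slot]
    reads = 0
    for address in range(start, min(start + STEP, len(TABLE))):
        reads += 1
        if TABLE[address][0] == target_id:
            return TABLE[address], reads
    return None, reads


if __name__ == "__main__":
    print(full_scan(76_543)[1])       # records read by the full scan
    print(indexed_lookup(76_543)[1])  # records read after the index jump
```

With this layout the indexed lookup never reads more than 100 records, whatever the size of the table, while the full scan grows with the table.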
How much speed you gain varies from case to case. For a school project database whose tables have few records, it may not make much difference. But for large databases with huge volumes of data (telephone directories, for example), the difference is enormous. I cannot give a precise number, but on the larger databases I have worked with, indexes brought query times down from hours to milliseconds.
Another important issue: the difference in speed between RAM and disk. This is relevant because the most frequently accessed indexes are kept in RAM. The question is from 2009, but the order of magnitude of the speed disparity has not changed much since then. For random access, RAM is somewhere between one hundred thousand and one million times faster than a hard drive. That is the main advantage of the index.
Indexes are on-disk structures associated with tables or views that speed up the retrieval of rows from a table or view. An index can contain keys built from one or more columns of the table or view. These keys are stored in a structure called a B-tree, which enables SQL Server to find the row or rows associated with the key values quickly and efficiently.
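To see the effect of creating an index without installing anything, here is a small sketch using Python's built-in `sqlite3` module. Note the substitution: this is SQLite, not SQL Server, so the plan output and internals differ, although SQLite also stores its indexes as B-trees. The table `people`, its columns, and the index name `idx_people_name` are invented for the example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people (name, city) VALUES (?, ?)",
    [(f"name-{i}", f"city-{i % 50}") for i in range(10_000)],
)

QUERY = "SELECT * FROM people WHERE name = 'name-1234'"


def query_plan(connection, sql):
    """Return the 'detail' text of each step in SQLite's plan for the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


# Plan before the index exists: the whole table has to be scanned.
plan_before = query_plan(conn, QUERY)

# Create the index on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_people_name ON people (name)")

# Plan after: the search goes through the index instead.
plan_after = query_plan(conn, QUERY)

print(plan_before)
print(plan_after)
```

The exact wording of the plan depends on the SQLite version, but the first plan should describe a scan of `people` and the second a search that uses `idx_people_name`.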
Read more at: SQL Server Cluster: Clustered and Non-Clustered Indexes
They are elements that we use to index our data, that is, to give each item a "position" within a context, much like a vector or array. With indexed data, searches get faster because the position is known. The big issue with indexes is insertion: indexed data needs to be kept in order to perform well during queries, and to stay in order each new item has to be inserted at the correct position!
Read up a little on data structures; it will help!
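A rough way to picture that trade-off, using a plain sorted Python list as a stand-in for the index (a real database uses a tree rather than a flat list, and the names here are only illustrative):

```python
from bisect import bisect_left, insort

# Keys of a hypothetical index, kept in ascending order.
indexed_keys = [10, 20, 30, 40, 50]


def insert_key(keys, key):
    """Insert the key at the position that keeps the list ordered."""
    insort(keys, key)


def contains(keys, key):
    """Binary search; only valid because the list is kept ordered."""
    position = bisect_left(keys, key)
    return position < len(keys) and keys[position] == key


insert_key(indexed_keys, 35)
insert_key(indexed_keys, 5)
print(indexed_keys)                                         # [5, 10, 20, 30, 35, 40, 50]
print(contains(indexed_keys, 35), contains(indexed_keys, 36))  # True False
```

Lookups are fast because of the ordering, but every insertion first has to find the right slot and make room for the new key, which is the extra cost on writes.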