Search performance involving multiple disks

4

Assuming a query that uses JOIN between two or more tables, such as:

SELECT *
FROM foo INNER JOIN bar ON foo.id = bar.id

Is there performance gain if the data in the tables are in different HD's (due to having more heads reading at the same time, I suppose) ? If so, how do I separate my tables into different storage drives?

And in the case of implicit join / join "natural"? I.e.:

SELECT *
FROM foo, bar
WHERE foo.id = bar.id
    
asked by anonymous 22.10.2014 / 22:49

2 answers

4

If done correctly, there will be a performance gain (maybe minimal depending on your application). Implementing a database on multiple HD's is a bit complex and perhaps unnecessary for the size of your project.

Trying to summarize: When you perform a join between two tables, SQL Server will query much more than the contents of the table, for example the indexes to find the information you need. If these indexes are not implemented correctly, you will lose considerable performance.

Simpler ways to improve performance:

  • Put log files / backup of the database on another hard disk.
  • Create indexes for your top queries.
  • Delete unused records (those from the backlog that are only there to undermine your query), or place them in another database.

If you are interested in partitioning anyway: link

Some tips on how to improve performance: link

    
23.10.2014 / 00:03
2

As the Wédney said, the improvement would be minimal. Actually, I do not even know if there will be real gain, after all I do not think SQL Server can read the two tables in parallel or asynchronously for the same query.

Considering the Implementation Plan

Before you begin to read the disk, database systems such as SQL Server first calculate the execution plan. The first step is to check if there is any index with which it can determine in advance which records are to be read from the physical table.

If the join is performed only by the primary keys in foo and bar , then there will certainly be indexes. Having enough memory helps SQL Server keep indexes in memory and thus avoid readings of disk indexes. So, up to this step, no reading would be necessary.

With the calculated execution plan, knowing exactly which records are to be read from the disk, SQL Server will then start to read the disk records. If they are on separate disks, it may not make any difference, after all it probably needs to wait for one table or record to be read to access another table.

Furthermore, even though the two tables are on different disks, there is no guarantee that the needle movements of the HD will be more efficient. Data from the same table may be fragmented in sectors scattered across HD tracks.

Conclusion

Therefore, it may be much more efficient to just compact your database, to eliminate unused portions of the file, and then defragment the HD,

    
23.10.2014 / 14:57