What are the differences between the HASH and BTREE algorithms used in an index?

3

I realized that I can create two types of indexes for a determining field in HeidiSQL, which use the algorithm HASH or BTREE , see below:

See the CREATE code for a sample table for the illustration:

CREATE TABLE 'pessoa' (
    'nome' VARCHAR(50) NOT NULL,
    'email' VARCHAR(50) NOT NULL,
    INDEX 'index_nome' ('nome') USING BTREE,
    INDEX 'index_email' ('email') USING HASH
)

So, I had some doubts about these two types of algorithms that can be used.

Questions

  • What is the HASH algorithm?
  • What is the BTREE algorithm?
  • What are the differences between HASH and BTREE?
  • asked by anonymous 25.11.2017 / 18:50

    1 answer

    3

    I have already replied in the context of PostgreSQL

    .

    I already gave details about Btree

    I have already talked about the hash code .

    has been replied to about the hash tables.

    The documentation shows the difference . In written form it seems like they do not want you to use hash . They only show disadvantages. Which is something very realistic.

    It is very rare for it to be useful, especially on disk where it often forces much more reading. The literal translation of hash table is a spreading table, and anything that gets scattered is bad for accessing certain media, or it damages the cache resulting in more thrashing. >

    The main disadvantage is can only compare equality, which brings other implications as it can not maintain order, which will bring a cascade of implications.

    But if it is in memory and you just need to test the equality and access to each element is usually individual and there are few key collisions, either because the original data does not repeat itself or the result of the hash < in> does not repeat much, then it can be faster than any binary tree or B tree. If you have a lot of writing almost certainly there will be gains in this operation (not in the rare worst cases).

    A B tree can have a lot of internal maintenance on tables with lots of writing, but reading is always very optimized.

    The hash index is only useful if the person understands all the implications well and has made tests that demonstrate clear gain. So it's almost hidden and limited to certain engines of MySQL.

    There is an internal hash index that the database uses when it understands that it is best to organize query results, but it is a detail that does not matter to anyone using MySQL.

        
    26.11.2017 / 16:48