What are INDEX, B-tree, hash, GiST, and GIN?

3

In the PostgreSQL manual has the following excerpt:

  

PostgreSQL provides the index methods B-tree, hash, GiST, and GIN.   PostgreSQL provides the B-tree, hash, GiST, and GIN index methods.

But what are these methods? and what are their differences?

    
asked by anonymous 30.11.2015 / 14:41

1 answer

7

B-Tree

The B-tree , or some variation of it, is the most common in all banking systems of data. It is very efficient for almost all common use cases. It is a balanced binary tree that allows all types of access (read, insert, remove, anywhere) at time O (logN), which is very fast always, with very large volume, or very small data, but it matters even when it is large. It is reasonably space efficient and keeps data orderly, allowing for sequential and track reading efficiently.

Hash

In cases where you do not need the order, the sequential search and keys are unique (where the programmer guarantees this, the index can not), avoid collision and have large data volume, the index can be more efficient if you use a hash calculation function. They have O (1) time. But that does not mean clearly more quickly. The calculation time of the function can be greater than some iterations in a binary tree (and there are usually few). It does not usually work well on disk.

The hash framework may still have some difficulties when there are collisions and linear searches (O (N)) are necessary (although rare, unless the keys are not unique). But one can "complain" about the loss of extra time in the B-tree when it needs rebalancing. But this occurs in more specific circumstances. In all cases, it depends on the algorithm used and the data pattern.

There are cases, but it is rare to have an expressive gain due to its use. There may be gains in certain very complex%% types.

Just testing to make sure it's worth it.

It is not crash-safe , nor is it replicated, which makes more cases impossible.

GIN

GIN (Generalized Inverted Index) allows a key to have multiple values, ie , which points to multiple rows in the data table. This is important when the same key may be present in multiple data items. In some cases having such an index (B-tree allows this too) can be more efficient in space and access time. Do not expect miraculous gains.

A good use is to search for free texts (indicate where determinate words are present.

Another use is when a specific data type created by the user (programmer) in the database has a multiplicity characteristic.

In a way we may consider a lower level index. Something almost internal.

GiST

GiST (Generalized Search Tree) is a more abstract, more internal form where various access methodologies can be implemented. It is a balanced binary tree, but it does not have many rules. These should be specified at a higher level, probably for some specific type created by the user.

Conclusion

In summary, if you do not have a very strong motif, keep the traditional index.

    
30.11.2015 / 16:21