A hashset is like a book, in which the keys are the table of contents and the values are the chapters.
In other words ... When you insert an element into a hashset, a hash of the object. Hence you have a structure that holds information in pairs. Each pair has on one side the hash of the object and on the other a reference to the object.
It turns out that hashes are usually strings well ... random. For example, assuming the hash algorithm is the MD5 :
- string hash
foo
is acbd18db4cc2f85cedef654fccc4a4d8
;
- The hash of
bar
is 37b51d194a7513e45b56f6524f2d51f2
;
- And of
baz
is 73feffa4b7f6bb68e44cf984c85f6e88
.
If you were to sort the hashes in alphabetical or hexadecimal order, you would have to reindex the set to each added element. This would leave the inserts slow, and one of the goals of hashset is to be fast.
Complicating, it is possible to have hashes colliding. This occurs when two objects have the same hash. In this case, it is common for a hashset to add a little salt (this is a technical term!) To one of the objects and generate a new hash different to it. So the clutter in the indexes gets even bigger.
You may want to use a map of chained hashes. This structure guarantees the insertion order and there is a class in Java for this: link