java.util.Map, best implementation considering just get (Object key)

10

I would like to know which class that implements java.util.Map contains get(Object key) faster.

The purpose is to make a mini data cache, the volume of information will be approximately 2 to 10 thousand records.

Map<Integer, Funcionario> mapUsuarios = new ???<>();

    
asked by anonymous 18.08.2015 / 22:38

3 answers

1

To solve my problem I ran a test with some implementations of Map, after populating identically with 1400 objects I executed.

speedTeste(hashMap, "hashMap");
speedTeste(treeMap, "treeMap");
speedTeste(concurrentHashMap, "concurrentHashMap");

public static void speedTeste(Map map, String type) {

System.out.println("teste velocidade mapa: " + type);
    int count = 0;
    long start = System.currentTimeMillis();
    while (count++ < 10000000) {
        map.get(1);
    }
    System.out.println("\t " + (System.currentTimeMillis() - start));

}

Results:

  • test map speed: HashMap 144

  • test map speed: TreeMap 559

  • test map speed: ConcurrentHashMap 207

That is, even the TreeMap having organized the keys in this context is much slower than the HashMap to get the value.

    
28.09.2015 / 21:23
10

Most likely HashMap , whose implementation is the simplest among the alternatives available: a simple spreadsheet non-synchronized, non-concurrent, unordered (in the sense that there is no guarantee in the order that its iterator returns elements) and non-navigable (in the sense of NavigableMap ).

The documentation suggests an optimization when instantiating HashMap :

  

This implementation provides constant-time performance for the basic   operations (get and put), assuming the hash function disperses the   elements properly among the buckets. Iteration over collection views   requires time proportional to the "capacity" of the HashMap instance   (the number of buckets) plus its size (the number of key-values   mappings). Thus, it is very important not to set the initial capacity   too high (or the load factor too low) if iteration performance is   important.

     

An instance of HashMap has two parameters that affect its performance:   initial capacity and load factor. The capacity is the number of   buckets in the hash table, and the initial capacity is simply the   the hash table is created. The load factor is a   measure of how full the hash table is allowed to get its   capacity is automatically increased. When the number of entries in the   hash table exceeds the product of the load factor and the current   capacity, the hash table is rehashed (that is, internal data   structures are rebuilt) so that the hash table has approximately twice   the number of buckets.

     

As a general rule, the default load factor (.75) offers a good   tradeoff between time and space costs. Higher values decrease the   space overhead but increase the lookup cost (reflected in most of the   operations of the HashMap class, including get and put). The expected   number of entries in the map and its load factor should be taken into   account when setting its initial capacity, so as to minimize the   number of rehash operations. If the initial capacity is greater than   the maximum number of entries divided by the load factor, no rehash   operations will ever occur.

The idea is that, if memory is available, avoid costly rehash processes by adopting an initial capacity greater than the quotient between the expected number of inputs and the load factor . In your case that is 2 to 10 thousand elements, so it would be good to instantiate the HashMap with an initial capacity in the order of 2667 to 13334, since the standard and recommended load factor is 0.75.

Note that if it is in Android , it is worth knowing the android.util.SparseArray class, which is a more efficient implementation than a HashMap for use with integer keys.     

18.08.2015 / 22:53
7

It depends on how popular you go and who will access the data the data.

If the data is preloaded before use or at least there is only one thread accessing the map, then HashMap solves the problem and we can consider Piovezan's response as "canonical."

However, if the idea is popular, it caches on demand, that is, it adds values only when the first access to them occurs, and the access is concurrent, as in a web application where more than one request can try to read and write the cache at the same time, so the scenario is completely different.

In the second scenario, HashMap is expected to cause problems, but may be replaced by concurrent implementations such as ConcurrentHashMap .

If the cache can be preloaded and its content is relatively unchanged over time, consider also using an unchanging map. To do this, simply use the static method Collections.unmodifiableMap to create a wrapper on your original map to avoid concurrent access problems.

    
19.08.2015 / 09:11