Performance in Java repetition loops

2

What a difference of performance between the three types of ties highlighted below.

List<Foo> list = new ArrayList<Foo>();

for (int i = 0; i < list.size(); i++) {
    //CODE
}

for (Foo foo : list) {
    //CODE
}

Iterator<Foo> iterator = list.iterator();
while (iterator.hasNext()) {
    //CODE
}
    
asked by anonymous 31.03.2014 / 16:28

2 answers

5

First of all, consider that many performance tests are wrong . Making a micribenchmark is not so simple when running the three codes by various internal and external factors:

  • Other programs running in the background may interfere differently in testing.
  • With multiple tests in the same run, one can affect the others. For example, the first test may have instantiated elements and the Garbage Collector runs only on the second test.
  • The test may not reflect a real context of use. In your example, you do not include access to the element. As Ricardo's response already mentioned, the bottleneck of a for or while loop used to traverse a list ( List ) may be in the get() method, which checks the index accessed and then needs to retrieve the element of the internal vector. However, I can not agree with his response, because Iterator also internally uses the get() method.
  • The JVM makes optimizations both in bytecode and during execution, and may vary by version.
  • The result varies greatly for different types of Collection s. This answer in SOEN says there are no significant differences.

Your test is very similar to this benchmark . Without access to the elements, the result obtained by the author was:

  

For each loop :: 110 ms

     

Using collection.size () :: 37 ms

     

Using [int size = list.size (); int j = 0; j < size; j ++] :: 4 ms

     

Using [int j = list.size (); j > size; j--] :: 1 ms

In short, a decrementing variable loop is the fastest in this scenario and this is probably due to the fact that comparing a variable to zero is more efficient.

Mkiong also produced a benchmark . The result:

However,thiswasmuchquestioned.Seethecommentstoseethatdifferentpeoplegotdifferentresultsondifferentplatformsandwithminormodificationstothecode.

Whenconsideringtheamountofcodeexecuted,Iteratoractuallyseemstobetheleastefficient.Seethecodeforthemethodsnext()andhasNext()ofIteratorusedinArrayList,extractedfromJDK6:

public E next() { checkForComodification(); try { E next = get(cursor); lastRet = cursor++; return next; } catch (IndexOutOfBoundsException e) { checkForComodification(); throw new NoSuchElementException(); } } public boolean hasNext() { return cursor != size(); }

Note that the size() and get() methods in the list are called anyway. There is no magic to retrieve the next element. In addition, a check is made if the list has been modified by another method. Fortunately this is done only by comparing two integers, but the call to method and comparison ends up having some impact.

Finally, through a static analysis , we can say that a loop for using an auxiliary variable ( i ) and a fixed limit value, that is, without calling size() a each iteration is more efficient.

In practice, however, the scenario can change dramatically if the programmer becomes "lazy" to assign the element to a variable. Who has not seen anything like the example below?

for (int i = 0; i < list.size(); i++) {
    if (list.get(i) != null) {
        list.get(i).metodo();
    }
}

Another factor that can change a lot is if we do not use ArrayList . The LinkedList , for example, has complexity O(n) to find any element, since it has to go through the elements one by one until it finds the desired element.

And while still thinking about the quality of software as a whole and not just from a performance perspective, for..each ties have the great advantage of being "cleaner" (less typing, easier to understand) and less prone to errors ( overflow limit, start counter in 1 instead of zero, and so on).

Another important observation, thinking about the performance issue, is that the biggest bottlenecks are in random access to the data and in the very definition of the data structure. Although some consider it overly concerned with these details, the truth is that it makes a lot of difference to choose the right structures to store the data. The concept that memory and CPU are "cheap" falls by the wayside as we begin to deal with more than a few thousand records, for example. The choice of the commands to go through the defined structure comes as a consequence of the decisions already made.

The conclusion is that there is definitely no more efficient loop for all cases. The most important emphasis is on defining an appropriate data structure for each case, considering memory consumption and the complexity of accessing a specific element of that structure.

    
31.03.2014 / 17:36
2

Using Iterator and " foreach " will not cause any performance issues, since they have similar speeds when traveling collections . The same can not be said for for normal. The problem is in access via index get(i) to for that degrades loop speed.

No SO in English has a good and complete answer on this. In addition, different types of implementations of List and their respective uses can cause performance (or worsening) to improve, as is the case of ArrayList or LinkedList .

    
31.03.2014 / 16:48