Fastest way to access properties in a list in C#

6

I have a project that works with a large volume of data, and I need to optimize it so that some calculations return results in a considerably short time. I know there are several aspects to take into account, such as how the data is structured in the database, how I access it, and how I perform the calculations, among others. Setting all of those aside, I would like only the question below to be considered.

My question is more conceptual than about a problem in my code, but it is still something specific...

Consider the following list:

var minhaLista = new List<MeuObjeto>
{
   // Objects...
};

MeuObjeto has the following properties:

public class MeuObjeto
{
  public int Prop1 {get; set;}
  public string Prop2 {get; set;}
  public decimal Prop3 {get; set;}
  public bool Prop4 {get; set;}
}


I need to access each of the properties in a loop of n items, as quickly and as economically in memory as possible. But if I have to choose between speed and memory, I should choose speed.

Every millisecond is very important, so I'm taking into account aspects such as for being faster than foreach, or whether declaring a constant with the value 0 and using it to initialize the loop control variable is better than initializing it directly with 0.

So, consider INICIO as follows:

private const int INICIO = 0;

Consider OutroObjeto as an object similar to MeuObjeto, just for the sake of example.

Form 1:

var outraLista = new List<OutroObjeto>();

for (int i = INICIO; i < minhaLista.Count; i++)
{
   var outroObjeto = new OutroObjeto
   {
      Prop1 = minhaLista[i].Prop1,
      Prop2 = minhaLista[i].Prop2,
      Prop3 = minhaLista[i].Prop3,
      Prop4 = minhaLista[i].Prop4
   };
   outraLista.Add(outroObjeto );
}
  

In this case, for each property, is a search made in the list for the object at position i?


Form 2:

var outraLista = new List<OutroObjeto>();

for (int i = INICIO; i < minhaLista.Count; i++)
{
   var meuObjetoI = minhaLista[i];

   var outroObjeto = new OutroObjeto
   {
      Prop1 = meuObjetoI.Prop1,
      Prop2 = meuObjetoI.Prop2,
      Prop3 = meuObjetoI.Prop3,
      Prop4 = meuObjetoI.Prop4
   };
   outraLista.Add(outroObjeto );
}
  

Apparently this snippet works similarly to foreach, but will access to each property of the object at position i of the list be faster than in Form 1?

Technically meuObjetoI only points to the list object at position i that is already allocated in memory, correct?


  

What would be the most appropriate approach, taking time and memory consumption into account?
Or is there a third option that is better?

    
asked by anonymous 30.03.2016 / 15:15

3 answers

7

Pre-compute as much as possible

Zero is a constant, so do not worry about it. But .Count will be re-evaluated on each iteration if the compiler cannot "prove" that minhaLista is a strictly local variable. Since nothing is said about that, the first optimization is to use for, to reduce pressure on the garbage collector, and to precompute .Count:

var count = minhaLista.Count;

for (int i = 0; i < count; i++)
{
    ...
}

This optimization assumes that the size of the list does not change during the loop.

Reduce indirections 1

As commented, each indirection to resolve is one more piece of processing, at least in theory. This may be a silly optimization, but again, if minhaLista is not strictly local, then within the for:

for (int i = 0; i < count; i++)
{
    var item = minhaLista[ i ];
    var outroObjeto = new OutroObjeto
    {
        Prop1 = item.Prop1,
        Prop2 = item.Prop2,
        Prop3 = item.Prop3,
        Prop4 = item.Prop4
    };
    outraLista.Add(outroObjeto );
}

This cuts repeated accesses of the form minhaLista[i].Obj .

Reduce indirections 2

Simple properties, or properties with simple accessors ( {get;set;} ), have extremely optimized code, so the code above, although it looks verbose, may already be as fast as possible. However, a copy like this may violate the DRY principle, so creating a constructor of OutroObjeto that accepts a MeuObjeto is interesting because:

  • It improves the expressiveness of your code.
  • It reduces indirection, since half of the properties will be local accesses on one of the objects.
  • An alternative is to hang a .ToOutro() method on MeuObjeto , doing something similar to the body of the for above.

    var minhaLista = new List<MeuObjeto>();
    ...
    
    var outraLista = new List<OutroObjeto>();
    var count = minhaLista.Count;
    
    for (int i = 0; i < count; i++)
    {
        outraLista.Add( minhaLista[ i ].ToOutro() );
    }
    

    This depends somewhat on personal taste. If you prefer a .To...() style method but find it ugly to mix application domains, an extension method is a way to have code that links the two objects without living in either of them.
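The extension-method variant can be sketched as follows. This is a minimal illustration, assuming OutroObjeto has the same four properties as MeuObjeto (the question only says it is "similar"); the static class name MeuObjetoExtensions is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public class MeuObjeto
{
    public int Prop1 { get; set; }
    public string Prop2 { get; set; }
    public decimal Prop3 { get; set; }
    public bool Prop4 { get; set; }
}

// Assumption: OutroObjeto mirrors MeuObjeto; the question does not show it.
public class OutroObjeto
{
    public int Prop1 { get; set; }
    public string Prop2 { get; set; }
    public decimal Prop3 { get; set; }
    public bool Prop4 { get; set; }
}

// Hypothetical helper class: the conversion lives outside both types.
public static class MeuObjetoExtensions
{
    public static OutroObjeto ToOutro(this MeuObjeto origem)
    {
        return new OutroObjeto
        {
            Prop1 = origem.Prop1,
            Prop2 = origem.Prop2,
            Prop3 = origem.Prop3,
            Prop4 = origem.Prop4
        };
    }
}

public static class Program
{
    public static void Main()
    {
        var minhaLista = new List<MeuObjeto>
        {
            new MeuObjeto { Prop1 = 1, Prop2 = "a", Prop3 = 1.5m, Prop4 = true }
        };

        var count = minhaLista.Count;
        var outraLista = new List<OutroObjeto>(count); // capacity known up front

        for (int i = 0; i < count; i++)
        {
            outraLista.Add(minhaLista[i].ToOutro());
        }

        Console.WriteLine(outraLista[0].Prop2);
    }
}
```

The call site reads exactly like the instance-method version, so switching between the two later is a matter of where the method is declared.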

    Readonly collection

    The machinery of List is complicated. It has to be. Trying to replace that functionality with a home-made implementation does not have the best prognosis. However, "all" your code above does is copy one list of objects into another. That raises the question: does the new list need to be dynamic?

    If the answer is no, specifically if the newly created list is not modified afterwards in terms of number of records , then:

    var minhaLista = new List<MeuObjeto>();
    ...
    var count = minhaLista.Count;
    var outraLista = new OutroObjeto[count];
    
    for (int i = 0; i < count; i++)
        outraLista[ i ] = minhaLista[ i ].ToOutro();
    

    Readonly object

    This is a radical proposition, but one that is actually used in places where performance is an absolute priority: read-only objects and, if they are small enough, structs instead of classes.

    It is not a transition that should be made without a lot of study. Moving from mutable classes to structs is difficult and error-prone. If these objects come from a database, then the suggestion is not to go this way.

    But if these objects are somehow ephemeral data, loaded to be discarded, or the products of calculations, using read-only classes / structs gives you the assurance that the data will not be changed under any circumstances. That may make this kind of conversion unnecessary, and it opens up some compiler-level optimizations that can speed up code execution.
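As a rough illustration of the read-only idea, here is a sketch of an immutable struct stored in an array. The type name OutroValor is hypothetical, and the readonly struct modifier assumes C# 7.2 or later. Whether a struct with these four fields is "small enough" to pay off is something only a measurement in the real application can answer.

```csharp
using System;

// Hypothetical immutable value type with the same data as OutroObjeto.
// "readonly struct" requires C# 7.2 or later.
public readonly struct OutroValor
{
    public int Prop1 { get; }
    public string Prop2 { get; }
    public decimal Prop3 { get; }
    public bool Prop4 { get; }

    public OutroValor(int prop1, string prop2, decimal prop3, bool prop4)
    {
        Prop1 = prop1;
        Prop2 = prop2;
        Prop3 = prop3;
        Prop4 = prop4;
    }
}

public static class Program
{
    public static void Main()
    {
        // The array stores the struct values inline, not references to them.
        var valores = new OutroValor[2];
        valores[0] = new OutroValor(1, "a", 1.5m, true);
        valores[1] = new OutroValor(2, "b", 2.5m, false);

        // valores[0].Prop1 = 10; // would not compile: the property has no setter

        Console.WriteLine(valores[1].Prop2);
    }
}
```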

        
    30.03.2016 / 20:24
    4
      

    In this case, for each property, is a search made in the list for the object at position i?

    Search is not exactly the word; I think it implies a lengthy process. The element is accessed directly by its index. It is an access to a memory location, like any variable.

    Essentially it makes no difference whether you access the variable as an element or as a whole object. If in the end you have to access all the elements, this changes little. Of course, it depends on what you are going to do. What is better in one context may be worse in another.

    I would not risk saying for sure which is faster without testing under proper conditions, which we do not always know how to set up. Even then, the result of testing a simple operation may be irrelevant in another context. I have several times seen people test under controlled conditions to get the most accurate result, and then something else happens in production.
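One crude way to test the two forms from the question is a Stopwatch loop like the sketch below. It assumes OutroObjeto mirrors MeuObjeto; the method names Forma1, Forma2 and Medir and the data sizes are made up for the example. The numbers it prints are only indicative, since JIT, garbage collection and machine load all interfere; a dedicated benchmarking library such as BenchmarkDotNet handles those concerns more rigorously.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public class MeuObjeto
{
    public int Prop1 { get; set; }
    public string Prop2 { get; set; }
    public decimal Prop3 { get; set; }
    public bool Prop4 { get; set; }
}

// Assumption: OutroObjeto has the same shape as MeuObjeto.
public class OutroObjeto
{
    public int Prop1 { get; set; }
    public string Prop2 { get; set; }
    public decimal Prop3 { get; set; }
    public bool Prop4 { get; set; }
}

public static class Program
{
    // Form 1: index the list once per property.
    public static List<OutroObjeto> Forma1(List<MeuObjeto> minhaLista)
    {
        var outraLista = new List<OutroObjeto>();
        for (int i = 0; i < minhaLista.Count; i++)
        {
            outraLista.Add(new OutroObjeto
            {
                Prop1 = minhaLista[i].Prop1,
                Prop2 = minhaLista[i].Prop2,
                Prop3 = minhaLista[i].Prop3,
                Prop4 = minhaLista[i].Prop4
            });
        }
        return outraLista;
    }

    // Form 2: index the list once and keep the reference in a local.
    public static List<OutroObjeto> Forma2(List<MeuObjeto> minhaLista)
    {
        var outraLista = new List<OutroObjeto>();
        for (int i = 0; i < minhaLista.Count; i++)
        {
            var meuObjetoI = minhaLista[i];
            outraLista.Add(new OutroObjeto
            {
                Prop1 = meuObjetoI.Prop1,
                Prop2 = meuObjetoI.Prop2,
                Prop3 = meuObjetoI.Prop3,
                Prop4 = meuObjetoI.Prop4
            });
        }
        return outraLista;
    }

    // Runs one warm-up call (so JIT compilation is not timed), then times the rest.
    public static long Medir(Func<List<MeuObjeto>, List<OutroObjeto>> forma,
                             List<MeuObjeto> dados, int repeticoes)
    {
        forma(dados);
        var sw = Stopwatch.StartNew();
        for (int r = 0; r < repeticoes; r++)
        {
            forma(dados);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        var dados = new List<MeuObjeto>();
        for (int i = 0; i < 200000; i++)
        {
            dados.Add(new MeuObjeto { Prop1 = i, Prop2 = "x", Prop3 = i, Prop4 = i % 2 == 0 });
        }

        Console.WriteLine("Form 1: " + Medir(Forma1, dados, 10) + " ms");
        Console.WriteLine("Form 2: " + Medir(Forma2, dados, 10) + " ms");
    }
}
```

Running it several times, in a Release build, and swapping the order of the two measurements gives a first idea of how much noise there is in the result.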

      

    Apparently this snippet works similarly to foreach, but will access to each property of the object at position i of the list be faster than in Form 1?

    It is nothing like foreach . The foreach works with an enumerator object, calls other methods, is harder to optimize and, above all, has a very different purpose from what is presented here.
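To make the difference concrete, the sketch below puts a foreach next to roughly what the compiler expands it to for a List<T>, and next to the indexed for. The method names are made up for the example, and the expansion is an approximation for illustration, not the exact generated code.

```csharp
using System;
using System.Collections.Generic;

public static class Program
{
    public static int SomaComForeach(List<int> lista)
    {
        int soma = 0;
        foreach (var item in lista)
        {
            soma += item;
        }
        return soma;
    }

    // Approximately what the foreach above turns into: an enumerator,
    // a MoveNext() call per element, and a Dispose() at the end.
    // For List<T> the enumerator is the struct List<T>.Enumerator.
    public static int SomaComEnumerador(List<int> lista)
    {
        int soma = 0;
        List<int>.Enumerator e = lista.GetEnumerator();
        try
        {
            while (e.MoveNext())
            {
                soma += e.Current;
            }
        }
        finally
        {
            e.Dispose();
        }
        return soma;
    }

    // The indexed loop from the question: no enumerator, just the indexer.
    public static int SomaComFor(List<int> lista)
    {
        int soma = 0;
        for (int i = 0; i < lista.Count; i++)
        {
            soma += lista[i];
        }
        return soma;
    }

    public static void Main()
    {
        var lista = new List<int> { 1, 2, 3 };
        Console.WriteLine(SomaComForeach(lista));    // 6
        Console.WriteLine(SomaComEnumerador(lista)); // 6
        Console.WriteLine(SomaComFor(lista));        // 6
    }
}
```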

      

    Technically meuObjetoI only points to the list object at position i that is already allocated in memory, correct?

    Correct, in this case. You receive the value stored in the list element. Because the object is a reference type , you receive the reference (or pointer, if you prefer) to the object and not the object itself. If it were a value type, you would receive the value itself.

      

    What would be the best way, taking time and memory consumption into account?   Or is there a third option that is better?

    I see no reason for the second form. It is worse. I have serious doubts about whether you really need to do this. What I can say is that, without context, anything can be right or wrong. I think it could be done better, but that depends on the context.

    Conclusion

    Overall, I do not know whether every millisecond is really that important, especially without context. I see very experienced people getting everything wrong when trying to optimize. Optimizing correctly is harder than it sounds.

    I do not think that is the case here, but often choosing not to worry about memory means choosing worse speed. That shows how complex this is.

    I do not know if it is really true that foreach is slower than for . I have already shown here on SOpt that it is not quite like that. This is one of the reasons people get optimization wrong: they believe in false truths.

    I hope this answer serves more to demystify the idea of optimization than to answer the specific case, which does not even seem relevant.

        
    30.03.2016 / 17:58
    2

    Both forms grow linearly with the size of your list, so the difference is small.

    But the second form is a bit worse in terms of performance and memory because, although it is just a pointer, you need to allocate the address and memory so that it points to the object at minhaLista[i] every time, and this is an extra, unnecessary step.

    As for the alternative you asked about, you could move the copy into the OutroObjeto constructor and let it handle the required properties internally, without having to allocate temporary variables, since you add the reference directly to outraLista.

    Form 3:

    for (int i = INICIO; i < minhaLista.Count; i++)
    {
        outraLista.Add(new OutroObjeto(minhaLista[i]));
    }
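Form 3 relies on an OutroObjeto constructor that takes a MeuObjeto, which the question does not show. A minimal sketch of what it could look like, assuming OutroObjeto has the same four properties:

```csharp
using System;
using System.Collections.Generic;

public class MeuObjeto
{
    public int Prop1 { get; set; }
    public string Prop2 { get; set; }
    public decimal Prop3 { get; set; }
    public bool Prop4 { get; set; }
}

// Assumption: OutroObjeto mirrors MeuObjeto; only the constructor is new.
public class OutroObjeto
{
    public int Prop1 { get; set; }
    public string Prop2 { get; set; }
    public decimal Prop3 { get; set; }
    public bool Prop4 { get; set; }

    // Copies the values from the source object in one place.
    public OutroObjeto(MeuObjeto origem)
    {
        Prop1 = origem.Prop1;
        Prop2 = origem.Prop2;
        Prop3 = origem.Prop3;
        Prop4 = origem.Prop4;
    }
}

public static class Program
{
    private const int INICIO = 0;

    public static void Main()
    {
        var minhaLista = new List<MeuObjeto>
        {
            new MeuObjeto { Prop1 = 1, Prop2 = "a", Prop3 = 1.5m, Prop4 = true },
            new MeuObjeto { Prop1 = 2, Prop2 = "b", Prop3 = 2.5m, Prop4 = false }
        };
        var outraLista = new List<OutroObjeto>();

        for (int i = INICIO; i < minhaLista.Count; i++)
        {
            outraLista.Add(new OutroObjeto(minhaLista[i]));
        }

        Console.WriteLine(outraLista.Count);
    }
}
```

Inside the constructor the source object arrives as the parameter origem, so the list is indexed once per item, just as in Form 2.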
    
        
    30.03.2016 / 15:26