Entity Framework Left Join

2

Seeing the question Left join with lambda GroupJoin and with condition , I came up with doubts around the way I do the Left Join with linq

Having the following entities

public class Cidade 
    {
        public int CidadeId { get; set; }
        public string Descricao { get; set; }
        public int EstadoId { get; set; }
    }

public class Estado
    {
        public int EstadoId { get; set; }
        public string Descricao { get; set; }
        public string Sigla { get; set; }
    }

the way I've always done my left join

 var estados = from e in context.Estado
                              join c in context.Cidade on e.EstadoId equals c.EstadoId into cl
                              from c in cl.DefaultIfEmpty()
                              select e;

SQL generated

SELECT 
    [Extent1].[EstadoId] AS [EstadoId], 
    [Extent1].[Descricao] AS [Descricao], 
    [Extent1].[Sigla] AS [Sigla], 
    [Extent1].[UsuarioCad] AS [UsuarioCad], 
    [Extent1].[DataHoraCad] AS [DataHoraCad], 
    [Extent1].[Ativo] AS [Ativo]
    FROM  [dbo].[Estado] AS [Extent1]
    LEFT OUTER JOIN [dbo].[Cidade] AS [Extent2] ON [Extent1].[EstadoId] = [Extent2].[EstadoId]

proposed form in the answers to the question Left join with lambda GroupJoin and with condition

var dados2 = context.Estado
                    .GroupJoin(context.Cidade, e => e.EstadoId, c => c.EstadoId,
                            (estado, cidade) => new { E = estado, C = cidade.DefaultIfEmpty() })
                            .SelectMany(final => final.C, (final, ex) => new
                            {
                                EstadoId = final.E.EstadoId,
                                Descricao = final.E.Descricao,
                                Sigla = final.E.Sigla,
                                Cidade = ex.Descricao
                            }).ToList();

SQL generated

SELECT 
    [Extent1].[EstadoId] AS [EstadoId], 
    [Extent1].[Descricao] AS [Descricao], 
    [Extent1].[Sigla] AS [Sigla], 
    [Extent2].[Descricao] AS [Descricao1]
    FROM  [dbo].[Estado] AS [Extent1]
    LEFT OUTER JOIN [dbo].[Cidade] AS [Extent2] ON [Extent1].[EstadoId] = [Extent2].[EstadoId]

My question, what would be best in terms of performance?

In my opinion, the way I write Left Join is clearer, I confess that in the other approach I get a bit confused.

As far as I can see, SQL instructions are basically the same

    
asked by anonymous 12.02.2015 / 13:45

1 answer

1
  

My question, what would be best in terms of performance?

Both syntaxes are the same for the compiler that converts to the same code, at no charge since it compile time.

The code below is the one of the post quoted by TobyMosque, only in the right way. I'll discuss it below.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

namespace IterationPerformanceTest
{
    class Program
    {
        const int SIZE = 10000;
        const int RUNS = 1000;

        static Stopwatch stopwatch = new Stopwatch();
        static Random r = new Random();

        static List<int> intList = new List<int>();

        public static void Main(string[] args)
        {
            for (int i = 0; i < SIZE; i++)
                intList.Add(r.Next(int.MinValue, int.MaxValue));

            RunLoop();            

            RunQuerySyntax();

            RunFluentSyntax();

            Console.ReadLine();
        }

        static void RunQuerySyntax()
        {
            stopwatch.Reset();

            //Query Syntax  
            stopwatch.Start();

            for (int t = 0; t < RUNS; ++t)
            {
                var result = (from int i in intList where i < 10 select i).ToList();
            }

            stopwatch.Stop();

            Console.WriteLine(string.Format("Query Syntax : {0}, avg. {1}", stopwatch.Elapsed, new TimeSpan(stopwatch.ElapsedTicks / RUNS)));
        }

        static void RunFluentSyntax()
        {
            stopwatch.Reset();

            //Fluent Syntax
            stopwatch.Start();

            for (int t = 0; t < RUNS; ++t)
            {
                var result = intList.Where(i => i < 10).ToList();
            }

            stopwatch.Stop();

            Console.WriteLine(string.Format("Fluent Syntax: {0}, avg. {1}", stopwatch.Elapsed, new TimeSpan(stopwatch.ElapsedTicks / RUNS)));   
        }

        static void RunLoop()
        {
            stopwatch.Reset();

            //LOOP  
            stopwatch.Start();

            for (int t = 0; t < RUNS; ++t)
            {
                var result = new List<int>();

                foreach (var i in intList)
                {
                    if (i < 10) result.Add(i);
                }
            }

            stopwatch.Stop();

            Console.WriteLine(string.Format("Loop: {0}, avg. {1}", stopwatch.Elapsed, new TimeSpan(stopwatch.ElapsedTicks / RUNS)));

        }
    }
}

I've isolated the calls to get easy on CIL to analyze and something else ...

Query Syntax Loop:

// loop start (head: IL_004a)
        IL_0010: nop
        IL_0011: ldsfld class [mscorlib]System.Collections.Generic.List'1<int32> IterationPerformanceTest.Program::intList
        IL_0016: call class [mscorlib]System.Collections.Generic.IEnumerable'1<!!0> [System.Core]System.Linq.Enumerable::Cast<int32>(class [mscorlib]System.Collections.IEnumerable)
        IL_001b: ldsfld class [mscorlib]System.Func'2<int32, bool> IterationPerformanceTest.Program::'CS$<>9__CachedAnonymousMethodDelegate1'
        IL_0020: brtrue.s IL_0035

        IL_0022: ldnull
        IL_0023: ldftn bool IterationPerformanceTest.Program::'<RunQuerySyntax>b__0'(int32)
        IL_0029: newobj instance void class [mscorlib]System.Func'2<int32, bool>::.ctor(object, native int)
        IL_002e: stsfld class [mscorlib]System.Func'2<int32, bool> IterationPerformanceTest.Program::'CS$<>9__CachedAnonymousMethodDelegate1'
        IL_0033: br.s IL_0035

        IL_0035: ldsfld class [mscorlib]System.Func'2<int32, bool> IterationPerformanceTest.Program::'CS$<>9__CachedAnonymousMethodDelegate1'
        IL_003a: call class [mscorlib]System.Collections.Generic.IEnumerable'1<!!0> [System.Core]System.Linq.Enumerable::Where<int32>(class [mscorlib]System.Collections.Generic.IEnumerable'1<!!0>, class [mscorlib]System.Func'2<!!0, bool>)
        IL_003f: call class [mscorlib]System.Collections.Generic.List'1<!!0> [System.Core]System.Linq.Enumerable::ToList<int32>(class [mscorlib]System.Collections.Generic.IEnumerable'1<!!0>)
        IL_0044: stloc.1
        IL_0045: nop
        IL_0046: ldloc.0
        IL_0047: ldc.i4.1
        IL_0048: add
        IL_0049: stloc.0

        IL_004a: ldloc.0
        IL_004b: ldc.i4 1000
        IL_0050: clt
        IL_0052: stloc.2
        IL_0053: ldloc.2
        IL_0054: brtrue.s IL_0010
    // end loop

Fluent Syntax Loop

// loop start (head: IL_0045)
        IL_0010: nop
        IL_0011: ldsfld class [mscorlib]System.Collections.Generic.List'1<int32> IterationPerformanceTest.Program::intList
        IL_0016: ldsfld class [mscorlib]System.Func'2<int32, bool> IterationPerformanceTest.Program::'CS$<>9__CachedAnonymousMethodDelegate3'
        IL_001b: brtrue.s IL_0030

        IL_001d: ldnull
        IL_001e: ldftn bool IterationPerformanceTest.Program::'<RunFluentSyntax>b__2'(int32)
        IL_0024: newobj instance void class [mscorlib]System.Func'2<int32, bool>::.ctor(object, native int)
        IL_0029: stsfld class [mscorlib]System.Func'2<int32, bool> IterationPerformanceTest.Program::'CS$<>9__CachedAnonymousMethodDelegate3'
        IL_002e: br.s IL_0030

        IL_0030: ldsfld class [mscorlib]System.Func'2<int32, bool> IterationPerformanceTest.Program::'CS$<>9__CachedAnonymousMethodDelegate3'
        IL_0035: call class [mscorlib]System.Collections.Generic.IEnumerable'1<!!0> [System.Core]System.Linq.Enumerable::Where<int32>(class [mscorlib]System.Collections.Generic.IEnumerable'1<!!0>, class [mscorlib]System.Func'2<!!0, bool>)
        IL_003a: call class [mscorlib]System.Collections.Generic.List'1<!!0> [System.Core]System.Linq.Enumerable::ToList<int32>(class [mscorlib]System.Collections.Generic.IEnumerable'1<!!0>)
        IL_003f: stloc.1
        IL_0040: nop
        IL_0041: ldloc.0
        IL_0042: ldc.i4.1
        IL_0043: add
        IL_0044: stloc.0

        IL_0045: ldloc.0
        IL_0046: ldc.i4 1000
        IL_004b: clt
        IL_004d: stloc.2
        IL_004e: ldloc.2
        IL_004f: brtrue.s IL_0010
    // end loop

Virtually twins!

Note that the example has refactored the calls in isolated methods, try changing the order of Fluent with Query and depending on your processor / load the low will always be slightly faster, this has nothing to do with query performance and yes cache hit, each machine will have a different gain, since it is always the same collection to be traversed, if it is in the processor cache it is faster.

Very careful with the examples on the internet, this quote contained several errors of analysis and technique.

    
13.02.2015 / 02:11