await Task.WhenAll: how to run multiple processes?


I am trying to create several async tasks that execute the following logic:

  • Parse the HTML at a received URL with HtmlAgilityPack
  • Return a product model after the parse
  • Insert the product into the database
  • Download the product images
  • Mark the URL as read

Items 1 and 4 (the parse and the image download), especially 4, are slow because they depend on the speed of the internet link, so they should be async. But I'm having trouble: all my code runs, but synchronously.

    private static void Main(string[] args)
    {
        IEnumerable<UrlsProdutos> registros = db.UrlsTable.Where(w => w.Lido == false).Take(1000);

        ExecutaTarefasAsync(registros).Wait();
    }

    public static async Task ExecutaTarefasAsync(IEnumerable<UrlsProdutos> registros)
    {
        var urlTasks = registros.Select((registro, index) =>
        {
            Task downloadTask = default(Task);

            // parsing html
            var produtoTask = ExtraiDados.ParseHtml(registro.Url);
            if (produtoTask.IsCompleted)
            {
                var produto = produtoTask.Result;
                // here I do an insert with Dapper
                downloadTask = InsertAdo.InsertAdoStpAsync(produto);
            }

            // marks the url as read, same as the product insert
            InsertAdo.MarcaComoLido(registro.UrlProdutoId);

            Output(index);

            return downloadTask;
        });

        await Task.WhenAll(urlTasks);
    }

    public static void Output(int id)
    {
        Console.WriteLine($"Executando {id.ToString()}");
    }
    

The insert is hard-coded, just for testing:

    public static async Task InsertAdoStpAsync(Imovel imovel)
    {
        var stringConnection = db.Database.Connection.ConnectionString;
        var con = new SqlConnection(stringConnection);
        var sqlQuery = "insert tblProdutos...etc..etc";
        con.ExecuteAsync(sqlQuery);
    }
    

I do not know whether every function should be async, or whether I could make only the download and the parse async.

My async photo download works perfectly:

    public static async Task DownloadData(IEnumerable<FotosProdutos> urls)
    {
        var urlTasks = urls.Select((url, index) =>
        {
            var wc = new WebClient();
            string path = @"C:\teste\" + url.FileName;

            var downloadTask = wc.DownloadFileTaskAsync(new Uri(url.Url), path);
            return downloadTask;
        });

        await Task.WhenAll(urlTasks);
    }
    

I need help making ExecutaTarefasAsync truly asynchronous, and understanding how, in the same way as the photo download, which I have not yet been able to incorporate into this project.

NOTE: I do not know whether I should download the photos during the parse or in a separate task.

        
    asked by anonymous 11.10.2016 / 20:36

    2 answers


One point I always stress in questions about async/await: on its own it does not make a method execute asynchronously; it only lets the programmer write methods in a flow close to what would be written for synchronous methods. The key point is that async/await tells the compiler that, on reaching an await, the method should wait until that operation has finished, but without blocking the calling thread (the UI thread in Windows Forms or a thread of the IIS pipeline in web applications, for example).

In your specific case, async/await is not being used correctly. Although an async method declares that it returns a Task, the caller does not have to hold on to that Task. By using await at the call site you are already telling the compiler to wait for the method to finish. I am not sure I have explained this clearly, but I think the example makes the dynamic easier to follow.
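The difference between holding a Task and awaiting it can be sketched as follows. This is a minimal illustration, not the asker's real code: `FetchAsync` is a hypothetical stand-in for a slow call such as `ParseHtml`, with `Task.Delay` simulating the latency. It also suggests why a check like `if (produtoTask.IsCompleted)` made right after the call would tend to be false when the method is genuinely asynchronous.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitDemo
{
    // Hypothetical stand-in for a slow call such as ParseHtml.
    public static async Task<string> FetchAsync(string name)
    {
        await Task.Delay(200); // simulated network latency
        return "page:" + name;
    }

    public static async Task<string> RunAsync()
    {
        // Calling the method starts the work and hands back a Task right away.
        Task<string> pending = FetchAsync("a");

        // Checked immediately, the Task is almost always still running.
        bool doneImmediately = pending.IsCompleted;

        // await suspends this method (without blocking a thread) until the
        // Task finishes, then unwraps its result.
        string first = await pending;

        // The usual form: call and await in one expression, no Task variable.
        string second = await FetchAsync("b");

        return first + "," + second + "," + doneImmediately;
    }

    public static void Main()
    {
        // Block once, at the top level of the console application.
        Console.WriteLine(RunAsync().GetAwaiter().GetResult());
    }
}
```

Both forms produce the same results; keeping the Task in a variable is only useful when there is other work to do before the result is needed.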

To make the execution really asynchronous, so that each record is processed in "parallel" with the others, I did the following:

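    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Net;
    using System.Threading.Tasks;

    // I wrote these classes here just so I could run the code and show a case where each call takes 5 seconds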
    
    public class FotosProdutos
    {
        public string FileName { get; set; }
        public string Url { get; set; }
    }
    
    public class UrlsProdutos
    {
        public int UrlProdutoId { get; set; }
        public string Url { get; set; }
    }
    
    public class Imovel
    {
    
    }
    
    public class InsertAdo
    {
        public static async Task InsertAdoStpAsync(Imovel imovel, int index)
        {
            await Task.Delay(TimeSpan.FromSeconds(5));
    
            Console.WriteLine(String.Format("{0} - InsertAdoStpAsync {1}", DateTime.Now, index));
        }
    
        public static async Task MarcaComoLido(int urlProdutoId, int index)
        {
            await Task.Delay(TimeSpan.FromSeconds(5));
    
            Console.WriteLine(String.Format("{0} - MarcaComoLido {1}", DateTime.Now, index));
        }
    }
    
    public class ExtraiDados
    {
        public static async Task<Imovel> ParseHtml(string url, int index)
        {
            await Task.Delay(TimeSpan.FromSeconds(5));
    
            Console.WriteLine(String.Format("{0} - ParseHtml {1}", DateTime.Now, index));
    
            return new Imovel();
        }
    }
    
    class Program
    {
        static void Main(string[] args)
        {
            // I put arbitrary data here just so I could reproduce this without your Dapper dependency
            IEnumerable<UrlsProdutos> registros = new List<UrlsProdutos>() { new UrlsProdutos { UrlProdutoId = 1 }, new UrlsProdutos { UrlProdutoId = 2 }, new UrlsProdutos { UrlProdutoId = 3 } };
    
            // Runs a separate Task for each record.
            // The way you were doing it, without Task.Run(), the result was basically the same as a for loop going through your collection item by item, synchronously
            var tarefas = registros.Select((registro, index) =>
            {
                return Task.Run(async () => await ExecutaTarefaAsync(registro, index));
            });
    
            Task.WaitAll(tarefas.ToArray());
    
            // meant to wait a bit longer so we can see a gap before the final log line;
            // note that this Task is never awaited, so execution continues immediately
            Task.Delay(TimeSpan.FromSeconds(5));
    
            Console.WriteLine(String.Format("{0} - Acabou!", DateTime.Now));
    
            Console.Read();
        }
    
        // I changed your method to handle a single record, just to make it easier to follow
        public static async Task ExecutaTarefaAsync(UrlsProdutos registro, int index)
        {
            Output(index);
    
            // calls your parse method, waiting for everything asynchronous inside it, and takes the result right afterwards
            var produto = await ExtraiDados.ParseHtml(registro.Url, index);
    
            // since your parse has already finished, insert the record with its result
            await InsertAdo.InsertAdoStpAsync(produto, index);
    
            // finally, mark the record as read
            await InsertAdo.MarcaComoLido(registro.UrlProdutoId, index);
    
            Output(index);
        }
    
        // I did not use this one, but I changed it so you can see the point about await
        public static async Task DownloadData(FotosProdutos url, int index)
        {
            var wc = new WebClient();
            string path = @"C:\teste\" + url.FileName;
    
            // here you do not need to hold the Task; with await the compiler already understands that you want to wait for the async method's result before continuing
            await wc.DownloadFileTaskAsync(new Uri(url.Url), path);
    
            Console.WriteLine(String.Format("{0} - DownloadData {1}", DateTime.Now, index.ToString()));
        }
    
        public static void Output(int id)
        {
            Console.WriteLine(String.Format("{0} - Executando {1}", DateTime.Now, id.ToString()));
        }
    }
    

Running this example, I got the following output:

    13/10/2016 00:56:27 - Executando 1
    13/10/2016 00:56:27 - Executando 2
    13/10/2016 00:56:27 - Executando 0
    13/10/2016 00:56:32 - ParseHtml 1
    13/10/2016 00:56:32 - ParseHtml 2
    13/10/2016 00:56:32 - ParseHtml 0
    13/10/2016 00:56:37 - InsertAdoStpAsync 0
    13/10/2016 00:56:37 - InsertAdoStpAsync 1
    13/10/2016 00:56:37 - InsertAdoStpAsync 2
    13/10/2016 00:56:42 - MarcaComoLido 2
    13/10/2016 00:56:42 - MarcaComoLido 1
    13/10/2016 00:56:42 - Executando 2
    13/10/2016 00:56:42 - Executando 1
    13/10/2016 00:56:42 - MarcaComoLido 0
    13/10/2016 00:56:42 - Executando 0
    13/10/2016 00:56:42 - Acabou!
    

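That is, each method takes 5 seconds, yet all three records complete their three steps in about 15 seconds in total rather than 45. This shows that the records were processed concurrently, each in its own Task, and not one after another.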
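The timing argument can be checked with a stopwatch. The sketch below is an illustration under assumptions, not the code from the answer: `ProcessAsync` is a hypothetical three-step pipeline (standing in for parse, insert, and mark as read), and the delays are shortened to 100 ms so it runs quickly.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class TimingDemo
{
    // Hypothetical pipeline: three awaited steps of stepMs each.
    public static async Task ProcessAsync(int stepMs)
    {
        await Task.Delay(stepMs); // parse
        await Task.Delay(stepMs); // insert
        await Task.Delay(stepMs); // mark as read
    }

    // Starts every record at once, then waits for all of them.
    public static async Task<long> MeasureConcurrentAsync(int records, int stepMs)
    {
        var sw = Stopwatch.StartNew();
        Task[] tasks = Enumerable.Range(0, records)
            .Select(i => Task.Run(() => ProcessAsync(stepMs)))
            .ToArray();
        await Task.WhenAll(tasks);
        return sw.ElapsedMilliseconds;
    }

    // Finishes each record before starting the next one.
    public static async Task<long> MeasureSequentialAsync(int records, int stepMs)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < records; i++)
        {
            await ProcessAsync(stepMs);
        }
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        long concurrent = MeasureConcurrentAsync(3, 100).GetAwaiter().GetResult();
        long sequential = MeasureSequentialAsync(3, 100).GetAwaiter().GetResult();
        Console.WriteLine($"concurrent: {concurrent} ms, sequential: {sequential} ms");
    }
}
```

The concurrent run should take roughly the time of one record (three steps), while the sequential run takes roughly three times as long; exact figures vary with timer resolution and machine load.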

        
    13.10.2016 / 06:11

To write an asynchronous function you use Task, as you have already discovered. The way to do something like this is:

    public static Task MakeRequestAsync(int i)
    {
        return Task.Run(() =>
        {
            // your code here
        });
    }

    public static void Main(string[] args)
    {
        var tasks = new List<Task>();

        for (int i = 0; i < 10; i++)
        {
            tasks.Add(MakeRequestAsync(i));
        }

        // Waits for all MakeRequestAsync calls to finish.
        Task.WaitAll(tasks.ToArray());
    }
    

This way, when you call MakeRequestAsync from an async method, you can use await as expected with asynchronous methods:

    await MakeRequestAsync(i);
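For await to yield a value that can be stored in a variable, the method has to return Task<T> rather than a plain Task. A minimal sketch, assuming a hypothetical variant of `MakeRequestAsync` that simply doubles its argument in place of real work:

```csharp
using System;
using System.Threading.Tasks;

public static class ResultDemo
{
    // Hypothetical variant of MakeRequestAsync that produces a value.
    public static Task<int> MakeRequestAsync(int i)
    {
        return Task.Run(() =>
        {
            // stand-in for real work
            return i * 2;
        });
    }

    public static async Task<int> SumAsync(int count)
    {
        int total = 0;
        for (int i = 0; i < count; i++)
        {
            // await unwraps the int carried by the Task<int>
            int resp = await MakeRequestAsync(i);
            total += resp;
        }
        return total;
    }

    public static void Main()
    {
        // 0 + 2 + 4 + 6
        Console.WriteLine(SumAsync(4).GetAwaiter().GetResult()); // prints 12
    }
}
```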
    

It's also worth noting that a Task gives no guarantee of when it will be executed: it may start running immediately or wait in a queue.

Another important point is the difference between Task.WhenAll and Task.WaitAll. Task.WhenAll returns another Task that you can await whenever it suits you, and the code keeps running in the meantime; that is, it is a "non-blocking" function. Task.WaitAll stops the flow of the calling code until all the Tasks have finished.
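The two calls can be compared side by side. This is a sketch under assumptions: `SquareAsync` is a hypothetical helper that simulates work with a short delay.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllVsWaitAll
{
    // Hypothetical helper: simulates a short asynchronous operation.
    public static async Task<int> SquareAsync(int n)
    {
        await Task.Delay(50);
        return n * n;
    }

    // Non-blocking: returns a Task; the caller decides when to await it.
    public static async Task<int[]> WithWhenAllAsync()
    {
        Task<int>[] tasks = Enumerable.Range(1, 3)
            .Select(n => SquareAsync(n))
            .ToArray();
        // Results come back in the same order as the input tasks.
        return await Task.WhenAll(tasks);
    }

    // Blocking: the calling thread is held until every task completes.
    public static int[] WithWaitAll()
    {
        Task<int>[] tasks = Enumerable.Range(1, 3)
            .Select(n => SquareAsync(n))
            .ToArray();
        Task.WaitAll(tasks);
        return tasks.Select(t => t.Result).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", WithWaitAll()));                                  // prints 1,4,9
        Console.WriteLine(string.Join(",", WithWhenAllAsync().GetAwaiter().GetResult()));    // prints 1,4,9
    }
}
```

In a console Main of this kind, blocking once at the top level is harmless; inside an async method, or on a UI thread, awaiting Task.WhenAll is generally the safer choice.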

In short: a method that is meant to be asynchronous should actually start a Task and return an awaitable object. The easiest way to do this is to use Task.Run.

        
    13.10.2016 / 04:43