Memory Error When Inserting Millions of Records Using Entity Framework


I'm using Entity Framework to insert and update thousands of records.

At first it was slow, but adding the code below improved the speed:

db.Configuration.AutoDetectChangesEnabled = false;  
db.Configuration.ValidateOnSaveEnabled = false;

That solved the speed problem.

However, I can't get to the end of the load: as records are inserted the context keeps growing, and memory consumption increases until I get an exception at around 1.5 GB.

Note: I've already tried using AsNoTracking().

I've also tried recreating the context from time to time, but that doesn't lower consumption; it only keeps increasing.

Has anyone been through this, or does anyone have any ideas?

Part of the code:

foreach (var prd in produtoGradeAux)
{
    if (dbPdv.Database.Connection.State != ConnectionState.Open)
        dbPdv.Database.Connection.Open();

    using (var transaction = dbPdv.Database.Connection.BeginTransaction(System.Data.IsolationLevel.ReadUncommitted))
    {
        dbPdv.Database.UseTransaction(transaction);
        i++;
        Produto prodAux = null;
        var pAux = dbPdv.produto_grade.AsNoTracking().FirstOrDefault(x => x.produto_GradeIdGw == prd.produto_gradeID);

        if (prd.cd_grade.Trim().Length > 6)
            pAux = dbPdv.produto_grade.AsNoTracking().FirstOrDefault(x => x.cd_grade.Trim() == prd.cd_grade.Trim());

        if (prd.cd_grade.Trim().Length > 6)
            prodAux = dbPdv.produto_grade.AsNoTracking().Where(x => x.Produto.cd_ref.Trim() == prd.cd_grade.Trim().Substring(0, 6)).Select(x => x.Produto).FirstOrDefault();

        int lnFamiliaId, lnGrupoId, lnUnidadeId, lnMarcaId, lnLinhaId;

        RetGrupos(prd, out lnFamiliaId, out lnGrupoId, out lnUnidadeId, out lnMarcaId, out lnLinhaId, dbPdv);

        if (pAux == null)
        {
            if (prodAux == null)
                prodAux = RetProduto(dbPdv, prd.Produto, lnFamiliaId, lnGrupoId, lnUnidadeId, lnMarcaId, lnLinhaId);
            pAux = RetProdutoGrade(dbPdv, prodAux, prd);
            SetProdutoEan(dbPdv, prd, pAux);
            SetProdutoCf(dbPdv, prd, pAux);
            SetProdutoEstoque(dbPdv, lojaAux, prd, pAux);
            SetProdutoPreco(dbPdv, prd, pAux);
        }
        else
        {
            AtuProdutoGrade(dbPdv, prd, pAux);
            AtuProduto(dbPdv, prd, pAux, lnFamiliaId, lnGrupoId, lnUnidadeId, lnMarcaId, lnLinhaId);
            AtuProdutoEan(dbPdv, prd, pAux);
            AtuProduto_Cf(dbPdv, prd, pAux);
            AtuProduto_Preco(dbPdv, prd, pAux);
            AtuProdutoEstoque(dbPdv, lojaAux, prd, pAux);
        }

        transaction.Commit();

        // Try to improve performance: recreate the context every 1000 records
        if (i % 1000 == 0)
        {
            HabilitaDb(dbPdv);
            dbPdv.Dispose();
            dbPdv = GetDbPdv(pdvAux);
            DesabilitaDb(dbPdv);
        }
    }
}
    
asked by anonymous 04.09.2015 / 15:14

2 answers


Entity Framework is not meant for inserting a million records: it keeps an internal index of every entity in memory, so consumption grows with each record, and AsNoTracking() will not solve that. I have already run the most thorough tests possible on this.

For this kind of insertion you should move to a smaller ORM, with less management overhead: a micro-ORM such as Dapper, PetaPoco, etc.
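For illustration, a minimal Dapper sketch of such an insert. This is an assumption-laden example: connectionString, the target column list and the parameter names are illustrative, not taken from the question's schema.

using System.Data.SqlClient;
using System.Linq;
using Dapper;

// Dapper runs the INSERT once per element of the sequence and does no
// change tracking, so memory use stays flat regardless of row count.
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    conn.Execute(
        "INSERT INTO produto_grade (cd_grade, produto_GradeIdGw) VALUES (@CdGrade, @IdGw)",
        produtoGradeAux.Select(p => new { CdGrade = p.cd_grade, IdGw = p.produto_gradeID }));
}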

If that is not feasible, use a bulk insert, an ADO.NET-based mechanism usable from Entity Framework: EntityFramework.BulkInsert.
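Usage is essentially a single call on the context. A sketch, assuming the package is installed from NuGet; MapearParaEntidade is a hypothetical mapping helper, not something from the question:

using EntityFramework.BulkInsert.Extensions;

// Build the entities in memory without Add()-ing them to the context,
// so the change tracker never grows, then insert everything in one shot.
// MapearParaEntidade is a hypothetical helper that maps a source row
// to a produto_grade entity.
var novos = produtoGradeAux.Select(prd => MapearParaEntidade(prd)).ToList();
dbPdv.BulkInsert(novos);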

In my opinion, for processes that need performance over large volumes of data, these cases call for plain ADO.NET, taking the connection data from the Entity Framework context or from the config file. That way you are not limited to inserts (bulk insert); you can perform other operations too. On a large system, another business option is to move to NHibernate. This comes from personal experience and from tests I have run.
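As a sketch of the plain ADO.NET route with SqlBulkCopy (the table and column names are illustrative, and connectionString could be read from dbPdv.Database.Connection.ConnectionString):

using System.Data;
using System.Data.SqlClient;

// Stage the rows in a DataTable whose columns match the target table.
var table = new DataTable();
table.Columns.Add("cd_grade", typeof(string));
table.Columns.Add("produto_GradeIdGw", typeof(int));

foreach (var prd in produtoGradeAux)
    table.Rows.Add(prd.cd_grade, prd.produto_gradeID);

using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "produto_grade";
    bulk.BatchSize = 5000; // send rows to the server in batches
    bulk.WriteToServer(table);
}

For millions of rows you can also fill and flush the DataTable in chunks, so the client never holds the whole data set at once.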

If you are using Entity Framework Core, see this link.

    
answered 04.09.2015 / 16:00

In this answer I explain several alternatives you can use to improve performance, but I don't think even all of them used together will completely solve your problem.

There are a few things that caught my attention in your code. For example:

if (dbPdv.Database.Connection.State != ConnectionState.Open)
   dbPdv.Database.Connection.Open();

This is not necessary. Entity Framework itself manages the connection lifecycle; in fact, if you open the connection manually, EF assumes you own it and will not close it for you.
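In other words, the check and the manual Open() can simply be removed; a simplified sketch of the same loop:

// EF opens and closes the connection as each query or SaveChanges runs.
foreach (var prd in produtoGradeAux)
{
    var pAux = dbPdv.produto_grade.AsNoTracking()
                    .FirstOrDefault(x => x.produto_GradeIdGw == prd.produto_gradeID);
    // ... rest of the processing ...
}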

This:

using (var transaction = dbPdv.Database.Connection.BeginTransaction(System.Data.IsolationLevel.ReadUncommitted))
{
    dbPdv.Database.UseTransaction(transaction);
    ...

This also is not in line with Entity Framework best practices (among other things, TransactionScope gives you distributed transaction support). The correct approach would be to open a TransactionScope with IsolationLevel set to ReadUncommitted:

// requires a reference to System.Transactions
using (var scope = new TransactionScope(TransactionScopeOption.Required,
            new TransactionOptions()
            {
                IsolationLevel = IsolationLevel.ReadUncommitted
            }))
{
    // ... work with the EF context goes here ...

    scope.Complete(); // without this call, the transaction rolls back on Dispose
}

For your case, @Daniloloko's answer is the way to go; the package to use is EntityFramework.BulkInsert, available on NuGet.

    
answered 04.09.2015 / 16:11