How to optimize memory consumption using Doctrine


I need to read two files and save their contents in the database. The first (smaller) one has about 49,000 rows (each row becomes a record). I parse this file by instantiating an object per row and then saving that object to the database using Doctrine. I can save about 13,000 records, after which the script dies due to memory consumption.

Although the file is large (more than 100 MB), I do not load the entire file into memory; I read one line at a time. The complete process of reading the file, creating the instances, parsing, and loading all the other data the script needs (everything except Doctrine) never consumes more than 30 MB (less than 2.5% of RAM). Only when Doctrine runs does memory consumption increase absurdly.
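For context, reading one line at a time keeps memory bounded by the longest line rather than the file size (unlike `file()` or `file_get_contents()`). A minimal sketch of that reading style (the function name is illustrative, not from the question):

```php
<?php
// Read a file one line at a time with fgets(): memory use is bounded by
// the longest single line, not by the total file size.
function countLines(string $path): int
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("cannot open $path");
    }
    $count = 0;
    while (fgets($handle) !== false) {
        $count++;
    }
    fclose($handle);
    return $count;
}
```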

I know it is possible to raise the memory limit by changing php.ini directives or using ini_set, but I would rather not do that. The objects just need to be written to the database; they do not need to stay in memory.
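For completeness, the workaround being rejected here would look like this (the value is an arbitrary example):

```php
<?php
// The rejected workaround: raise PHP's memory ceiling at runtime for
// this script only. '512M' is an illustrative value, not a recommendation.
ini_set('memory_limit', '512M');
```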

I tried to retrieve the persisted entities and remove them from memory using detach:

private function clearBatch() {
    $id = $this->last->getId() - 1000;

    $query = $this->getDoctrine()->getEntityManager()->createQuery(
        sprintf("SELECT c FROM {MyObjet} c WHERE c.id >= '%d'", $id > 0 ? $id : 0)
    );
    $cache = $query->getResult();
    foreach ($cache as $c) {
        $this->getDoctrine()->getManager()->detach($c);
    }
    $query->free();
    $this->getDoctrine()->getManager()->clear();
    $this->getDoctrine()->resetManager();
    gc_collect_cycles();
}

But the script always dies. I believe Doctrine is keeping every inserted entity in memory, but I just need to feed the database with this information; it does not need to stay in memory.

Every 1000 records I persist, I run clearBatch:

public function onProcess($c) {
    $this->getDoctrine()->getManager()->persist($c);
    $this->count++;
    $this->last = $c;
    if ($this->count % 1000 == 0) {
        $this->getDoctrine()->getManager()->flush();
        $this->clearBatch();
    }
}

I even tried calling the garbage collector explicitly.

If I do not persist the objects, I can read all the records and my memory consumption stays constant: it reaches its peak of 2.4% right at the start of the script and stays there until the end (this is a test-only VMware VM with 1 GB of RAM). With persistence, the script dies when it reaches about 14%.

The onProcess function is called in the file reading loop.
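The loop itself is not shown in the question, but the persist-then-flush-every-N pattern it drives can be sketched generically in plain PHP (the function name and the `$flush` callback are stand-ins for Doctrine's `persist()`/`flush()`/`clear()`, not the asker's actual code):

```php
<?php
// Generic batch pattern: accumulate items, invoke a flush callback every
// $batchSize items, and once more at the end for the remainder.
function processInBatches(iterable $items, int $batchSize, callable $flush): int
{
    $count = 0;
    $buffer = [];
    foreach ($items as $item) {
        $buffer[] = $item;       // stands in for $em->persist($item)
        $count++;
        if ($count % $batchSize === 0) {
            $flush($buffer);     // stands in for $em->flush(); $em->clear()
            $buffer = [];        // drop references so PHP can free the batch
        }
    }
    if ($buffer !== []) {
        $flush($buffer);         // flush the final partial batch
    }
    return $count;
}
```

Dropping the buffered references after each flush is what keeps memory flat; with Doctrine, `clear()` plays that role by detaching every managed entity.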

asked by anonymous 09.02.2015 / 18:59

1 answer


The problem is precisely the size of the file: judging by the number of lines alone, it must be well over 50 MB. So there is not much you can do; memory consumption will be high either way.

I suggest two things. First option: write a script that does this processing in pure PHP. Consumption will still be high, but you will be able to optimize it more easily.

If you want to keep using Doctrine, start by storing the entityManager in a property ($this->em), initialized in the constructor for example, instead of calling $this->getDoctrine()->getManager() every time.

Another thing: try using only clear() in the onProcess function:

public function __construct() {
    // enable the garbage collector
    gc_enable();

    // keep the entityManager in a property
    $this->em = $this->getDoctrine()->getManager();

    // disable SQL logging (saves some memory by not sending every query to the logger)
    $this->em->getConnection()->getConfiguration()->setSQLLogger(null);
}

public function onProcess($c) {

    $this->em->persist($c);
    $this->count++;

    unset($c);

    // use a smaller batch to avoid accumulating too many statements
    if ($this->count % 100 == 0) {
        $this->em->flush();
        $this->em->clear(); // detach all managed objects

        gc_collect_cycles();
    }
}

Your clearBatch method will only slow the process down: clear() already detaches every managed entity, and the SELECT it runs hydrates another batch of objects into memory just to detach them again.

10.02.2015 / 12:02