PHP file validation

I have a system that inserts entries from CSV files, for example:

09999999;José
08888888;Maria

The file is uploaded to the server, and I then open it to insert its rows into the database. My problem is that I need to validate the insert: the same phone number must not be inserted twice from the same file. For that I use this code:

$valida1 = array_search($numero1, $validaNumeroLista);

if (empty($valida1) ) 
{
    array_push($validaNumeroLista, $numero1);
}

After this I insert into the database. The problem is that the insertion time has increased a lot.

For example:

Before I added this validation, a file with up to 20 thousand lines took around 5 to 7 seconds. Now a file with 1 thousand lines takes more than 2 minutes, and anything over 2 thousand lines is impossible to insert.

Do you have any tips on how to improve this performance?

    
asked by anonymous 09.08.2017 / 14:49

1 answer

Use the array as a hash table instead of a plain list. The way you are using it, with array_push, every element gets a sequential index, something like this:

[0] => "09999999",
[1] => "08888888",
[2] => "07777777"

This means that array_search on this array costs O(n): it has to walk the array element by element, on average half of it when the number is already there and all of it when it is not. The cost of each check therefore grows with the size of the array.

Instead use the phone number as a key, like this:

if (!array_key_exists($numero1, $validaNumeroLista)) {
    $validaNumeroLista[$numero1] = true;
}

Notice that the existence test is done with the array_key_exists function, which checks whether a given key exists in the array. If it does not exist, we add a new entry for that key with the value true. The value could be anything else, because we are only interested in recording that the key exists.
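A minimal sketch of how this check could fit into the import loop, assuming the CSV lines have already been read into an array. The names `$linhas`, `$nome` and `$unicos` are hypothetical and not from the question, and the database insert is only marked with a comment:

```php
<?php
// Hypothetical input: lines in the "phone;name" format from the question.
$linhas = [
    "09999999;José",
    "08888888;Maria",
    "09999999;José", // repeated phone, should be skipped
];

$validaNumeroLista = []; // phone number => true
$unicos = [];            // rows that passed the duplicate check

foreach ($linhas as $linha) {
    // Split "phone;name" into its two fields.
    [$numero1, $nome] = explode(';', $linha, 2);

    if (!array_key_exists($numero1, $validaNumeroLista)) {
        $validaNumeroLista[$numero1] = true;
        $unicos[] = [$numero1, $nome]; // the database insert would go here
    }
}

print_r($unicos);
```

Only the first occurrence of each phone number reaches `$unicos`, and each check is a single key lookup regardless of how many numbers were seen before.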

Another way to test if the key does not exist would be:

if (!isset($validaNumeroLista[$numero1])) {
    $validaNumeroLista[$numero1] = true;
}

Registering the numbers as keys makes the array work as a hash table, which then looks like this:

["09999999"] => true,
["08888888"] => true,
["07777777"] => true

In this case a key lookup has constant cost, O(1) on average, so it will be much faster.
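To see the difference on your own machine, a rough sketch like the following could be used. The generated numbers and the size of 5000 are arbitrary assumptions, all variable names are hypothetical, and no particular timings are claimed here:

```php
<?php
// Generate hypothetical 8-digit phone strings, each one appearing twice.
$numeros = [];
for ($i = 0; $i < 5000; $i++) {
    $numeros[] = sprintf('%08d', $i % 2500);
}

// Approach 1: sequential list + array_search, O(n) per check.
$inicio = microtime(true);
$lista = [];
foreach ($numeros as $numero) {
    // Compare strictly against false: array_search returns index 0
    // for the first element, which is also "empty".
    if (array_search($numero, $lista, true) === false) {
        $lista[] = $numero;
    }
}
$tempoLista = microtime(true) - $inicio;

// Approach 2: numbers as keys + isset, O(1) on average per check.
$inicio = microtime(true);
$chaves = [];
foreach ($numeros as $numero) {
    if (!isset($chaves[$numero])) {
        $chaves[$numero] = true;
    }
}
$tempoChaves = microtime(true) - $inicio;

printf("array_search: %.4fs, isset: %.4fs\n", $tempoLista, $tempoChaves);
```

Both loops end up with the same set of unique numbers; only the cost of each duplicate check differs. Note that isset also returns false when the stored value is null, which is not a concern here because every value is true.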

    
09.08.2017 / 15:02