How to validate the structure of a text file in PHP?

4

I created an admin area where users upload .txt files to my FTP server.

I would like to enforce a format. For example, every line of the file must contain three columns separated by semicolons.

Example :

valid_file.txt

  

name, height, name, height
height

Any file that does not respect this format should be ignored.

Example :

Invalid_file.txt

  

home, age, city, height of village, age, father, children
age, height

    
asked by anonymous 27.11.2017 / 15:13

2 answers

8

I tried several ways to make this efficient, but none of them could validate everything at once, so I ended up using a while loop with fgets (or fgetcsv).

The format you want is basically CSV. However, CSV is a simple format with no built-in way to limit the number of columns, so each line has to be checked by hand. An example of such a check:

<?php
function validaCSV($arquivo, $limite = 3, $delimitador = ';', $tamanho = 0)
{
    $handle = fopen($arquivo, 'rb');
    $valido = true;

    if ($handle) {
        while (feof($handle) === false) {
            $data = fgetcsv($handle, $tamanho, $delimitador);

            if ($data && count($data) !== $limite) {
                $valido = false; // Mark the file as invalid
                break;
            }
        }

        fclose($handle);
    } else {
        // The file could not be opened, so it cannot be considered valid
        $valido = false;
    }

    return $valido;
}

Usage example (the default expected column count is 3):

var_dump(validaCSV('arquivo.txt')); // Checks that every line has 3 columns
var_dump(validaCSV('arquivo.txt', 5)); // Checks that every line has 5 columns

It returns true if the file is valid and false otherwise.
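
When a file is rejected, it can help to know which line broke the format. The following is a minimal sketch of a variant of the function above, assuming PHP 7.1 or later, a semicolon delimiter and three columns. The name primeiraLinhaInvalida and its return convention (a record number, or null when every line matches) are hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: returns the 1-based number of the first record whose
// column count differs from $limite, or null when every record matches.
// Records are counted as fgetcsv() reads them. Throws if the file cannot be opened.
function primeiraLinhaInvalida(string $arquivo, int $limite = 3, string $delimitador = ';'): ?int
{
    $handle = fopen($arquivo, 'rb');

    if ($handle === false) {
        throw new RuntimeException('Could not open file: ' . $arquivo);
    }

    $numero = 0;

    try {
        // fgetcsv() returns false at the end of the file, which ends the loop
        while (($data = fgetcsv($handle, 0, $delimitador, '"', '\\')) !== false) {
            $numero++;

            // A blank line comes back as [null], so it counts as one column
            if (count($data) !== $limite) {
                return $numero;
            }
        }
    } finally {
        // Runs on every exit path, so the handle is always closed
        fclose($handle);
    }

    return null;
}
```

A caller could then report something like "line 7 does not have 3 columns" instead of a bare false.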

If you want to read the file once it has been validated, use the following.

To avoid memory peaks when the file is invalid, I used two while loops: the first only validates and the second reads. It is a bit slower, but an invalid file will not waste server resources.

Note: the example uses yield, so the function is a generator and you can consume the rows in your own loop.

function lerCSV($arquivo, $limite = 3, $delimitador = ';', $tamanho = 0)
{
    $handle = fopen($arquivo, 'rb');

    if ($handle) {
        while (feof($handle) === false) {
            $data = fgetcsv($handle, $tamanho, $delimitador);

            if ($data && count($data) !== $limite) {
                fclose($handle); // Release the file before aborting
                throw new Exception('Unexpected number of columns, expected ' . $limite);
            }
        }

        // Moves the pointer back to the start of the file so the second while can read it again
        rewind($handle);

        while (feof($handle) === false) {
            $data = fgetcsv($handle, $tamanho, $delimitador);

            if ($data) { // Skips the false that fgetcsv returns at the end of the file
                yield $data;
            }
        }

        fclose($handle);
    } else {
        throw new Exception('Invalid file: ' . $arquivo);
    }
}

Example usage:

foreach(lerCSV('a.csv') as $linha) {
    var_dump($linha);
}

It throws an Exception if the file does not exist or cannot be opened, or if any line does not have the number of columns set in the function (default is 3).
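
Since the question asks for invalid files to be ignored, the exception can simply be caught per file. The following is a minimal sketch, not part of the original answer: the helper name lerSomenteValidos is hypothetical, and the reader is passed in as a callable (for example 'lerCSV' from above) so the block stands on its own.

```php
<?php
// Hypothetical helper: reads every file with $leitor (for example 'lerCSV')
// and keeps only the files that were read without an exception.
// Returns [path => rows]; invalid or unreadable files are left out.
function lerSomenteValidos(array $arquivos, callable $leitor): array
{
    $resultado = [];

    foreach ($arquivos as $arquivo) {
        try {
            $linhas = [];

            // A generator only runs (and only throws) while it is iterated,
            // so the loop must be inside the try block
            foreach ($leitor($arquivo) as $linha) {
                $linhas[] = $linha;
            }

            $resultado[$arquivo] = $linhas;
        } catch (Exception $e) {
            // Ignore this file; $e->getMessage() could be logged here
            continue;
        }
    }

    return $resultado;
}
```

It could be called as lerSomenteValidos(glob('uploads/*.txt'), 'lerCSV'), where the uploads path is only an illustration. Note that this sketch collects all rows of each file in memory, which is fine for small files but gives up the memory advantage of yield.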

Extra (with SplFileObject)

I was looking into what happens to the open file when yield is used: if there is a break; inside the foreach, the fclose call is never reached and the file may not be closed. With SplFileObject, the file is released when the object is destroyed (the class's internal destructor runs), as explained in a related question.

The SPL version looks like this:

<?php

function SplLerCSV($arquivo, $limite = 3, $delimiter = ';', $enclosure = '"', $escape = '\\')
{
    $file = new SplFileObject($arquivo);
    $minCol = $limite - 1;

    while ($file->eof() === false) {
        $data = $file->fgetcsv($delimiter, $enclosure, $escape);

        if (isset($data[$minCol]) && count($data) !== $limite) {
            throw new Exception('The number of columns exceeded the limit of ' . $limite);
        }
    }

    // Moves the pointer back to the start of the file so the second while can read it again
    $file->rewind();

    while ($file->eof() === false) {
        $data = $file->fgetcsv($delimiter, $enclosure, $escape);

        if (isset($data[$minCol])) { // Skips blank lines (returned as [ 0 => NULL ]) and rows shorter than $limite
            yield $data;
        }
    }
}

// Usage
foreach (SplLerCSV('a.csv') as $value) {
    var_dump($value);
}
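
As a possible alternative to the manual eof() loop above, SplFileObject can parse CSV and skip blank lines by itself when the right flags are set. The following is a minimal sketch, assuming PHP 7.1 or later; the function name splValidaCSV is hypothetical, and unlike SplLerCSV it also rejects rows with fewer columns than expected.

```php
<?php
// Sketch: let SplFileObject parse the CSV and skip blank lines itself.
// Returns true when every non-empty line has exactly $limite columns.
function splValidaCSV(string $arquivo, int $limite = 3, string $delimiter = ';'): bool
{
    // The constructor throws RuntimeException if the file cannot be opened
    $file = new SplFileObject($arquivo, 'rb');

    // READ_CSV makes iteration return arrays; the other three flags
    // together make the iterator skip empty lines
    $file->setFlags(
        SplFileObject::READ_CSV
        | SplFileObject::READ_AHEAD
        | SplFileObject::SKIP_EMPTY
        | SplFileObject::DROP_NEW_LINE
    );
    $file->setCsvControl($delimiter, '"', '\\');

    foreach ($file as $data) {
        // Defensive guard: ignore anything that is not a parsed row
        if (!is_array($data)) {
            continue;
        }

        if (count($data) !== $limite) {
            return false;
        }
    }

    return true;
}
```

Because the function returns as soon as a bad row is found, the object goes out of scope and the file is released without an explicit close.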
    
27.11.2017 / 16:00
3

Validating the file structure (.txt)

PHP

<?php
if (isset($_POST['botao'])) {
    $invalido = false;
    // Get the uploaded file from the form
    $arquivo_tmp = $_FILES['arquivo']['tmp_name'];

    // Read the whole file into an array, one element per line
    $dados = file($arquivo_tmp);

    // Walk the array to check the structure of each line
    foreach ($dados as $linha) {
        // Each line must contain 3 non-empty columns separated by ; (semicolon).
        // rtrim() drops the line break; 'strlen' removes empty fields but keeps "0".
        $colunas = array_filter(explode(';', rtrim($linha, "\r\n")), 'strlen');

        if (count($colunas) !== 3) {
            echo "Nope, the structure does not match";
            // Blocks the upload
            $invalido = true;
            // Stops the foreach at the first invalid line
            break;
        }
    }

    if (!$invalido) {
        echo "structure ok";
        // upload here
    }

}
?>

Form used in the online test.

<form method="POST" action="" enctype="multipart/form-data">
    <label>File</label>
    <!-- Field for uploading the file with PHP -->
    <input type="file" name="arquivo"><br><br>          
    <button type="submit" name="botao">Upload</button>
</form>
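
The per-line check in the PHP code above can also be pulled out into a small function, which makes it testable without a form. The following is a minimal sketch, assuming PHP 7.0 or later; the name linhaValida is hypothetical, and it is stricter than the array_filter approach because an empty field in any position makes the line invalid. It does not handle quoted fields that contain the separator; fgetcsv is better suited for that.

```php
<?php
// Hypothetical helper: true when $linha has exactly $colunas non-empty
// fields separated by $separador. The trailing line break is ignored.
function linhaValida(string $linha, int $colunas = 3, string $separador = ';'): bool
{
    $campos = explode($separador, rtrim($linha, "\r\n"));

    if (count($campos) !== $colunas) {
        return false;
    }

    foreach ($campos as $campo) {
        // trim() so that a field made only of spaces counts as empty
        if (trim($campo) === '') {
            return false;
        }
    }

    return true;
}
```

Inside the foreach, the condition would then become if (!linhaValida($linha)).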

online test here

    
27.11.2017 / 19:06