Read multiple .txt files and remove duplicate information

0

I have several .txt files containing license plates. Some plates appear in more than one file, and I would like to remove all the duplicates.

The first solution I thought of was to compare all the files against each other using PHP or Node.js, but I am a bit lost and do not know whether that would be the best approach.

Then I thought about loading everything into a database just for processing, since I need the plates to end up in .txt files. However, the database would become huge and perhaps impractical, because there are hundreds of files and thousands of plates.

What would be the best solution to this problem? Would either of the two approaches above work?

    
asked by anonymous 19.12.2017 / 00:40

1 answer

1

You can use the file function to read a file and load all of its lines into an array.

Then just use the array_unique function to remove the duplicate lines.

01.txt

valdeir
psr
naval
fuz. nav
valdeir
valdeir psr
stackoverflow

PHP Code

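<?php

// Collect every .txt file in the current directory.
$files = glob("*.txt");

$content = [];

foreach ($files as $file) {
    // file() returns the lines as an array; FILE_IGNORE_NEW_LINES strips the line endings.
    $content = array_merge($content, file($file, FILE_IGNORE_NEW_LINES));
}

// Keep only the first occurrence of each line (original keys are preserved).
$contentUniq = array_unique($content);

var_export($contentUniq);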

Output

array (
  0 => 'valdeir',
  1 => 'psr',
  2 => 'naval',
  3 => 'fuz. nav',
  5 => 'valdeir psr',
  6 => 'stackoverflow',
)
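
Since the question mentions hundreds of files and needing the result back in .txt form, here is a minimal sketch of a streaming variant. It is not part of the approach above: the function name dedupePlates is hypothetical, and trimming whitespace and skipping blank lines are assumptions about how the plates are stored. It reads one line at a time and uses array keys as a set, so only the unique plates are held in memory, and it writes each one to the output file the first time it is seen. Matching is exact and case-sensitive.

```php
<?php

/**
 * Hypothetical helper (not from the original answer): streams every input
 * file line by line and writes each distinct line once to $outFile.
 * Returns the number of unique lines written.
 */
function dedupePlates(array $files, string $outFile): int
{
    $seen = [];                       // array keys act as a set of lines already written
    $out = fopen($outFile, 'w');
    if ($out === false) {
        throw new RuntimeException("Cannot open $outFile for writing");
    }

    foreach ($files as $file) {
        $in = fopen($file, 'r');
        if ($in === false) {
            continue;                 // skip files that cannot be read
        }
        while (($line = fgets($in)) !== false) {
            $plate = trim($line);     // assumption: surrounding whitespace is not significant
            if ($plate === '' || isset($seen[$plate])) {
                continue;             // skip blank lines and repeats
            }
            $seen[$plate] = true;
            fwrite($out, $plate . "\n");
        }
        fclose($in);
    }

    fclose($out);
    return count($seen);
}
```

It could be called as dedupePlates(glob("*.txt"), "unique_plates.list"). Giving the output a different extension or directory is a deliberate choice here, so that a second run of glob("*.txt") does not pick the result up as an input; it can be renamed to .txt afterwards.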
    
19.12.2017 / 01:24