Problems with PHP regular expressions?

I'm reading a .txt file and removing the letters and the blank lines. I'm having a problem with the special characters \t and \s: they are not recognized.

The code is below:

<?php

function pass1() {
    $treat = fopen ("C:\Users\Bridge\Downloads\D_lotfac\lott.txt", "r+w+");
    $treat1 = fopen ("C:\Users\Bridge\Downloads\D_lotfac\lott1.txt", "r+w+");

    while (!feof ($treat)) {
        $linha = fgets($treat,4096);
        $patterns = array();
        $patterns [0] = '/[(A-Z)i]*/';
        $patterns [1] = '/Â|Ã|Á|À|É|Ê|Í|Î|Ç|Ó|Õ|Ô|Ö|Ú|Û|Ü/';
        $patterns [2] ='/ã|â|à|á|é|ê|í|î|ç|ó|ô|ô|ö|ú|û|ü/';
        $patterns [3] = '/\t/';                 
        $patterns [4] = '/[(a-z)i]*/';
        $patterns [5] = '   ';

        $replacements = array();
        $replacements[] = '';
        $linha = preg_replace($patterns, $replacements, $linha);
        fwrite ($treat1, $linha); 

        printf($linha . "<br>");
        }
}

The file lott1.txt is generated correctly, except that the tabs are not removed, and neither are the runs of spaces (2x, 3x, etc.). I have already tried putting a literal tab between the quotes and putting \t inside the $patterns array. Neither one is deleted.

What's the problem?

    
asked by anonymous 13.02.2017 / 19:03

1 answer

First

\s is not just "space"!

\s matches any whitespace character, which includes the space, the tab (\t), and the line break.
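
To make the difference concrete, here is a minimal sketch comparing \s with a literal space. The sample string and the variable names are made up for illustration and are not taken from the question.

```php
<?php
// Hypothetical sample: contains one space, one tab, and one line break.
$sample = "a b\tc\nd";

// \s+ matches every run of whitespace: space, tab, and line break.
$noWhitespace = preg_replace('/\s+/', '', $sample);

// A literal space matches spaces only; the tab and line break stay.
$noSpaces = preg_replace('/ +/', '', $sample);

var_dump($noWhitespace === "abcd");     // bool(true)
var_dump($noSpaces === "ab\tc\nd");     // bool(true)
```

Note that using \s on each line read with fgets would also remove the line break at the end of the line, which may not be what you want.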

Problem

  • From what I can tell, you also want to match accented characters. For this I use the u modifier, which makes the pattern and the subject be treated as UTF-8.
  • To match both upper and lower case you can use [a-zA-Z], [[:alpha:]], or [a-z] with the i modifier.
  • If you want to remove all tabs and spaces you can use [\t ]+.

Solution

In summary your pattern would be:

~[a-z\t ]+~iu
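
As a rough sketch of how this pattern behaves, the snippet below applies it to a made-up line. The input strings and variable names are hypothetical. One caveat: [a-z], even with the i and u modifiers, does not match accented letters such as ê, so the second call uses \p{L} (any Unicode letter) instead. That variant is an assumption about what you want, not part of the pattern above, and it assumes the script file is saved as UTF-8.

```php
<?php
// Hypothetical input line: letters, a tab, runs of spaces, and numbers.
$linha = "Concurso\t1450   Data  01 02 03\n";

// Pattern from the answer: ASCII letters, tabs, and spaces are removed.
$limpa = preg_replace('~[a-z\t ]+~iu', '', $linha);

// Assumed variant: \p{L} also covers accented letters (requires the u modifier).
$semLetras = preg_replace('~[\p{L}\t ]+~u', '', "Prêmio\t10   São 20\n");

var_dump($limpa === "1450010203\n");   // bool(true)
var_dump($semLetras === "1020\n");     // bool(true)
```

Because the replacement is an empty string, the numbers end up glued together. If they should stay separated, replacing each match with a single space is one possible alternative.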

Note

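  • [(A-Z)i] - if your intention was to create a group with A-Z, that does not happen inside [ ... ]: the parentheses are interpreted literally. The i inside the brackets is likewise a literal letter, not the case-insensitive modifier, which goes after the closing delimiter.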
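
A small sketch of the point above, using a made-up sample string (the variable names are hypothetical as well). It shows that the class from the question removes the parentheses and the letter i, and compares it with placing the i modifier after the delimiter.

```php
<?php
// Hypothetical sample with parentheses, uppercase letters, and a lowercase i.
$sample = "(ABC) xiy 12";

// Inside [ ... ], "(", ")" and "i" are ordinary members of the class,
// so they are removed together with A-Z; x and y are kept.
$fromQuestion = preg_replace('/[(A-Z)i]*/', '', $sample);

// With i as a modifier, [a-z] matches both cases and the parentheses survive.
$withModifier = preg_replace('/[a-z]+/i', '', $sample);

var_dump($fromQuestion === " xy 12");   // bool(true)
var_dump($withModifier === "()  12");   // bool(true)
```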
14.02.2017 / 13:25