Problem uploading images to a folder when the name has accents

0

I'm uploading an image to a folder and saving only its path in the database. If the name I give the image has no accents, everything works; it fails only when the name has accents. The value saved in the database is correct and displays correctly on the website.

After the file is saved to the folder, I cannot fetch the image, because the path is, for example, "Products/zé" while the file inside the folder shows up with a garbled name such as "zÃ©".

Code that saves to the database and the folder:

if(isset($_POST['upload']) && isset($_FILES['file-image'])) {
    // The temporary path is only used on disk, never in SQL, so it is not escaped.
    $filetmp = $_FILES["file-image"]["tmp_name"];
    $filename = mysqli_real_escape_string($dbc, $_FILES["file-image"]["name"]);
    $filetype = mysqli_real_escape_string($dbc, $_FILES["file-image"]["type"]);
    $ProdName = mysqli_real_escape_string($dbc, $_POST['nameProduct']);
    $ProdPrice = mysqli_real_escape_string($dbc, $_POST['productPrice']);
    $ProdDescr = mysqli_real_escape_string($dbc, $_POST['descrProduct']);

    $filepath = "Products/" . $ProdName ;
    $info = getimagesize($filetmp);
    if ($info == FALSE or (empty($ProdName) or empty($ProdPrice) or empty($ProdDescr))) {
        alertError();
    }elseif ($info == FALSE and (empty($ProdName) or empty($ProdPrice) or empty($ProdDescr))) {
        alertError();
    }elseif ($info == TRUE and (empty($ProdName) or empty($ProdPrice) or empty($ProdDescr))) {
        alertError();
    }elseif ($info == TRUE and (!(empty($ProdName) or empty($ProdPrice) or empty($ProdDescr)))) {
        move_uploaded_file($filetmp, $filepath);
        $result = mysqli_query($dbc, "INSERT INTO images (img_name,img_path,img_type) Values('$ProdName','$filepath','$filetype')") or die(errorAdmin());
        $last_id = $dbc->insert_id;
        $insertProduct = mysqli_query($dbc, "INSERT INTO products (name_Product,prod_description,prod_price,img_id)
                                        Values('$ProdName','$ProdDescr','$ProdPrice','$last_id')") or die(errorAdmin());
        mysqli_query($dbc, "SET NAMES 'utf8'") or die(mysqli_error($dbc));
        if ($insertProduct == TRUE and $result == TRUE) {
            ?>
            <script type="text/javascript">
                swal({
                    title: 'Good Job',
                    text: 'Produto Criado Com Sucesso',
                    type: 'success',
                    confirmButtonText: 'Feito'
                });
            </script>
            <?php
        }
    }
    mysqli_close($dbc);
}
    
asked by anonymous 28.09.2016 / 04:05

1 answer

4

Using accents in file names is tricky and is unlikely to be portable.

Operating systems handle character encoding in different ways.

Linux

Linux treats file names as plain sequences of bytes. This means it does not require a specific encoding for the file name. For example, suppose we create two files whose names use different byte sequences that are nevertheless equivalent (more on that below):

$ echo "foo" > $'z\xc3\xa9.txt'
$ echo "bar" > $'ze\xcc\x81.txt'

Running ls then gives the following output:

$ ls 
zé.txt zé.txt

This means that a program using the filesystem has to take care of character encoding and normalization itself: if several files have the same visible name but different normalizations, which one should the program treat as the correct one? Not to mention the security issues.
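To make the difference concrete, here is a minimal sketch that compares the two names used in the shell example above. The variable names are illustrative only; the code uses nothing beyond core PHP string functions.

```php
<?php
// Both strings display as "zé.txt", but their bytes differ.
$precomposed = "z\xC3\xA9.txt";   // é as a single code point, U+00E9
$combining   = "ze\xCC\x81.txt";  // e followed by the combining acute accent, U+0301

var_dump($precomposed === $combining);                     // bool(false)
echo strlen($precomposed), ' ', strlen($combining), "\n";  // 7 8
echo bin2hex($precomposed), "\n";                          // 7ac3a92e747874
echo bin2hex($combining), "\n";                            // 7a65cc812e747874
```

Because the comparison is done byte by byte, a path stored in the database in one form will not match a file stored on disk in the other form.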

OSX

OSX, on the other hand, uses UTF-16 to encode file names. This is transparent to the user: if your program saves the file name using UTF-8, OSX translates it to UTF-16.

Another point to consider on OSX is how it normalizes characters. The Unicode specification defines equivalence and normalization: the same character can be represented by different sequences of code points, and the normalization forms (NFD, NFC, NFKD, NFKC) define which representation is used. To illustrate, save a file whose name contains the UTF-8 sequence for the precomposed character é (\xC3\xA9):

<?php
file_put_contents("z\xC3\xA9.txt","xyz");
?>

And check what was actually recorded:

<?php
list($file) = glob('z*');
echo urlencode($file) . "\n";
?>

We will have the following output:

ze%CC%81.txt

OSX wrote e followed by 0xCC81 instead of 0xC3A9. This shows that OSX uses a decomposed normalization that takes three bytes: the e plus two bytes that encode the combining acute accent. Together, these three bytes form the letter é.

Windows

Windows (with NTFS) also uses UTF-16 and, like OSX, translates file names transparently. Normalization is handled differently, though: if we run the same code that we ran on OSX, the output shows the characters unchanged.

Conclusion

These are two equivalent ways of writing the same letter in Unicode. There are two more normalization forms, which are beyond the scope of this answer. The point is that if you really want to use accents in file names, you will need to take some care and will probably need an intermediate layer when saving those files.

PHP's intl extension offers a class called Normalizer that lets you control character normalization. It is a good option if you want to mediate between these encodings.
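As a rough sketch of how Normalizer can be used, the snippet below brings both representations of "zé" to the same form before comparing them. It assumes the intl extension is loaded; the variable names are illustrative only.

```php
<?php
// Assumes the intl extension is available (it provides the Normalizer class).
$composed   = "z\xC3\xA9";   // é as one code point (U+00E9)
$decomposed = "ze\xCC\x81";  // e + combining acute accent (U+0301)

var_dump($composed === $decomposed);  // bool(false): the bytes differ

// Normalizing both strings to the same form (NFC here) makes them comparable.
$a = Normalizer::normalize($composed, Normalizer::FORM_C);
$b = Normalizer::normalize($decomposed, Normalizer::FORM_C);
var_dump($a === $b);  // bool(true)

// Converting the composed string to NFD yields the three-byte é seen on OSX.
echo bin2hex(Normalizer::normalize($composed, Normalizer::FORM_D)), "\n";  // 7a65cc81
```

Which form to pick is a design choice; what matters is that the name written to disk and the name stored in the database go through the same normalization.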

My suggestion is to convert accented characters to plain ASCII whenever you can.
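One possible way to do that conversion is sketched below. The function name asciiFileName is hypothetical, and the code assumes the intl extension is loaded. It decomposes the name, drops the combining marks, and replaces anything outside a conservative set of characters with a hyphen.

```php
<?php
// Hypothetical helper: turn a product name into an ASCII-only file name.
// Assumes the intl extension is available (for the Normalizer class).
function asciiFileName(string $name): string
{
    // Decompose accented letters, e.g. é becomes e + U+0301.
    $decomposed = Normalizer::normalize($name, Normalizer::FORM_D);
    if ($decomposed === false) {
        throw new InvalidArgumentException('Name is not valid UTF-8');
    }
    // Drop the combining marks, leaving only the base letters.
    $ascii = preg_replace('/\p{Mn}+/u', '', $decomposed);
    // Replace every run of characters outside the whitelist with a hyphen.
    $ascii = preg_replace('/[^A-Za-z0-9._-]+/', '-', $ascii);
    return trim($ascii, '-');
}

echo asciiFileName("z\xC3\xA9"), "\n";               // ze
echo asciiFileName("Caf\xC3\xA9 com leite"), "\n";   // Cafe-com-leite
```

In the question's code, the result could be used to build $filepath while the original, accented name is still stored in the database for display. Letters that have no decomposition (ø, for example) are replaced by a hyphen in this sketch, and an empty result would still need to be handled by the caller.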

    
28.09.2016 / 21:36