mb_convert_encoding vs utf8_encode()

After upgrading PHP on the server, I noticed that strings were no longer being encoded as UTF-8. The first thing I checked was the connection class I use, which in this case is adodb.

In my connection class I convert columns that are in another encoding to UTF-8 like this:

 $dados[$i]  = mb_convert_encoding($dados[$i],"UTF-8");

In PHP 5.2 this worked normally; after updating to PHP 5.6.9 it "stopped working". I checked the documentation: the function is unchanged and is not deprecated.

To solve the problem I used utf8_encode, like this:

 $dados[$i]  = utf8_encode($dados[$i]);

Detail: I use adodb + Firebird.

Questions:

1. What is the difference between these functions, and why is mb_convert_encoding no longer working?

2. The data returned by my database is detected as ASCII. Is it possible to simplify this conversion from UTF-8 to ASCII and from ASCII to UTF-8?

My php.ini has the default setting default_charset = "UTF-8" // line 680

Testing:

$f[$i] = mb_detect_encoding($f[$i]); //  ASCII

$f[$i] = mb_convert_encoding($f[$i], "HTML-ENTITIES", "UTF-8");
SUÉLLEM // Input
SU�LLEM // Output

$f[$i] = mb_convert_encoding($f[$i], "ISO-8859-1", "UTF-8");
SUÉLLEM // Input
SU?LLEM // Output

$f[$i] = mb_convert_encoding($f[$i], 'UTF-8', 'ISO-8859-1');
SUÉLLEM // Input
SUÉLLEM // Output

Only the third test produces the correct output, which seems strange!

Example of the _fetch method of adodb:
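
One way to make sense of the three tests is to look at the raw bytes instead of what the browser or terminal displays. The sketch below assumes the driver is returning ISO-8859-1 bytes (an assumption, since the connection charset is not stated above); the variable names are hypothetical.

```php
<?php
// Assumption: the driver returns ISO-8859-1, so "É" arrives as the single byte 0xC9.
$raw = "SU\xC9LLEM";

// A lone 0xC9 is not a complete UTF-8 sequence, so this string is not valid UTF-8.
$isValidUtf8 = mb_check_encoding($raw, 'UTF-8');

// Naming the real source encoding yields the two-byte UTF-8 form 0xC3 0x89.
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');

var_dump($isValidUtf8);     // bool(false)
echo bin2hex($raw), "\n";   // 5355c94c4c454d
echo bin2hex($utf8), "\n";  // 5355c3894c4c454d
```

If the data really looks like this, the first two tests fail because they declare the input to be UTF-8 when it is not, and the third succeeds because its source encoding matches the bytes.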

   function _fetch() {
     $f = @ibase_fetch_row($this->_queryID);
     if ($f === false) {
       $this->fields = false;
       return false;
     }
     // OPN stuff start - optimized
     // fix missing nulls and decode blobs automatically

     global $ADODB_ANSI_PADDING_OFF;
     //$ADODB_ANSI_PADDING_OFF=1;
     $rtrim = !empty($ADODB_ANSI_PADDING_OFF);
     for ($i = 0, $max = $this->_numOfFields; $i < $max; $i++) {
       if ($this->_cacheType[$i] == "BLOB") {
         if (isset($f[$i])) {
           $f[$i] = $this->connection->_BlobDecode($f[$i]);
         } else {
           $f[$i] = null;
         }
       } else {
         if (!isset($f[$i])) {
           $f[$i] = null;
         } else if ($rtrim && is_string($f[$i])) {
           $f[$i] = rtrim($f[$i]);
         }
       }

       $f[$i] = utf8_encode($f[$i]);

     }
     // OPN stuff end

     $this->fields = $f;
     if ($this->fetchMode == ADODB_FETCH_ASSOC) {
       $this->fields = $this->GetRowAssoc(ADODB_ASSOC_CASE);
     } else if ($this->fetchMode == ADODB_FETCH_BOTH) {
       $this->fields = array_merge($this->fields, $this->GetRowAssoc(ADODB_ASSOC_CASE));
     }
     return true;
   }
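
In the method above the conversion is applied to every field, including nulls and non-string values. A possible alternative is a small helper that only touches strings and skips values that are already valid UTF-8. This is a sketch only: toUtf8 is a hypothetical function, not part of adodb, and the ISO-8859-1 default is an assumption about the connection charset.

```php
<?php
/**
 * Hypothetical helper, not part of adodb: converts one fetched value to UTF-8.
 * Assumes non-UTF-8 strings are ISO-8859-1; change $from to the real charset.
 */
function toUtf8($value, $from = 'ISO-8859-1')
{
    // Leave nulls, integers, floats and other non-strings untouched.
    if (!is_string($value)) {
        return $value;
    }
    // Skip strings that are already valid UTF-8 to avoid double encoding.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', $from);
}

// Sample row: Latin-1 string, null, integer, and a string that is already UTF-8.
$row = array("SU\xC9LLEM", null, 42, "SU\xC3\x89LLEM");
$converted = array_map('toUtf8', $row);
var_dump($converted);
```

Inside the loop this would replace the unconditional call with something like $f[$i] = toUtf8($f[$i]);. The validity check is a heuristic: a Latin-1 string whose bytes happen to form valid UTF-8 would be left unconverted, so setting the correct charset on the connection is the more robust fix where that is available.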
    
asked by anonymous 03.11.2015 / 19:38

1 answer

Quickly and simply:

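mb_convert_encoding converts a string from encoding X to encoding Y. If the third argument (the source encoding) is omitted, the internal encoding reported by mb_internal_encoding() is assumed.

utf8_encode converts a string from ISO-8859-1 to UTF-8, and nothing else.

Note that the difference between the two is significant: the first is a general-purpose converter, the second handles exactly one conversion.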
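
When the source encoding is left out, mb_convert_encoding takes it from mb_internal_encoding(), so the result depends on configuration. A minimal sketch, assuming the input really is ISO-8859-1: passing the source encoding explicitly performs the same ISO-8859-1 to UTF-8 conversion as utf8_encode, regardless of php.ini.

```php
<?php
// "É" in ISO-8859-1 is the single byte 0xC9 (assumed input encoding).
$latin1 = "\xC9 assim";

// Source encoding stated explicitly: independent of mb_internal_encoding().
$explicit = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// Same call with the source omitted, after setting the internal encoding.
$previous = mb_internal_encoding();
mb_internal_encoding('ISO-8859-1');
$implicit = mb_convert_encoding($latin1, 'UTF-8');
mb_internal_encoding($previous); // restore the original setting

var_dump($explicit === "\xC3\x89 assim"); // bool(true)
var_dump($implicit === $explicit);        // bool(true)
```

This may explain the change after the upgrade: PHP 5.6 made UTF-8 the default value of default_charset, and mbstring falls back to it when no internal encoding is configured, so a two-argument call can end up treating ISO-8859-1 input as if it were already UTF-8.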

A small example of converting UTF-8 to HTML-ENTITIES with mb_convert_encoding:

$str = 'É assim que você faz, na programação';

$converted = mb_convert_encoding($str, "HTML-ENTITIES", "UTF-8");

var_dump($converted, $str);

The output is:

string(63) "&Eacute; assim que voc&ecirc; faz, na programa&ccedil;&atilde;o"

string(40) "É assim que você faz, na programação"
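
For comparison, here is a sketch using htmlentities, which is part of core PHP. This is a swapped-in technique, not what was tested above: htmlentities also escapes characters such as <, > and &, which the HTML-ENTITIES conversion leaves alone. It may still be worth knowing because the HTML-ENTITIES target of mb_convert_encoding is deprecated as of PHP 8.2. The file is assumed to be saved as UTF-8.

```php
<?php
// Assumes this file is saved as UTF-8, so the literal below is UTF-8 encoded.
$str = 'É assim que você faz, na programação';

// The third argument tells htmlentities how to interpret the input bytes.
$converted = htmlentities($str, ENT_QUOTES, 'UTF-8');

var_dump($converted);
```

For this particular string the result matches the entity string shown above, since it contains no characters that only htmlentities would escape.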

default_charset configuration

I do not know if this is relevant, but the default_charset setting in my php.ini influenced how the result of the previous test was displayed.

See what happened with default_charset set to ISO-8859-1:

ini_set('default_charset', 'ISO-8859-1');

$str = 'É assim que você faz, na programação';

$converted = mb_convert_encoding($str, "HTML-ENTITIES", "UTF-8");


var_dump($converted, $str);

The output was:

string(63) "&Eacute; assim que voc&ecirc; faz, na programa&ccedil;&atilde;o"
string(40) "É assim que você faz, na programação"

So, to check whether this is the cause of the problem described in the question, try setting your default_charset like this:

ini_set('default_charset', 'UTF-8');
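
To confirm what the script is actually running with, the relevant settings can be read back at runtime. A minimal sketch; it sets the mbstring internal encoding explicitly instead of relying on it following default_charset.

```php
<?php
ini_set('default_charset', 'UTF-8');
mb_internal_encoding('UTF-8');

// Read the values back to see what conversions will assume by default.
var_dump(ini_get('default_charset')); // string(5) "UTF-8"
var_dump(mb_internal_encoding());     // string(5) "UTF-8"
```

Note that these settings describe how PHP interprets and labels strings; they do not change the bytes the database driver returns.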

Note: I tested this on PHP 5.5.9-1ubuntu4.11.

    
04.11.2015 / 15:21