UTF-8 encoding in XML file to generate RSS

3

On a news site I want to generate an RSS page with the news of the day. The page is generated correctly, but as soon as a news item contains accents or other special characters, the generation fails. My code is the following:

<?php

    header("Content-Type: application/rss+xml; charset=ISO-8859-1");

    $rssfeed = '<?xml version="1.0" encoding="ISO-8859-1"?>';
    $rssfeed .= '<rss version="2.0">';
    $rssfeed .= '<channel>';
    $rssfeed .= '<title>RSS feed</title>';
    $rssfeed .= '<link>http://www.xxxxxxxxxxxxx.pt</link>';
    $rssfeed .= '<description>RSS feed</description>';
    $rssfeed .= '<language>en-us</language>';
    $rssfeed .= '<copyright>Copyright (C) 2014 xxxxxxxxxxxxx.pt</copyright>';

    $data = date("Y-m-d");
    $query = "SELECT * FROM tbl_noticias WHERE data = '$data' ORDER BY id DESC";
    $result = mysql_query($query) or die ("Could not execute query");

    while($row = mysql_fetch_array($result)) {
        $title = $row['titulo'];
        $title = $title;
        $description = $row['intro'];
        $date = $row['data']." - ".$row['hora'];
        $link = "http://www.xxxxxxxxxxxxx.pt/detalhe.php?id=".$row['id'];
        $image = "http://www.xxxxxxxxxxxxx.pt/images/resize_listagem/".$row['foto'];

        $rssfeed .= '<item>';
        $rssfeed .= '<title>' . $title . '</title>';
        $rssfeed .= '<description>' . $description . '<![CDATA[<br><img src="' . $image . '" />]]></description>';
        $rssfeed .= '<link>' . $link . '</link>';
        $rssfeed .= '<pubDate>' .$date. '</pubDate>';
        $rssfeed .= '</item>';
    }

    $rssfeed .= '</channel>';
    $rssfeed .= '</rss>';

    echo $rssfeed;
?>

I have already tried changing the header to UTF-8, but without success. I have also tried utf8_encode, and that did not work either. The error is caused by one tag in particular, because another tag that also contains special characters does not fail. The fields in the DB use the utf8_general_ci collation.

    
asked by anonymous 20.05.2014 / 10:39

2 answers

1

Add the line mysql_query("SET NAMES 'utf8'"); before $query, and declare UTF-8 in both the HTTP header and the XML declaration:

header("Content-Type: application/rss+xml; charset=utf-8");

$rssfeed = '<?xml version="1.0" encoding="utf-8"?>';
$rssfeed .= '<rss version="2.0">';
$rssfeed .= '<channel>';
$rssfeed .= '<title>RSS feed</title>';
$rssfeed .= '<link>http://www.xxxxxxxxxxxxx.pt</link>';
$rssfeed .= '<description>RSS feed</description>';
$rssfeed .= '<language>en-us</language>';
$rssfeed .= '<copyright>Copyright (C) 2014 xxxxxxxxxxxxx.pt</copyright>';

$data = date("Y-m-d");
mysql_query("SET NAMES 'utf8'");
$query = "SELECT * FROM tbl_noticias WHERE data = '$data' ORDER BY id DESC";
$result = mysql_query($query) or die ("Could not execute query");

while($row = mysql_fetch_array($result)) {
    $title = $row['titulo'];
    $description = $row['intro'];
    $date = $row['data']." - ".$row['hora'];
    $link = "http://www.xxxxxxxxxxxxx.pt/detalhe.php?id=".$row['id'];
    $image = "http://www.xxxxxxxxxxxxx.pt/images/resize_listagem/".$row['foto'];

    $rssfeed .= '<item>';
    $rssfeed .= '<title>' . $title . '</title>';
    $rssfeed .= '<description>' . $description . '<![CDATA[<br><img src="' . $image . '" />]]></description>';
    $rssfeed .= '<link>' . $link . '</link>';
    $rssfeed .= '<pubDate>' .$date. '</pubDate>';
    $rssfeed .= '</item>';
}

$rssfeed .= '</channel>';
$rssfeed .= '</rss>';

echo $rssfeed;
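If some rows still arrive in a single-byte encoding (for example because the connection charset was not set), a defensive check before writing them into the feed may help. The sketch below assumes the mbstring extension is available and that any input that is not valid UTF-8 is ISO-8859-1. The name toUtf8 is a hypothetical helper, not part of the original code.

```php
<?php
// Hypothetical helper: return $value unchanged when it is already valid UTF-8,
// otherwise convert it from the assumed fallback encoding (ISO-8859-1 here).
function toUtf8(string $value, string $fallback = 'ISO-8859-1'): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', $fallback);
}
```

Inside the loop it could be applied as $title = toUtf8($row['titulo']); so that strings that are already UTF-8 are not encoded a second time, which is the usual risk with an unconditional utf8_encode.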


20.05.2014 / 16:26
1

An initial idea would be to use htmlentities to avoid problems with encoding:

while($row = mysql_fetch_array($result)) {
    $title = htmlentities( $row['titulo'], ENT_COMPAT, 'utf-8' );
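One point worth illustrating: htmlentities emits named HTML entities such as &eacute;, which XML does not predefine (only &amp;, &lt;, &gt;, &quot; and &apos; are built in), so a strict feed parser may reject them. A minimal sketch, assuming the string is already UTF-8, is to escape only the XML-special characters with htmlspecialchars and the ENT_XML1 flag. The name xmlEscape is a hypothetical helper.

```php
<?php
// Hypothetical helper: escape only the characters that are special in XML,
// leaving accented UTF-8 characters untouched.
function xmlEscape(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// "Notícias" written with explicit UTF-8 bytes so the file encoding does not matter.
$title = xmlEscape("Not\xC3\xADcias & <hoje>");
```

With this approach the accented characters stay as raw UTF-8 bytes, which is valid as long as the feed declares encoding="UTF-8".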

If this is not enough, you can force the output to UTF-8:

    $title = utf8_encode( htmlentities( $row['titulo'], ENT_COMPAT, 'utf-8' ) );

In that case, you also need to update these two lines to UTF-8:

header("Content-Type: application/rss+xml; charset=UTF-8");
$rssfeed = '<?xml version="1.0" encoding="UTF-8"?>';

If you still have problems, you can use a CDATA in the title:

    $title = '<![CDATA['.utf8_encode( htmlentities( $row['titulo'], ENT_COMPAT, 'utf-8' ) ).']]>';
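A CDATA section ends at the first ]]> it contains, so text that happens to include that sequence would break the feed. The following is a hedged sketch of a wrapper that splits such sequences. It assumes the input is already UTF-8 and wraps the raw text rather than the htmlentities output used above. The name cdata is a hypothetical helper.

```php
<?php
// Hypothetical helper: wrap text in a CDATA section, splitting any "]]>"
// inside the text across two sections so it cannot close the section early.
function cdata(string $value): string
{
    $safe = str_replace(']]>', ']]]]><![CDATA[>', $value);
    return '<![CDATA[' . $safe . ']]>';
}

$wrapped = cdata('a]]>b');
```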

This solution can be extended to other fields as needed.
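An alternative to escaping each field by hand is to let an XML library build the document. The sketch below uses PHP's DOMDocument, assuming the dom extension is enabled and that the rows are already UTF-8. The function name buildFeed and the sample row are illustrative only; the column names titulo and intro come from the question.

```php
<?php
// Hypothetical helper: build a minimal RSS 2.0 document from database rows.
// Text nodes are escaped by the library when the document is serialized.
function buildFeed(array $rows): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');

    $rss = $doc->createElement('rss');
    $rss->setAttribute('version', '2.0');
    $doc->appendChild($rss);

    $channel = $doc->createElement('channel');
    $rss->appendChild($channel);
    $channel->appendChild($doc->createElement('title', 'RSS feed'));

    foreach ($rows as $row) {
        $item = $doc->createElement('item');
        // Map each RSS tag to the column that supplies its text.
        foreach (['title' => 'titulo', 'description' => 'intro'] as $tag => $column) {
            $node = $doc->createElement($tag);
            $node->appendChild($doc->createTextNode($row[$column]));
            $item->appendChild($node);
        }
        $channel->appendChild($item);
    }

    return $doc->saveXML();
}

// Sample row with an accent and XML-special characters (illustrative data).
$xml = buildFeed([['titulo' => "Not\xC3\xADcias & mais", 'intro' => 'a < b']]);
```

The result can be sent with echo $xml; after the same Content-Type header with charset=UTF-8.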

    
20.05.2014 / 22:53