Users running a non-download version of Vanilla (pulled from github), on branch release/2019.016 or master from the last 2 weeks should upgrade to release/2019.017 or latest master for security reasons. Downloaded official open sources releases are not affected.
Vanilla 1 is no longer supported or maintained. If you need a copy, you can get it here.

Character encoding problem after moving from Vanilla 1 to Vanilla 2

npnp
edited January 2011 in Vanilla 1.0 Help
I migrated my forum from Vanilla 1 to Vanilla 2. I used the vanilla2export.php to move the data over. Previous non-ASCII data is no longer displaying correctly. It showed fine in Vanilla 1.

My tables are UTF-8. I can see the correct data when using an SQL client in UTF-8 mode. The data shows up the same way in the Vanilla 2 tables as it did in the Vanilla 1 tables. Additionally, the non-forum part of my site also does UTF-8 correctly.

When I add a new comment that includes non-ASCII characters, it shows up correctly on the forum, but incorrectly in the SQL client.

My pages are being with HTTP headers specifying UTF-8. I've seen this page, which says that there are various settings for character encoding. It's for Vanilla 1, but looking at my config files, I see that I have no character-encoding-related settings in config.php, and config-default.php has:

$Configuration['Database']['CharacterEncoding'] = 'utf8'; $Configuration['Database']['ExtendedProperties']['Collate'] = 'utf8_unicode_ci'; $Configuration['Garden']['Charset'] = 'utf-8';

What do I need to do to make this work right?

Comments

  • After some more digging, I found that despite the fact the tables are UTF-8, the Vanilla 1 connections were being made in latin1. The comments from Vanilla 2 are encoded correctly.

    My MySQL client was connecting with the DB's default encoding, which was latin1. If I ran "set names utf8" then I saw Vanilla 2's stuff showing up correctly, and Vanilla 1's stuff incorrectly. My non-Vanilla code also had the same problem.

    To fix, I made this PHP script to convert the old data. The IDs used represent the cut off between the Vanilla 1 data and the Vanilla 2 data.

    <?php
    # make iconv shut up
    #error_reporting(E_WARNING);

    header( 'Content-Type: text/html; charset=UTF-8' );

    $db = mysql_connect('localhost', 'root', 'password');
    mysql_query("USE mydb");
    mysql_query("SET NAMES latin1");
    #$result = mysql_query("SELECT DiscussionID, Name FROM GDN_Discussion WHERE DiscussionID <= 24460");
    $result = mysql_query("SELECT DiscussionID, Body FROM GDN_Discussion WHERE DiscussionID <= 24460");
    #$result = mysql_query("SELECT CommentID, Body FROM GDN_Comment WHERE CommentID <= 50053");
    $stuff = array();
    while ($row = mysql_fetch_row($result)) {
    $stuff[$row[0]] = $row[1];
    }
    mysql_query("SET NAMES utf8");
    #$result = mysql_query("SELECT DiscussionID, Name FROM GDN_Discussion WHERE DiscussionID <= 24460");
    $result = mysql_query("SELECT DiscussionID, Body FROM GDN_Discussion WHERE DiscussionID <= 24460");
    #$result = mysql_query("SELECT CommentID, Body FROM GDN_Comment WHERE CommentID <= 50053");
    while ($row = mysql_fetch_row($result)) {
    if ($stuff[$row[0]] == $row[1]) {
    unset($stuff[$row[0]]);
    #echo $row[0].'<br>';
    }
    }

    foreach ($stuff as $key => $value) {
    $value = iconv('UTF-8', 'UTF-8//IGNORE', $value);
    # echo $key.' "'.htmlspecialchars($value).'"
    '; # mysql_query(sprintf("UPDATE GDN_Discussion SET Name = '%s' WHERE DiscussionID = %s", mysql_real_escape_string($value), mysql_real_escape_string($key))); mysql_query(sprintf("UPDATE GDN_Discussion SET Body = '%s' WHERE DiscussionID = %s", mysql_real_escape_string($value), mysql_real_escape_string($key))); # mysql_query(sprintf("UPDATE GDN_Comment SET Body = '%s' WHERE CommentID = %s", mysql_real_escape_string($value), mysql_real_escape_string($key))); } mysql_close($db); ?>
Sign In or Register to comment.