Character encoding in the Vanilla WordPress plugin

When I publish a post on my WordPress blog, Vanilla automatically creates a discussion in the forum. That works, but in the Vanilla forum I see strange characters in the post title and summary: characters like é, è, à, etc. are "transformed" into something like Ã. (These characters are heavily used in Italian.)
Character encoding seems to be fine in WordPress, in Vanilla, and in the MySQL database. Where is the problem? How can I resolve this issue?

P.S. Sorry for my bad English!

Comments

  • Welcome to the community!

    What version of Vanilla are you running?

    Search first

    Check out the Documentation! We are always looking for new content and pull requests.

    Click on insightful, awesome, and funny reactions to thank community volunteers for their valuable posts.

  • I'm using Vanilla 2.0.18.8!

  • You can see an example here: goo.gl/CnNCe

  • Does this title look fine in the Vanilla discussion table in the database?

  • Yes! In the database everything seems to be OK!

  • What is the character encoding for WordPress (theme), Vanilla (theme) and your database? How about for your database table?

  • utf8_unicode_ci for WordPress and Vanilla! The database server has UTF-8 Unicode (utf8) character encoding. Could this be the problem? I think not.

  • If you make a comment directly, you don't have this issue?

    If you are using that plugin, it doesn't use an API; it purely scrapes the content.

    grep is your friend.

  • If it is OK in the database, then you need to ensure the connection has

    Notice also the slug in the URL, so this is not a client-side issue.

  • Yes, I think the problem occurs during scraping. Multi-byte characters are being interpreted as single-byte ones; this can also be caused by single-byte string functions. It has nothing to do with your database connection or anything after it.

    Please double-check that it is OK in the database. I suspect not; I suspect it was stored as ù, as in two bytes. This is useful information.
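
    To illustrate the kind of corruption described here, the following is a minimal PHP sketch. It assumes the `mbstring` extension is available and that the bytes were misread as ISO-8859-1; the actual scraper may have fallen back to a different single-byte charset, such as Windows-1252. The function names are hypothetical.

    ```php
    <?php
    // Hypothetical helpers, for illustration only.

    // Simulate the bug: treat each byte of a UTF-8 string as one ISO-8859-1
    // character and re-encode the result as UTF-8.
    function garble(string $utf8): string {
        return mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
    }

    // Reverse one layer of that double encoding.
    function ungarble(string $garbled): string {
        return mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');
    }

    $original = "\xC3\xB9";          // "ù" as UTF-8: two bytes
    $garbled  = garble($original);   // four bytes, displayed as "Ã¹"

    echo strlen($original), ' -> ', strlen($garbled), "\n";  // 2 -> 4
    echo ungarble($garbled) === $original ? "round trip ok\n" : "mismatch\n";
    ```

    The round trip only works when every garbled character still fits in ISO-8859-1, so treat it as a diagnostic aid rather than a general repair tool.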

  • With direct commenting there isn't any issue!

  • @x00 said:
    Please double-check that it is OK in the database. I suspect not; I suspect it was stored as ù, as in two bytes. This is useful information.

    How can I check this? I don't understand what you suspect!

  • Look up DiscussionID 34 in the database and look at the Name and Body fields.
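
    If eyeballing the stored values is not conclusive, a heuristic check along these lines may help. This is only a sketch: the helper name is hypothetical, it assumes the `mbstring` extension, and it only recognises text that was misread as ISO-8859-1, so it can give false positives or negatives. Pass it the Name or Body value read from the database.

    ```php
    <?php
    // Hypothetical heuristic: does $text look like UTF-8 that was misread as
    // ISO-8859-1 and then encoded as UTF-8 a second time?
    function looksDoubleEncoded(string $text): bool {
        // Must be valid UTF-8 and contain at least one non-ASCII character.
        if (!mb_check_encoding($text, 'UTF-8') || !preg_match('/[^\x00-\x7F]/', $text)) {
            return false;
        }
        // Undo one layer of encoding. Characters outside ISO-8859-1 are
        // replaced by "?" under mbstring's default settings.
        $undone = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
        // If the result is still valid UTF-8 with non-ASCII characters, the
        // original was most likely encoded twice.
        return mb_check_encoding($undone, 'UTF-8')
            && preg_match('/[^\x00-\x7F]/', $undone) === 1;
    }

    var_dump(looksDoubleEncoded("perch\xC3\x83\xC2\xA9")); // garbled "perché"
    var_dump(looksDoubleEncoded("perch\xC3\xA9"));         // correct "perché"
    ```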

  • Oh... nice! In the Body and Name fields the text is "strange"!

  • Yep, as I suspected.

  • Is there a way to fix this problem?

  • x00 MVP
    edited June 2013

    There is. Create a file conf/bootstrap.before.php (if it doesn't already exist):

    <?php if (!defined('APPLICATION')) exit();
       function FetchPageInfo($Url, $Timeout = 0) {
          $PageInfo = array(
             'Url' => $Url,
             'Title' => '',
             'Description' => '',
             'Images' => array(),
             'Exception' => FALSE
          );
          try {
             $PageHtml = ProxyRequest($Url, $Timeout, TRUE);
             $Dom = new DOMDocument();
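             // The XML-declaration prefix makes libxml parse the page as UTF-8
             // (this is the only change from the core version).
             @$Dom->loadHTML('<?xml encoding="UTF-8">'.$PageHtml);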
             // Page Title
             $TitleNodes = $Dom->getElementsByTagName('title');
             $PageInfo['Title'] = $TitleNodes->length > 0 ? $TitleNodes->item(0)->nodeValue : '';
             // Page Description
             $MetaNodes = $Dom->getElementsByTagName('meta');
             foreach($MetaNodes as $MetaNode) {
                if (strtolower($MetaNode->getAttribute('name')) == 'description')
                   $PageInfo['Description'] = $MetaNode->getAttribute('content');
             }
             // Keep looking for page description?
             if ($PageInfo['Description'] == '') {
                $PNodes = $Dom->getElementsByTagName('p');
                foreach($PNodes as $PNode) {
                   $PVal = $PNode->nodeValue;
                   if (strlen($PVal) > 90) {
                      $PageInfo['Description'] = $PVal;
                      break;
                   }
                }
             }
             if (strlen($PageInfo['Description']) > 400)
                $PageInfo['Description'] = SliceString($PageInfo['Description'], 400);
    
             // Page Images (retrieve first 3 if bigger than 100w x 300h)
             $Images = array();
             $ImageNodes = $Dom->getElementsByTagName('img');
             $i = 0;
             foreach ($ImageNodes as $ImageNode) {
                $Images[] = AbsoluteSource($ImageNode->getAttribute('src'), $Url);
             }
    
             // Sort by size, biggest one first
             $ImageSort = array();
             // Only look at first 10 images (speed!)
             $i = 0;
             foreach ($Images as $Image) {
                $i++;
                if ($i > 10)
                   break;
    
                list($Width, $Height, $Type, $Attributes) = getimagesize($Image);
                $Diag = (int)floor(sqrt(($Width*$Width) + ($Height*$Height)));
                if (!array_key_exists($Diag, $ImageSort))
                   $ImageSort[$Diag] = $Image;
             }
             krsort($ImageSort);
             $PageInfo['Images'] = array_values($ImageSort);
          } catch (Exception $ex) {
             $PageInfo['Exception'] = $ex;
          }
          return $PageInfo;
       }
    

    This is exactly the same as the core version, bar one difference.

    Note that the '<?xml encoding="UTF-8">' prefix is key: it forces UTF-8.

    This will apply to new articles. You will have to edit the old articles manually.
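
    The effect of the prefix can be seen in isolation with a small sketch. It assumes PHP's `dom` extension (libxml2) is enabled; the `extractTitle` helper and the sample HTML are made up for illustration, and the result without the prefix depends on the libxml2 version.

    ```php
    <?php
    // Hypothetical helper: parse an HTML string and return its <title> text.
    function extractTitle(string $html, bool $forceUtf8): string {
        $dom = new DOMDocument();
        // Without a declared charset, libxml2 typically does not assume UTF-8.
        // The XML-declaration prefix tells it to treat the input as UTF-8.
        $prefix = $forceUtf8 ? '<?xml encoding="UTF-8">' : '';
        @$dom->loadHTML($prefix . $html);
        $titles = $dom->getElementsByTagName('title');
        return $titles->length > 0 ? $titles->item(0)->nodeValue : '';
    }

    // Sample page with no charset declaration; "\xC3\xA8" is "è" in UTF-8.
    $html = "<html><head><title>Caff\xC3\xA8</title></head><body></body></html>";

    var_dump(extractTitle($html, true) === "Caff\xC3\xA8"); // title preserved
    echo bin2hex(extractTitle($html, false)), "\n";         // inspect raw bytes
    ```

    Comparing the hex dump of the second call with the original bytes shows whether the parser on a given server misreads unlabelled UTF-8 pages.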

  • I don't know if this is directly related, but could a custom bootstrap.before.php also fix this issue? https://vanillaforums.org/discussion/22090/auto-truncate-titles-of-discussions-created-by-embedded-comments-in-wordpress#latest
