
Bad characters

edited July 2011 in Localization
Hello again. Now I have a problem with how characters are displayed.
In phpMyAdmin I set the collation of the DB I use to cp1250_czech_cs.
And in config-defaults.php I set:

1002 => "set names 'cp1250'"
['CharacterEncoding'] = 'cp1250';
['ExtendedProperties']['Collate'] = 'cp1250_czech_cs';
['Charset'] = 'cp1250';
['LocaleCodeset'] = 'CP-1250';

I need to enable characters like š,č,ť,ž,ý,ľ and so on. In the menu (dashboard, discussions, ...), which is translated into Slovak, they are shown correctly. But when I post a comment, it shows bad characters, like ŤÄľĹľĹĄÄŤĂ˝ĹĄ... What should I do?
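
The garbled sample quoted above is consistent with UTF-8 bytes being decoded as cp1250 somewhere between the database and the browser. The thread does not confirm this as the cause, so the following is only a minimal Python sketch of the effect; the sample string is made up for illustration.

```python
# Hypothetical sample text; not taken from the forum's database.
original = "ľž"

# Encode as UTF-8, then decode the same bytes as cp1250, which is what a
# connection or page configured for cp1250 would do with UTF-8 data.
garbled = original.encode("utf-8").decode("cp1250")
print(garbled)  # ÄľĹľ

# The damage is reversible as long as no byte was dropped along the way.
repaired = garbled.encode("cp1250").decode("utf-8")
print(repaired == original)  # True
```

If stored text can be repaired this way, the data itself is probably intact and only one layer is using the wrong character set.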

Comments

  • luc ✭✭
    You should have left it at utf-8.
  • When I change it to utf-8, nothing changes... still bad characters...
  • luc ✭✭
    Did you convert the old ones?

    You'd better start over. This forum instance is running v2, and Cyrillic, Japanese, Chinese etc. are displayed OK with it, so a few accents should not be an issue.
  • candyman ✭✭
    edited November 2011
    I have the same problem with my Italian forum.
    The original phpBB collation was latin1 (swedish_ci). When I installed Vanilla2 I left the default utf-8, and now some characters like é, è and à don't display properly.
    Where did I go wrong?
  • candyman ✭✭
    edited June 2013

    Any suggestion? That problem (and others...) stopped me from migrating...

  • @candyman: Does it only happen on the front end of the site, in the admin dashboard, or both?


  • On the front end. I'm talking about the post content and the thread names.
    I'll take a look at the admin dashboard; I don't remember.
    I think the problem was the original swedish_ci phpBB collation.

  • tom762 New
    edited August 2013

    I have a similar problem. Some parts of Vanilla display characters with accents correctly, others don't. The ones that don't show "?" instead. This mainly happens in dialog boxes. For example, when starting a new conversation, all characters are displayed OK, but when trying to delete a conversation, only question marks are displayed instead of these characters.

    I've been able to fix some of these errors by replacing characters like "č" with their HTML entities in definitions.php. The funny thing is that it works in some places and not in others.

    The forum was started as a migration from phpBB3, and after investigating I noticed a weird thing in phpMyAdmin (see attachment). I have no idea what this means or how to fix it. I'm no expert, so please be gentle.

    Posts content is displayed correctly.

    Thanks.
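
One common way a "?" ends up in place of an accented letter is a lossy conversion to a character set that lacks that letter. Whether that is what happens in these dialog boxes is not established in this thread; the sketch below only illustrates the mechanism, using latin-1 as an assumed target character set and a sample word chosen for illustration.

```python
# Sample word for illustration; any text with a character outside the
# target character set behaves the same way.
word = "Prekliči"

# latin-1 has no č, so a lossy conversion substitutes "?" for it.
lossy = word.encode("latin-1", errors="replace").decode("latin-1")
print(lossy)  # Prekli?i

# Characters that exist in latin-1, such as é, survive the same conversion.
print("é".encode("latin-1", errors="replace").decode("latin-1"))  # é
```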

  • @tom762 I believe that area you have circled is the database collation. You can see that the tables are using utf8_unicode_ci which should support your characters.


  • @hgtonight

    All characters are displayed ok in the content, it's just some dialog boxes that don't play along.
    Funny thing is that translations of the same words or strings in "definitions.php" get displayed differently in different dialog boxes. I really don't know where to start looking for the root of the problem.

  • @tom762 said:
    @hgtonight

    the same words or strings in "definitions.php" get displayed differently in different dialog boxes.

    Sounds like an HTML encoding problem. Could you show the head of two pages, one that shows the characters as you expect and one that doesn't show them right?
    All pages are rendered by the same master template, but maybe a plugin changes something?

  • tom762 New
    edited August 2013

    As I said, it's not a problem between pages, the problem is in dialog boxes on the same page. Content (stuff people write in comments, conversations) on all pages looks ok.

    For easier understanding I'm attaching two screenshots. First one is after clicking "Start a new conversation", second after clicking "Delete conversation".

  • Are both definitions in the same file?
    Do you have a hex editor with which you can verify that both characters are really the same?
    Could you copy a č from a working definition and do a search and replace for the č in a non-working string?

  • Both definitions are from the same file. The "č" character works as it should in some parts and doesn't in others.

  • Sorry, I should have explained why I asked those questions, so that you can decide whether you've already considered that possibility.

    1. Looking at your screenshot, I see that the text for "Are you sure you want to do that?" and "Cancel" is corrupted. This text is already in your source when you load the page (open the source view and search for id="ConfirmText" and id="Cancel" to confirm that).
    2. The button for "Start a new Conversation" shows that your browser can render the special character.

    => it is very unlikely that one and the same character could be rendered and could not be rendered on the same page. That would be a major browser bug and I do not believe that this is the case.

    => so, if it is not the browser, it is the character. e.g. look at this: "А" and "A". They are not the same! Mark the first and search for it on this page, then do the same with the second ;-)

    Characters that look the same might not behave the same, and that is a probable explanation for your problem.

    So you should be 100% sure that the characters you are looking at really are the same. The best way to check is to look at their hex codes. If you can't do that, just use trial and error: copy a working "č" and paste it over a non-working one. Or you could paste the definitions for "Start a new Conversation" (working) and "Are you sure you want to do that?" (not working) here so that we can take a look.
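
Without a hex editor, the same check can be sketched in a few lines of Python. The helper name `describe` is made up for this example. The second sample (a plain c followed by a combining caron) is an additional look-alike case not mentioned in the thread.

```python
import unicodedata

def describe(text):
    """Return (character, code point, Unicode name, UTF-8 hex) per character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"),
         ch.encode("utf-8").hex())
        for ch in text
    ]

# Cyrillic and Latin capital A look identical but are different characters.
for row in describe("\u0410A"):
    print(row)

# A precomposed č and a plain c followed by a combining caron also look alike.
for row in describe("\u010dc\u030c"):
    print(row)

# NFC normalization folds the two-character form into the precomposed one.
print(unicodedata.normalize("NFC", "c\u030c") == "\u010d")  # True
```

Pasting a working and a non-working string from definitions.php into `describe` would show immediately whether the two "č" characters share the same code point.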

  • You were partly right. After testing many combinations of characters, I changed most characters like "č" to the corresponding HTML entity, and evidently left some raw "č" characters in.

    But there was a reason for that. Like I said before, some characters do get rendered differently from one dialog box to another. For example, the definitions file definitely uses the same entity for č in "Začni nov pogovor" (in the first dialog box) and in "Prekliči" (in the second dialog box). The first is rendered OK, but the second is rendered literally, with the entity code shown in place of the č in "Prekliči".

    I have no idea what's going on.

    Btw, I do my editing in Notepad++ with UTF-8 encoding.

  • I searched on Google and found someone describing a similar problem, caused by a JSON call that escaped the HTML entity so that it wasn't displayed correctly.
    I tried to find out whether that is happening here, but I was not able to figure out what's going on in the background. You might need the help of one of the pros here.

    If it really is caused by the JavaScript, then you'd have the problem whenever the text is shown as the result of a JavaScript call. And if that is true, you've found a bug.

    @hgtonight gave me a JS tip recently. @hgtonight: can you see whether this might be due to JSON encapsulation?
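
The suspected mechanism can be sketched in Python. This only illustrates double escaping in general, not what Vanilla actually does; the numeric entity `&#269;` is an assumption, since the thread does not show which entity form was used in definitions.php.

```python
import html
import json

# Hypothetical definition that stores č as a numeric HTML entity.
definition = "Prekli&#269;i"

# Inserted into the page as-is, the entity is decoded on display.
print(html.unescape(definition))  # Prekliči

# If some layer HTML-escapes the string once more, the ampersand itself
# becomes an entity and the entity code is then displayed literally.
escaped_again = html.escape(definition)
print(escaped_again)                 # Prekli&amp;#269;i
print(html.unescape(escaped_again))  # Prekli&#269;i

# In this sketch a JSON round trip alone leaves the entity untouched.
print(json.loads(json.dumps(definition)) == definition)  # True
```

Searching the page source or the JSON response for `&amp;#` near the affected text would be one way to check whether an extra escaping step is involved.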

  • Need a link to the forum @tom762. Or a copy of your locale files.

