Solution to Support Additional Unicode Characters

Anonymoose ✭✭
edited October 2012 in Feedback

Problem: MySQL databases set to utf8 can only store characters from the Unicode BMP (Basic Multilingual Plane), because MySQL's utf8 uses at most three bytes per character. Characters outside the BMP therefore cannot be stored or shown.

This matters for supporting Unicode Plane 2, which holds Chinese/Japanese/Korean ideographs (mostly CJK Unified Ideographs) that were not included in earlier character encoding standards, and Unicode Plane 1, which is used for ancient scripts, symbols, and emoji.
See: http://en.wikipedia.org/wiki/Plane_(Unicode)
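To make the problem above concrete, here is a minimal Python sketch that checks whether a string contains characters outside the BMP. Such characters take four bytes in UTF-8, which is more than a three-byte utf8 column can hold. The function names are hypothetical and only for illustration; they are not part of any forum or MySQL code.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if text has any code point above U+FFFF (outside the BMP)."""
    return any(ord(ch) > 0xFFFF for ch in text)


def utf8_byte_lengths(text: str) -> list:
    """Return (code point label, UTF-8 byte length) for each character."""
    return [(f"U+{ord(ch):04X}", len(ch.encode("utf-8"))) for ch in text]


# Sample characters: a BMP ideograph, a Plane 1 emoji, a Plane 2 ideograph.
bmp_char = "\u65E5"         # U+65E5, inside the BMP
emoji = "\U0001F600"        # U+1F600, Plane 1
plane2_char = "\U00020000"  # U+20000, Plane 2

for sample in (bmp_char, emoji, plane2_char):
    print(utf8_byte_lengths(sample), needs_utf8mb4(sample))
```

Only the first sample fits in three bytes; the other two are the kind of character the post is about.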

Solution: use the utf8mb4 character set. utf8mb4 is a superset of utf8.
See: http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html
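As a rough sketch of what the switch could involve, the Python helper below builds the MySQL statements that convert a database and its tables to utf8mb4. The helper name, the database name, the table names, and the choice of the utf8mb4_unicode_ci collation are all assumptions for illustration; the actual schema of this forum software is not given here. The client connection would also need to use utf8mb4 (for example via SET NAMES utf8mb4), and indexed columns may need checking because each character can now take up to four bytes.

```python
def conversion_statements(database, tables, collation="utf8mb4_unicode_ci"):
    """Build ALTER statements to move a database and its tables to utf8mb4.

    The names passed in are placeholders; nothing is executed here.
    """
    statements = [
        f"ALTER DATABASE `{database}` CHARACTER SET utf8mb4 COLLATE {collation};"
    ]
    for table in tables:
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET utf8mb4 "
            f"COLLATE {collation};"
        )
    return statements


# Hypothetical database and table names, for illustration only.
for statement in conversion_statements("forum_db", ["discussions", "comments"]):
    print(statement)
```

The statements are printed rather than run so they can be reviewed, and the database backed up, before anything is changed.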

Discussed here:
http://stackoverflow.com/questions/10392079/emoticons-from-iphone-to-python-django

Comments

  • Nice detective work. I hope the decision to switch to this type of UTF8 is an easy one for the management.

  • Anonymoose ✭✭
    edited October 2012

    @matt said:
    Nice detective work. I hope the decision to switch to this type of UTF8 is an easy one for the management.

    Thanks. I hope this can be done; otherwise it will be necessary to go in and change it manually.

    PS: I didn't notice your mention of utf8mb4 in August in the other post.