Knowledge Base

How to convert a MySQL database to UTF-8 encoding

This article describes how to convert a MySQL database's character set to UTF-8 encoding (also known as Unicode). The UTF-8 character encoding set supports many alphabets and characters for a wide variety of languages.

Although MySQL supports the UTF-8 character encoding set, it is often not used as the default character set during database and table creation. As a result, many databases use the Latin character set, which can be limiting depending upon the application.

Determine the current character encoding set

To determine which character encoding set a MySQL database or table is currently using:

  1. Log in to your Webhost.Berlin SSH account.
  2. At the command line, type the following command, replacing USERNAME with your username:
    mysql -u USERNAME -p
  3. At the Enter Password prompt, type your password. When you type the correct password, the mysql> prompt appears.
  4. To display the current character encoding set for a particular database, type the following command at the mysql> prompt. Replace DBNAME with the database name:

    SELECT default_character_set_name FROM information_schema.SCHEMATA S WHERE schema_name = "DBNAME";
  5. To display the current character encoding set for a particular table in a database , type the following command at the mysql> prompt. Replace DBNAME with the database name, and TABLENAME with the name of the table:

    SELECT CCSA.character_set_name FROM information_schema.`TABLES` T,information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA WHERE CCSA.collation_name = T.table_collation AND T.table_schema = "DBNAME" AND T.table_name = "TABLENAME";
  6. To exit the mysql program, type \q at the mysql> prompt.

Convert the character encoding set to UTF-8

Make sure that you back up the database before you start this procedure! You can back up a MySQL database using cPanel, phpMyAdmin, or the mysqldump program.

To convert the character encoding set to UTF-8:

  1. Log in to your Webhost.Berlin SSH account.
  2. Create a text file named .my.cnf. To do this, you can use a text editor such as Vim or Nano. This procedure shows how to use Nano. At the command line, type the following command:
    nano .my.cnf
  3. Add the following lines to the file, replacing USERNAME with your username and PASSWORD with your password (make sure the password is enclosed in quotation marks):

    [client]
    user=USERNAME
    password="PASSWORD"
  4. When the edits are complete, press Ctrl+X, type y to save the file, and then press ENTER.
  5. To change the character set encoding to UTF-8 for all of the tables in the specified database, type the following command at the command line. Replace DBNAME with the database name:

    mysql --database=DBNAME -B -N -e "SHOW TABLES" | awk '{print "SET foreign_key_checks = 0; ALTER TABLE", $1, "CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; SET foreign_key_checks = 1; "}' | mysql --database=DBNAME
  6. After the command finishes, type the following command to start the mysql program:

    mysql
  7. To change the character set encoding to UTF-8 for the database itself, type the following command at the mysql> prompt. Replace DBNAME with the database name:

    ALTER DATABASE DBNAME CHARACTER SET utf8 COLLATE utf8_general_ci;
  8. To exit the mysql program, type \q at the mysql> prompt.
  9. To delete the .my.cnf file, type the following command at the command line:

    rm .my.cnf
  10. To verify that the character set encoding is now set to UTF-8, follow the steps in the Determine the current character encoding set procedure above.

More Information

For more information about UTF-8 and Unicode, please visit http://en.wikipedia.org/wiki/UTF-8.