Jan
19
2009

Converting MySQL Database to UTF-8

When we were building a Wordpress website for a Chinese community we found even the configuration in Wordpress defined default charset as 'UTF-8', becase the datbase tables/fields are by default not in 'utf8_unicode_ci', it won't allow users to post in Chinese the characters all come up as ? marks.

So we decide to alter all the tables and fields for Wordpress to UTF-8, which would fix the problem. However it's not easy to change all the fields one by one from one charset and collation to another. Then we have written the following php script to help converting the tables and all text fields to UTF-8.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35

<?php
//mysql database name
$db_name = "my_test";
 
//db connection
$dc = mysql_connect("localhost", "root", "");
mysql_select_db($db_name, $dc);
 
//set database charset and collation
$sql = "ALTER DATABASE `".$db_name."` DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci";
mysql_query($sql);
 
//set tables charset and collation
$sql = "SHOW TABLES";
$rs = mysql_query($sql);
$tables = array();
while($r = mysql_fetch_array($rs)){
	$sql = "ALTER TABLE `".$r[0]."`  DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci";
	mysql_query($sql);
	$tables[] = $r[0];
}
 
//set table rows charset and collation
foreach($tables as $tab){
	$sql = "SHOW FULL COLUMNS FROM `".$tab."`";
	$rs = mysql_query($sql);
	while($r = mysql_fetch_array($rs)){
		if(!empty($r['Collation']) && $r['Collation'] != 'utf8_unicode_ci'){
			$nu = ($r['Null'] == "YES")? "NULL" : "NOT NULL";
			$df = empty($r['Default'])? "" : "DEFAULT '".addslashes($r['Default'])."'";
			$sql = "ALTER TABLE `".$tab."` CHANGE `".$r['Field']."` `".$r['Field']."` ".$r['Type']." CHARACTER SET utf8 COLLATE utf8_unicode_ci ".$nu." ".$df;
			mysql_query($sql);
		}
	}
}

To set the MySQL default charset and collation to UTF-8, so the tables created later without any charset or collation specification would be default to UTF-8, please add these two lines to your my.cnf file under [mysqld],

character-set-server=utf8
default-collation=utf8_unicode_ci
Share and Enjoy:
  • Digg
  • Reddit
  • Design Float
  • Google Bookmarks
  • Yahoo! Bookmarks
  • del.icio.us
  • StumbleUpon
  • Technorati
  • BlinkList
  • Twitter
  • E-mail this story to a friend!
Written by Frank Liu in: Web Development | Tags: ,

1 Comment »

Trackbacks/Pingbacks

RSS feed for comments on this post. TrackBack URL


Leave a Reply

Copyright Orite Group Pty Ltd | Powered by WordPress | Theme: Aeros 2.0