When we were building a Wordpress website for a Chinese community we found even the configuration in Wordpress defined default charset as 'UTF-8', becase the datbase tables/fields are by default not in 'utf8_unicode_ci', it won't allow users to post in Chinese the characters all come up as ? marks.
So we decide to alter all the tables and fields for Wordpress to UTF-8, which would fix the problem. However it's not easy to change all the fields one by one from one charset and collation to another. Then we have written the following php script to help converting the tables and all text fields to UTF-8.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
<?php //mysql database name $db_name = "my_test"; //db connection $dc = mysql_connect("localhost", "root", ""); mysql_select_db($db_name, $dc); //set database charset and collation $sql = "ALTER DATABASE `".$db_name."` DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci"; mysql_query($sql); //set tables charset and collation $sql = "SHOW TABLES"; $rs = mysql_query($sql); $tables = array(); while($r = mysql_fetch_array($rs)){ $sql = "ALTER TABLE `".$r[0]."` DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci"; mysql_query($sql); $tables[] = $r[0]; } //set table rows charset and collation foreach($tables as $tab){ $sql = "SHOW FULL COLUMNS FROM `".$tab."`"; $rs = mysql_query($sql); while($r = mysql_fetch_array($rs)){ if(!empty($r['Collation']) && $r['Collation'] != 'utf8_unicode_ci'){ $nu = ($r['Null'] == "YES")? "NULL" : "NOT NULL"; $df = empty($r['Default'])? "" : "DEFAULT '".addslashes($r['Default'])."'"; $sql = "ALTER TABLE `".$tab."` CHANGE `".$r['Field']."` `".$r['Field']."` ".$r['Type']." CHARACTER SET utf8 COLLATE utf8_unicode_ci ".$nu." ".$df; mysql_query($sql); } } }
To set the MySQL default charset and collation to UTF-8, so the tables created later without any charset or collation specification would be default to UTF-8, please add these two lines to your my.cnf file under [mysqld],
character-set-server=utf8 default-collation=utf8_unicode_ci











Hello, I like your approach!
But where to put exactly these code?
csaba