Conflicting character encoding, WordPress vs MySQL
Back in December last year I set up a bilingual English/Korean blog. At the time I was in a real hurry, and since I had some problems with UTF-8, I resorted to setting the WordPress character encoding to EUC-KR and everything seemed to work fine.
Now, with WordPress 2.0.3, I’ve found that UTF-8 works well with Korean input (I tried it on a fresh installation), so I’d really like to move my existing blog over to UTF-8. However, I’m having some real problems.
First I tried this technique: SSH into my server, use mysqldump to create an SQL file, and then run iconv to convert the file from EUC-KR to UTF-8. But iconv simply refused to convert the file, truncating the output as soon as it reached a Korean character. Next, I FTPed the SQL file onto a Windows machine and tried to open it with various editors (including Notepad++ and Visual Studio) so that I could save it again as UTF-8. Then I realised that the SQL file I had exported was already UTF-8…
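For anyone who wants to check a dump file the same way without relying on an editor, here is a minimal Python sketch that tries a strict decode under each candidate encoding. The function name `valid_encodings` and the sample SQL strings are made up for illustration; they are not part of mysqldump, iconv, or WordPress.

```python
def valid_encodings(data, candidates=("ascii", "utf-8", "euc_kr")):
    """Return the candidate encodings under which `data` decodes cleanly.

    Strict decoding raises on the first invalid byte sequence, which is
    roughly the point where a converter such as iconv would give up.
    """
    ok = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok


# Hypothetical one-line "dump files" holding the same Korean word.
sample_sql = "INSERT INTO t VALUES ('한글');"
dump_as_utf8 = sample_sql.encode("utf-8")
dump_as_euckr = sample_sql.encode("euc_kr")

print(valid_encodings(dump_as_utf8))
print(valid_encodings(dump_as_euckr))
```

To use it on a real file, read the bytes with `open(path, "rb").read()` and pass them in. If "euc_kr" is missing from the result, the file is not valid EUC-KR, which would be consistent with iconv stopping at the first Korean character.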
In fact, I have discovered that my MySQL server doesn’t have any Korean-specific character sets or collations installed. So MySQL is treating everything as UTF-8, but my WordPress character set is EUC-KR! I’m surprised that it’s been working at all.
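One scenario that would fit these symptoms is sketched below in Python. I have not confirmed it on my server. The assumption is that the EUC-KR bytes sent by WordPress were treated by MySQL as single-byte latin1 characters and then written to the dump as UTF-8. If so, the text would be encoded twice rather than lost. The helper name `undo_double_encoding` is my own invention.

```python
original = "한글"

# Step 1: WordPress (configured for EUC-KR) sends these raw bytes.
wire_bytes = original.encode("euc_kr")

# Step 2 (ASSUMPTION): the server believes the bytes are latin1, so each
# byte becomes one accented Latin character.
misread = wire_bytes.decode("latin-1")

# Step 3: the dump is written as UTF-8, so it is valid UTF-8 but garbled.
dumped = misread.encode("utf-8")


def undo_double_encoding(data, assumed="latin-1", real="euc_kr"):
    """Reverse steps 3, 2 and 1; only valid if the assumption above holds."""
    return data.decode("utf-8").encode(assumed).decode(real)


print(repr(misread))
print(undo_double_encoding(dumped) == original)
```

If the real mismatch is different (for example, a different connection character set), the reverse step will raise an error or produce more garbage, so this should only ever be tried on a copy of the data.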
But how do I convert this across???
I even tried writing some PHP code to pull some Korean text out of the database, translate it, and put it back. Here’s the experimental code that I wrote. It operates on a database table created by one of my WordPress plugins:
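function perform_maintainence() {
    global $wpdb;
    // Fetch every row from the plugin's translation table.
    $rows = $wpdb->get_results("SELECT * FROM $this->table_name;");
    foreach ($rows as $row) {
        // Treat the stored name as EUC-KR and convert it to UTF-8.
        $utf8name = iconv("EUC-KR", "UTF-8", $row->translated_name);
        if ($utf8name === false) {
            continue; // iconv could not convert this value; leave the row as it is
        }
        // Escape every value before building the UPDATE statement.
        $utf8name = $wpdb->escape($utf8name);
        $cat_name = $wpdb->escape($row->cat_name);
        $locale = $wpdb->escape($row->locale);
        $sql = "UPDATE $this->table_name SET translated_name='$utf8name' WHERE cat_name='$cat_name' AND locale='$locale';";
        $wpdb->query($sql);
    }
}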
The results are patchy. After the conversion, half of the Korean syllables do appear correctly, but the others remain scrambled. And this just makes me even more confused.
What makes things harder is that I can’t view Korean syllables when I SSH in with PuTTY, and phpMyAdmin also mangles the Korean text.
Any advice would be greatly appreciated!!!