• barnez

    (@pidengmor)


    I have experienced this behaviour on two sites on different servers using different themes. Basically, after restoring the database the content is fully restored but special characters such as apostrophes and currency symbols are replaced or added to with one more symbols (see underlined examples here, where the original was an apostrophe and £ sign). Clearing the cache doesn’t make any difference.

    This is a real pain, and means that restoring becomes a time-consuming process as every site page needs to be reviewed and edited. It can’t be the backup plugin, as previously I had the same issue when I asked the host to restore from one of their backups. Is this a database or utf problem? Or could it be something to do with the super cache plugin that is installed on both sites?

Viewing 14 replies - 1 through 14 (of 14 total)
  • AITpro

    (@aitpro)

    Looks like a classic UTF/Unicode problem. You could try doing something in this general ballpark area. This is not intended as a finished solution and only a hint in the right direction. Typically adding this to a Query is only necessary when you are doing something like connecting to another site and updating/inserting data into another site’s DB.

    https://php.net/manual/en/mysqli.set-charset.php

    Sets the default character set to be used when sending data from and to the database server.

    $mysqli->set_charset("utf8");

    This ASCII printable characters “filter” is useful for doing Queries across different site DB’s.
    $example = preg_replace( '/[^\x01-\x7F]/', "", $this->something['example'] );

    You can try altering DB Table Collations, but typically that is “after the fact” since the problem usually occurs during the Query (data input). WP uses these charsets|Collations: utf8 | utf8_general_ci and utf8mb4 | utf8mb4_unicode_ci.

    https://mathiasbynens.be/notes/mysql-utf8mb4

    AITpro

    (@aitpro)

    And all of the above stuff is assuming that characters are literally being replaced or inserted in your database tables and not that what you are seeing is output. If it is output then you need to look at your Headers to make sure they are correct. Should be: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

    Thread Starter barnez

    (@pidengmor)

    Thanks for your insight AITpro. It’s much appreciated and I’ll look into your suggestions.

    For what it’s worth both of these sites were set up with
    define('DB_CHARSET', 'utf8'); in the wp-config, earlier backups were fine and both are outputting <head><meta charset="UTF-8"/> in the page source.

    AITpro

    (@aitpro)

    So this is definitely a data input problem then correct? When you restore a database from an sql dump file, MySQL INSERT is used to input data into your DB Tables. So if the data is correct in the sql dump file and even in some cases when you view the DB table data with phpMyAdmin then hidden characters may not be visible to the eye. If you want to check both the sql dump file data and the data in DB Tables you can use Notepad++, which will show you hidden characters that are not visible normally.

    AITpro

    (@aitpro)

    This is obvious, but may not be obvious to everyone. You would have to copy data into a new Notepad++ document and then view that data. If there are hidden characters then they will be displayed in the new Notepad++ doc.

    Thread Starter barnez

    (@pidengmor)

    That was a very useful suggestion. If I check the sql dump in Notepad++ the currency symbols ($ £) appear fine but every apostrophe is preceded by a backslash. So:

    Client’s failure on the website become Client\’s failure in the sql file.

    I don’t believe they are becomes I don\’t believe they are in the sql file.

    Is this normal behaviour?

    Thread Starter barnez

    (@pidengmor)

    AITpro

    (@aitpro)

    Yes, it is normal/standard to escape single and double quote characters with backslashes in an sql dump file. So now what you want to do is use phpMyAdmin or install the Adminer plugin (one of my favorite plugins) and go to the database table where the data is not correct and copy that data (value) to Notepad++ and check it.

    AITpro

    (@aitpro)

    I cannot see the characters that appear in place of the apostrophe after the restore.

    Probably because they are hidden formatting characters that are not visible to the naked eye, unless you copy that data to something like Notepad++ that shows hidden characters.

    Thread Starter barnez

    (@pidengmor)

    Yes, it is normal/standard to escape single and double quote characters with backslashes in an sql dump file.

    Understood.

    So, I tried accessing the database through Addminer plugin, located sections of data (strings of text) that were corrupted on the restore and copied that into the Notepad++. The result is text that contains correct and uncorrupted apostrophes and currency symbols.

    Have I followed the correct process?

    AITpro

    (@aitpro)

    Ok you have eliminated that it is an input problem, which means it is an output problem and those are much easier to fix since you do not have to muck around in someone else’s code and do mods. For some reason your Headers are not outputting HTML as UTF-8. And a likely suspect is WP Super Cache since it does do things with Headers. You can now start doing all of the standard WP troubleshooting steps of eliminating plugins one by one to find out where the problem is. And you also need to switch your theme since themes do things with Headers. The good news is there is nothing wrong with your database or whatever DB backup plugin you are using or with importing DB backups directly from your host.

    Thread Starter barnez

    (@pidengmor)

    Excellent. It’s a relief to hear that the issue is plugin or theme related rather than the database or backup plugin. I am now on much more familiar territory with troubleshooting. I also suspect SuperCache and in the last 6 months I’ve started to use the preload function which was not the case previously and could be relevant. As these sites are live, I really need to now set up test sites before I start restoring and testing.

    Your help here has been invaluable, and given me fresh motivation to resolve this. Thanks again, and I’ll report back on the findings.

    AITpro

    (@aitpro)

    Yep, the symptoms warranted “going deep” first to eliminate a major problem with your DB. ?? So yeah this is just going to be some nuisance frontend thing. You can do do a theme switch pretty quickly – 1 minute or less. ie switch from your current theme to a WP theme, test and switch back. Then the next thing I would do is flush all WP Super Cache and maybe change a couple of likely settings and test. Then eliminate your own Browser by deleting your Browser cache. Check your site from another Browser application. Check your site from a Proxy. And for a testing site you probably want to clone your site and then start reverse engineering it. ??

    AITpro

    (@aitpro)

    Oh and you can install Firefox and these add-ons: Firebug, FirePHP and YSlow to check your Headers for Content-Type: text/html; charset=UTF-8

    HTTP/1.1 200 OK
    Date: Fri, 13 Nov 2015 00:25:56 GMT
    Server: Apache
    X-Pingback: https://www.ait-pro.com/aitpro-blog/xmlrpc.php
    X-Frame-Options: SAMEORIGIN, SAMEORIGIN
    Vary: Accept-Encoding,User-Agent
    Content-Encoding: gzip
    Cache-Control: private, no-cache, no-store, proxy-revalidate, no-transform
    Expires: Sun, 15 Nov 2015 00:25:56 GMT
    X-Content-Type-Options: nosniff
    Pragma: no-cache
    Content-Length: 11419
    Keep-Alive: timeout=5, max=100
    Connection: Keep-Alive
    Content-Type: text/html; charset=UTF-8
Viewing 14 replies - 1 through 14 (of 14 total)
  • The topic ‘Corrupted formatting of certain characters after restore of database’ is closed to new replies.