Forum Replies Created

Viewing 14 replies - 1 through 14 (of 14 total)
  • Thread Starter mattdawsonuk

    (@mattdawsonuk)

    Hi @tararebeka.
    Check my last post on this thread for the fix. https://www.ads-software.com/support/topic/iso-matts-patches-to-extend-cleanphp/
    Hope that helps.
    Matt

    Hi Roger.

    I spotted earlier this week that Google have changed their bold styling to use ‘font-weight:700’ instead of ‘font-weight:bold’.

    So you just need to change ‘font-weight:bold’ to ‘font-weight:700’ in my fix code and it should sort it.
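
    A minimal sketch of a pattern that accepts both values, so the fix keeps working if Google switches back. It is not the plugin's own code: the helper names are hypothetical, and it assumes the exported styles contain rules such as .c8{font-weight:700} and that bold text arrives as <span class="c8">.

```php
<?php
// Hypothetical helper, not part of extend-clean.php.
// Returns the class names whose CSS rule sets a bold font weight,
// accepting both the old keyword (bold) and the newer numeric value (700).
function find_bold_classes($css) {
    preg_match_all(
        '/\.(c\d+)\s*\{[^}]*font-weight\s*:\s*(?:bold|700)[^}]*\}/',
        $css,
        $matches
    );
    return $matches[1];
}

// Hypothetical helper: turn spans carrying one of those classes into <strong>.
// Assumes the spans are not nested inside each other.
function bold_spans_to_strong($html, array $bold_classes) {
    foreach ($bold_classes as $class) {
        $pattern = '/<span class="' . preg_quote($class, '/') . '">(.*?)<\/span>/s';
        $html = preg_replace($pattern, '<strong>$1</strong>', $html);
    }
    return $html;
}

// Made-up sample input for illustration only.
$css  = '.c3{font-style:italic}.c8{font-weight:700}.c9{color:#000;font-weight:bold}';
$html = '<p><span class="c8">Bold Text Here</span> and <span class="c3">not bold</span></p>';
echo bold_spans_to_strong($html, find_bold_classes($css));
```

    Only the c8 span is rewritten here; the italic c3 span is left for whatever handles italics.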

    I’ll have a test of the tracking link removal code later in the week and get back to you on that.

    Hope that helps.

    Matt

    And in case this helps, here are some of the other changes I have made (inserted after the Google tracking removal code):

    1. I needed support for tables, so I added them to the allowed tags on line 74. Mine looks like this (note the addition of the table, tr, td and th tags):

    $post_content = strip_tags($post_content, '<strong><b><i><em><a><u><br><p><ol><ul><li><h1><h2><h3><h4><h5><h6><table><tr><td><th>' );

    2. I was getting Google classes pulling through in some elements, so I also run this:

    //Remove classes (and any other attributes) in elements - mattd
    //\b stops <p also matching <pre ...> and <li also matching <link ...>
    $post_content = preg_replace('/<(p|h[1-6]|li|ul)\b[^>]*>/', '<$1>', $post_content);
    //Links keep their href; only the class attribute is dropped
    $post_content = preg_replace('/<a class="[^"]*"/', '<a', $post_content);

    3. To remove any empty tags I run this:

    //Remove empty elements - mattd
    //One pattern covers <a>, <h1> to <h6> and <div>; \1 requires the same tag name on the closing tag
    $post_content = preg_replace('/<(a|h[1-6]|div)><\/\1>/', '', $post_content);

    As you can see, it is easy to set up your own string replacements to change the content during the Google to WP conversion.
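
    If the list of replacements grows, one way to keep it manageable is a pattern => replacement table passed to preg_replace in a single call. This is only a sketch: the function name and the three sample rules are made up for illustration and are not in extend-clean.php.

```php
<?php
// Hypothetical wrapper; extend-clean.php itself edits $post_content inline.
function apply_cleanup_rules($post_content) {
    // pattern => replacement, applied in the order listed
    $rules = array(
        '/&nbsp;/'                      => ' ',    // non-breaking space entity to plain space
        '/<span\b[^>]*>\s*<\/span>/'    => '',     // drop empty spans
        '/(?:<br\s*\/?>\s*){2,}/'       => '<br>', // collapse runs of line breaks
    );
    // preg_replace accepts parallel arrays and runs each rule on the
    // result of the previous one.
    return preg_replace(array_keys($rules), array_values($rules), $post_content);
}

// Made-up sample input for illustration only.
echo apply_cleanup_rules('<p>A&nbsp;B<span class="c1"></span><br><br/><br>C</p>');
```

    Adding a rule is then a one-line change to the table rather than another copied preg_replace line.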

    To remove the Google tracking from URLs I have added this to extend-clean.php (inserted at line 71):

    //remove google tracking links - mattd
    //strip the redirect prefix in both its http and https forms
    $post_content = str_replace('http://www.google.com/url?q=', '', $post_content);
    $post_content = str_replace('https://www.google.com/url?q=', '', $post_content);
    $post_content = preg_replace('/&sa=D&sntz=1&usg=(.*?)\">/', '">', $post_content);
    $post_content = preg_replace('/&sa=D&usg=(.*?)\">/', '">', $post_content);
    $post_content = preg_replace('/&sa=D&ust=(.*?)&usg=(.*?)\">/', '">', $post_content);
    $post_content = str_replace('%3A', ':', $post_content);
    $post_content = str_replace('%2F', '/', $post_content);
    $post_content = str_replace('%3F', '?', $post_content);
    $post_content = str_replace('%3D', '=', $post_content);

    I’m sure this can be reduced to a single replacement, as I have done that in a Python app that used the Drive API. However, I don’t have time to change and test it. There are two lines that look similar because Google changed the structure of their tracking links about 12 months ago.
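
    For anyone who wants to try the single-replacement route, here is an untested-in-the-plugin sketch. The function name is hypothetical, and it assumes every tracked link has the form href="http(s)://www.google.com/url?q=ENCODED_URL&..." with the real destination percent-encoded in q.

```php
<?php
// Hypothetical helper, not the plugin's own code.
function unwrap_google_redirects($post_content) {
    return preg_replace_callback(
        // Capture q up to the first & (or &amp;) and discard the rest of the href.
        '/href="https?:\/\/www\.google\.com\/url\?q=([^"&]+)[^"]*"/',
        function ($m) {
            // q holds the percent-encoded destination; the sa, sntz, ust and
            // usg parameters that follow are tracking data.
            $url = rawurldecode($m[1]);
            // Re-escape so a decoded & or quote cannot break the attribute.
            return 'href="' . htmlspecialchars($url, ENT_QUOTES) . '"';
        },
        $post_content
    );
}

// Made-up sample link for illustration only.
$link = '<a href="https://www.google.com/url?q=https%3A%2F%2Fexample.com%2Fpage%3Fid%3D7&amp;sa=D&amp;ust=1&amp;usg=AbC">x</a>';
echo unwrap_google_redirects($link);
```

    Decoding with rawurldecode covers all percent-encoded characters, not just the four handled by the str_replace lines, and it only touches text inside tracked links.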

    Hope this helps.

    Matt

    @piantadosi I’ve had a busy couple of days and forgot to send you my code! I’ll send it to you in the morning.
    Matt

    Hi Roger

    Just got your message about the tracking links. You are in luck! I was able to add in some regex replacements to remove the Google tracking. I’ll send over a code snippet when I’m back at my desk on Monday. I should also send in a pull request to get the bold/italic fix merged in with the core!

    Matt

    Thread Starter mattdawsonuk

    (@mattdawsonuk)

    Just adding a note to point to my latest post about a fix for the bold and italic styling. See my post here

    Just adding a note to point to my latest post about a fix for this. See my post here

    Whilst running the beta version for the last year I have been running a modified version of the ‘extend-clean.php’ add-on to strip the Google tracking from hyperlinks, add support for tables and add some styling to certain elements. So I am reasonably familiar with how it works.

    My guess would be that the problem lies at line 16, where it is generating an array of bold spans. I imagine that the regex needs tweaking, but I don’t know enough regex to work out what.

    When you import a post with the extend-clean add-on disabled, bold items come in with a span of <span class="c8">Bold Text Here</span> (note that the digit isn’t always an 8). It looks to me as though this part isn’t needed: {(.*?)font-weight:bold(.*?)} but again I have only dabbled a little with regex.

    Maybe that helps someone. Or maybe not!

    Matt

    Hi piantadosi

    I am experiencing the same since updating to 1.1, as I too was caught out by Google pulling the plug!

    Losing the bold and italic styling is a big deal for me.

    Hopefully someone can find the solution soon.

    Matt

    Thread Starter mattdawsonuk

    (@mattdawsonuk)

    Breakthrough!! Check your folder paths!

    I thought I would give the origin and destination folder values a check and they looked odd! Along the lines of “https://drive.google.com/folderview?id=folderview?id=0AvbaiFDF9adfs8ALJDfadsf9JLKSDFjavadvasdf&usp=sharing”
    (note the double folderview?id=). I removed the duplicate and the id value below the field updated to the equivalent of: “0AvbaiFDF9adfs8ALJDfadsf9JLKSDFjavadvasdf”.

    After editing both folder fields and running the cron it now works!!
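
    To illustrate what a correct value boils down to, here is a rough sketch that pulls the folder ID out of a pasted value, doubled prefix or not. The function name is hypothetical and this is not how the plugin itself parses the field; it assumes the ID is whatever follows the last "id=" up to the next &, ? or #.

```php
<?php
// Hypothetical helper, not part of Docs to WP.
function extract_folder_id($value) {
    // The greedy .* skips ahead to the last "id=" in the string,
    // then the ID runs up to the next &, ? or # (or the end).
    if (preg_match('/.*id=([^&?#]+)/', $value, $m)) {
        return $m[1];
    }
    return $value; // no "id=" present: assume a bare ID was entered
}

// The malformed value from the settings field described above.
$bad = 'https://drive.google.com/folderview?id=folderview?id=0AvbaiFDF9adfs8ALJDfadsf9JLKSDFjavadvasdf&usp=sharing';
echo extract_folder_id($bad);
```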

    Although I am left with an entry in the error log that looks like this after the import has finished: https://docs.google.com/feeds/download/documents/export/Export?id=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX_MsOtxY8&exportFormat=html (id replaced with X’s).

    Do I need to worry about this?

    Also, it seems that all Bold and Italic formatting is being stripped on import. Is there a reason for that? I notice this topic mentions it too: https://www.ads-software.com/support/topic/bolditalic-links-categories-authors-no-longer-transfer?replies=2.

    Thanks all.

    Matt

    Thread Starter mattdawsonuk

    (@mattdawsonuk)

    Well, the frustration is building!

    So, I am now running on a local WordPress install so I have greater control and freedom to hack things about.

    I have deleted any Google app projects I had been experimenting/learning with and created a fresh one. I have also re-installed the Docs to WP plugin and cleared the database entries for the ID and Secret. I have updated to WP 4.2.2. PHP is running at version 5.5.3 and Docs to WP is v1.1.

    I am now further ahead than I was on Monday, as I can now click on Settings > Docs to WP in the side bar and it returns the correct settings page with the status of ‘connected’. This is after I accepted the API permission page. I can now see the ‘auth token’ refresh when I reload the page, and I see the requests appear in the graph on the overview page for the app in the Google developer console.

    However, the posts still aren’t pulling through! I was getting the error “Syntax error, malformed JSON.” in Abstract_Api.php. After reading this topic: https://www.ads-software.com/support/topic/10-beta-questionsissues?replies=26 I have tried removing the following code:

    switch (json_last_error()) {

        case JSON_ERROR_NONE:
            $error = null; // JSON is valid
            break;

        case JSON_ERROR_DEPTH:
            $error = 'Maximum stack depth exceeded.';
            break;

        case JSON_ERROR_STATE_MISMATCH:
            $error = 'Underflow or the modes mismatch.';
            break;

        case JSON_ERROR_CTRL_CHAR:
            $error = 'Unexpected control character found.';
            break;

        // only PHP 5.3+
        case JSON_ERROR_UTF8:
            $error = 'Malformed UTF-8 characters, possibly incorrectly encoded.';
            break;

        case JSON_ERROR_SYNTAX:
            $error = 'Syntax error, malformed JSON.';
            break;

        default:
            $error = 'Unknown JSON error occured.';
            break;

    }

    if( !empty( $error ) ) {

        throw new JsonException($error);

    }

    I now no longer get any errors, but still no posts! Grrr.
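
    Removing that block only hides the exception: when the response is not valid JSON, json_decode still returns null, which may be why nothing gets imported. A hedged alternative is to keep the check and report what actually came back. The function below is a hypothetical stand-in, since the real method in Abstract_Api.php is not shown here; json_last_error_msg() needs PHP 5.5 or later.

```php
<?php
// Hypothetical helper; the real Abstract_Api.php method is not shown in this thread.
function decode_json_or_explain($raw) {
    $data = json_decode($raw, true);
    if (json_last_error() !== JSON_ERROR_NONE) {
        // Include the start of the response so an HTML error page or
        // login page returned in place of JSON is easy to spot.
        throw new RuntimeException(
            json_last_error_msg() . ' Response began: ' . substr($raw, 0, 200)
        );
    }
    return $data;
}

// Made-up response body for illustration only.
try {
    decode_json_or_explain('<html>Sign in</html>');
} catch (RuntimeException $e) {
    error_log($e->getMessage());
}
```

    Seeing the first part of the response in the log usually narrows things down faster than the bare “malformed JSON” message.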

    Can anyone help?? :-/

    Thanks.

    Matt

    Thread Starter mattdawsonuk

    (@mattdawsonuk)

    Thanks. I’ll give that a try when I’m back at work on Thursday. And also make sure I’m signed in correctly.

    Matt

    Thread Starter mattdawsonuk

    (@mattdawsonuk)

    Thanks for your reply tararebeka.

    Yes, I have enabled the API and set up the redirect URL. When using the plugin I actually get to the Google page which asks for the API permission. I get the impression I wouldn’t get that far if the API wasn’t enabled in the developer console. Also, the plugin gets to ‘connected’.

    Do you know what should happen when I click the accept button as per the screen grab I linked to in my original post?

    Matt
