• Resolved turbodb

    (@turbodb)


    I’m using the cloud version of BLC (recently switched from local), and I’m getting an error message:

    Scan aborted: Too many server errors. This is to prevent overloading your server. Please retry the scan in a while or contact support if the issue persists.

    I know that my links do not return a 500 error on my server (most of my links redirect), so I’d like to drill down into a bit more detail on why/where the 500 is getting returned. Is there a way to get that additional information?

    Alternatively, is there a way to see where in the process the scan was aborted? My site has 91223 Total Links and 1054 Unique URLs. I’m not sure what those actual labels mean, but I’d like to understand how many of them were scanned before BLC found the “35” broken links and aborted the scan.


Viewing 10 replies - 1 through 10 (of 10 total)
  • Plugin Support Laura – WPMU DEV Support

    (@wpmudevsupport3)

    Hi @turbodb,

    Hope this message finds you well.

    I know that my links do not return a 500 error on my server (most of my links redirect), so I’d like to drill down into a bit more detail on why/where the 500 is getting returned. Is there a way to get that additional information?

    Indeed, it should not be detected as a 500 error, since the link is redirecting; other links should return a 403 status, since they redirect to Amazon. I can’t confirm it yet, but this might be due to your server firewall. I have notified our BLC team, and they may be able to provide further information.

    Alternatively, is there a way to see where in the process the scan was aborted? My site has 91223 Total Links and 1054 Unique URLs. I’m not sure what those actual labels mean, but I’d like to understand how many of them were scanned before BLC found the “35” broken links and aborted the scan.

    Unique URLs are the ones found on your site only.

    You will find more information about this in our documentation at https://wpmudev.com/docs/wpmu-dev-plugins/broken-link-checker/#broken-link-summary, which also covers the Error Codes.

    Since our BLC team works on very complex issues, getting a reply from them could take more time than usual. We will get back to this topic once we have an update from them.

    Best regards,
    Laura

    Thread Starter turbodb

    (@turbodb)

    Thanks Laura @wpmudevsupport3.

    I’ve exported the CSV of the report from my wpmudev hub, in case it can be helpful to figure out what’s going on. It is available below.

    All of the links in that report are currently resolving for me, so I don’t know why they are showing up as errors in BLC. It’s got me seriously considering either another tool, or going back to the local scanner (which has other issues, but at least seems to get through all the links instead of aborting due to errors).

    broken-links-adventuretaco.com-2024-05-28-070719.csv

    Plugin Support Laura – WPMU DEV Support

    (@wpmudevsupport3)

    Hi @turbodb,

    We got feedback from our BLC team. They performed a few tests and confirmed what I mentioned in my previous reply: it seems you are using CloudFront, and it might be blocking our BLC bot. In such cases, you might need to whitelist our UA and our IPs. You will find them at this link: https://wpmudev.com/docs/wpmu-dev-plugins/broken-link-checker/#broken-link-checker-user-agent.

    Additionally, they shared the results in JSON format. You can take a look at them here: https://drive.google.com/file/d/1V7_1bXrkAl3WMG4XylYyoB4v6CzPlcUm/view?usp=sharing

    Kindly whitelist our User Agent and IPs, run a new scan, and let us know the results.

    Best regards,
    Laura

    Thread Starter turbodb

    (@turbodb)

    Hi Laura,

    I am not using CloudFront at all; my site is hosted on a $5/mo Amazon Lightsail instance. A single box, with a public IP address, and a bitnami stack.

    What tests were performed to determine that I am using CloudFront? (I suppose Amazon could be using it “for free” without my knowledge, but I doubt that would be the case, as they aren’t in the habit of giving away those types of services).

    Thanks,
    Dan

    Plugin Support Zafer – WPMU DEV Support

    (@wpmudevsupport15)

    Hi @turbodb,

    I hope you are doing well today!

    According to the cURL results, the site is using CloudFront, and our BLC team noticed that our User Agent is being blocked by it.

    You can confirm this by running the following:

    curl -IL https://adventuretaco.com/go/if-everybody-did-jo-anne-stover/

    The above command will return a 503 error. Once the UA is changed, for example:

    curl -IL -A "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/81.0" https://adventuretaco.com/go/if-everybody-did-jo-anne-stover/

    the result will be a 200, so you should unblock the BLC UA:
    https://wpmudev.com/docs/wpmu-dev-plugins/broken-link-checker/#broken-link-checker-user-agent

    Kind regards,
    Zafer
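
The two curl commands above differ only in the User-Agent header, so the same check can be scripted. The following is a minimal sketch, not anything BLC itself runs: `final_status` and `compare_user_agents` are hypothetical helper names, and the sketch uses Python's `urllib`, which follows redirects by default and sends its own default User-Agent (not curl's) when none is given.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Optional


def final_status(url: str, user_agent: Optional[str] = None) -> int:
    """Return the HTTP status of the last response after redirects are followed."""
    # With no explicit agent, urllib sends its own default User-Agent header.
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return err.code


def compare_user_agents(
    url: str,
    agents: Dict[str, Optional[str]],
    fetch: Callable[[str, Optional[str]], int] = final_status,
) -> Dict[str, int]:
    """Request the same URL once per user agent and map each label to its status."""
    return {label: fetch(url, agent) for label, agent in agents.items()}
```

Calling `compare_user_agents(url, {"default": None, "browser": "<browser UA string>"})` with the link and the Firefox UA string from the commands above would show whether the status changes with the User-Agent alone. Because redirects are followed, the status reported is that of the final destination, not necessarily of the site that hosts the original link.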

    Thread Starter turbodb

    (@turbodb)

    Thanks Zafer @wpmudevsupport15,

    I think what’s going on here is that, when you run the curl command described, a redirect occurs on my site (adventuretaco.com) and then CloudFront is being used at the redirect destination (amazon.com).

    This means – I think – that my site, running on an AWS Lightsail instance, does not use CloudFront, but that BLC is running into HTTP 5xx errors after the redirects because Amazon is doing something to block the bot traffic.

    See below for the trace:

    curl -IL https://adventuretaco.com/go/if-everybody-did-jo-anne-stover/
    HTTP/2 302 
    x-robots-tag: noindex, nofollow
    x-redirect-by: WordPress
    location: https://amzn.to/3tvpTdU
    cache-control: max-age=0
    expires: Tue, 04 Jun 2024 00:16:01 GMT
    vary: Accept-Encoding
    content-type: text/html; charset=UTF-8
    date: Tue, 04 Jun 2024 00:16:01 GMT
    server: Apache
    
    HTTP/2 301 
    cache-control: private, max-age=90
    content-security-policy: referrer always;
    content-type: text/html; charset=utf-8
    date: Tue, 04 Jun 2024 00:16:02 GMT
    location: https://www.amazon.com/If-Everybody-Did-Ann-Stover/dp/0890844879?crid=29R42WY6XBCF3&dchild=1&keywords=everybody+did&qid=1612561014&sprefix=everybody+did,aps,364&sr=8-2&linkCode=sl1&tag=srchamzn-20&linkId=ab58ffd173d8b3808625c3ed5343cf51&language=en_US&ref_=as_li_ss_tl
    referrer-policy: unsafe-url
    server: nginx
    set-cookie: _bit=o540g2-c78698ce62c20457a9-00t; Domain=amzn.to; Expires=Sun, 01 Dec 2024 00:16:02 GMT
    strict-transport-security: max-age=1209600
    content-length: 395
    
    HTTP/2 503 
    content-type: text/html
    date: Tue, 04 Jun 2024 00:16:02 GMT
    server: Server
    accept-ranges: bytes
    x-amz-rid: 1NXFP2E2NDKH7SRW08J9
    vary: Content-Type,Accept-Encoding,User-Agent
    etag: "a6f-6187f291ddc80"
    strict-transport-security: max-age=47474747; includeSubDomains; preload
    last-modified: Wed, 15 May 2024 14:44:50 GMT
    x-cache: Error from cloudfront
    via: 1.1 646b6f21a2659c68f7a3822d035b97d2.cloudfront.net (CloudFront)
    x-amz-cf-pop: NRT57-C2
    alt-svc: h3=":443"; ma=86400
    x-amz-cf-id: Ga9AudCUHmBDKRawpBRiIak02Bq7xlhXTqcZSsrJNS4ETzZArurkeg==

    It seems a little strange to me that BLC wouldn’t work for amazon links, as it seems like *a lot* of the links that people would want to check for blogs would be their affiliate links to amazon.

    Given all this, I have two questions:

    1. Is it expected behavior that BLC doesn’t work with links to amazon?
    2. It seems to me that BLC should not stop processing external links if a redirect has occurred prior to receiving an HTTP 500, or if the redirect is no longer on the same domain as the original external link. Would it be possible to change BLC’s behavior to continue processing these types of links?

    Thanks!
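
The behavior requested in question 2 can be sketched as a counting rule. This is purely illustrative: how BLC actually counts server errors, and what its abort threshold is, is not stated in this thread, so `counts_toward_abort`, `should_abort`, and `max_server_errors` are hypothetical. The sketch implements only the narrower of the two suggestions: a 5xx is ignored when the response that produced it came from a different host than the original link.

```python
from typing import List, Tuple

# One redirect chain: (host that answered, HTTP status) for each response, in order.
Chain = List[Tuple[str, int]]


def counts_toward_abort(chain: Chain) -> bool:
    """True if the chain ends in a 5xx served by the original link's own host."""
    if not chain:
        return False
    first_host, _ = chain[0]
    final_host, final_status = chain[-1]
    if final_status < 500:
        return False
    # A 5xx from a different host (after a cross-domain redirect) is not counted.
    return final_host == first_host


def should_abort(chains: List[Chain], max_server_errors: int) -> bool:
    """True once more than max_server_errors chains count as server errors."""
    errors = sum(1 for chain in chains if counts_toward_abort(chain))
    return errors > max_server_errors
```

Under this rule, the chain from the trace above (adventuretaco.com 302, amzn.to 301, www.amazon.com 503) would not count toward aborting the scan, while a 500 returned directly by the original link's host still would.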

    Plugin Support Saurabh – WPMU DEV Support

    (@wpmudev-support7)

    Hello @turbodb

    Hope you’re doing well.

    Thank you for your observations. Indeed, it looks like CloudFront could be involved at the amazon.com end. However, I ran some additional tests using Postman and the results were a bit different. I have shared my findings with our BLC team and am confirming the details with them.

    As for changing when BLC stops processing external links, I am checking with the BLC team whether something like that is possible.

    We will share an update here as soon as we receive further insights on those points from the BLC team.

    Regarding BLC’s reporting of Amazon links: we have had reports of affiliate links returning HTTP 5XX errors from the Amazon end during scans. This happens when we make multiple requests, which is why we already skip some links (https://wpmudev.com/docs/wpmu-dev-plugins/broken-link-checker/#scanned-skipped), and the developers are planning to add Amazon links to that list.

    One of the reasons for the new engine is that we have a dedicated bot, so we can ask those providers to allow it. However, we don’t have an ETA or any guarantee that they will allow the scan.

    Kind Regards,
    Saurabh

    Plugin Support Nithin – WPMU DEV Support

    (@wpmudevsupport11)

    Hi @turbodb,

    As stated above, we have already brought this to our team’s attention to check whether any improvements could be implemented down the roadmap.

    Since our team will be exploring features to improve the workflow regarding this, I’ll go ahead and mark it as resolved for now.

    For any new feature updates, you can follow our progress by subscribing to our roadmap at https://wpmudev.com/roadmap/.

    Kind Regards,

    Nithin

    Thread Starter turbodb

    (@turbodb)

    Hi Nithin,

    Before you mark this as resolved, there were two issues that were going to be followed up on:

    However, I was able to make some additional tests using Postman and the results were a bit different, I am confirming more about this with our BLC team and have already shared my findings with them. 

    Also about the BLC to stop processing external links, I am checking about this with the BLC team if something like that could be possible.

    Plugin Support Nebu John – WPMU DEV Support

    (@wpmudevsupport14)

    Hi @turbodb,

    I have checked and confirmed that all the findings from our investigation have been brought to the attention of our developers for further review and improvement. Our developers are actively looking into this and further updates will be included in our roadmap as we have mentioned in our above response.

    Kind Regards,
    Nebu John
