• nwevhajkan

    (@nwevhajkan)


    Hello,

    Two questions about the pages/URLs that WordPress auto-generates, for example image files (/image-gallery/ or /training-banner-image-3/), blog categories (/category/training/), icons used on the site (/trainer/icon/), etc.

    1. Where are all these pages/URLs in my Dashboard? They’re definitely not in Pages/All Pages.
    2. Can I have Google Search Console stop crawling them by adding noindex to them? (I realize this is sorta an SEO question, but it’s basic so please forgive me!)

    Basically, I don’t want these types of pages to show up in my GSC under indexed pages (I don’t want them indexed). Main question: how do I alter their HTML, and where do I find these pages in WP?

    Thank you!

    ~Don Gorr

  • clayp

    (@clayp)

    1. Your media files (/image-gallery/) are uploaded to the /wp-content/uploads/ folder, which you can browse from your hosting control panel.
    2. If you have access to the server configuration files, I suggest you also add an X-Robots-Tag header, so the images won’t be indexed even if they are hot-linked; see the example under “Stop crawling images using the server’s configuration file” below.

    Stop crawling images using the WordPress dashboard

    1. Log into your WordPress dashboard.
    2. Go to the media library.
    3. Find the image you want to prevent from being indexed.
    4. Search for the “Visibility” option in the right sidebar.
    5. Set the visibility option to hidden or private.
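
    If the “Visibility” control isn’t available in your media library, a small code snippet can achieve a similar result for attachment pages such as /training-banner-image-3/. The sketch below is only a minimal example, assuming WordPress 5.7 or later (where the wp_robots filter exists); it goes in your theme’s functions.php or a small plugin.

    PHP:
    // Minimal sketch: mark attachment pages as noindex via the wp_robots filter.
    // Assumes WordPress 5.7+, where this filter was introduced.
    add_filter( 'wp_robots', function ( $robots ) {
        if ( is_attachment() ) {
            $robots['noindex']  = true;
            $robots['nofollow'] = true;
        }
        // A similar check (e.g. is_category()) could cover category archives.
        return $robots;
    } );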

    Stop crawling images using robots.txt rules

    User-agent: Googlebot-Image
    Disallow: /images/ex1.jpg

    To exclude multiple images from being indexed on your site, add a disallow rule for each image. Alternatively, if the images follow a common naming pattern, such as a shared prefix or suffix in the filename, use the * wildcard character in the path. For example:

    User-agent: Googlebot-Image
    # Repeated 'disallow' rules for each image:
    Disallow: /images/ex1.jpg
    Disallow: /images/ex2.jpg
    Disallow: /images/ex3.jpg

    # Wildcard in the filename for images that
    # share a common naming pattern:
    Disallow: /images/pictures-*.jpg
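
    If you would rather keep Googlebot-Image out of every upload at once (assuming the default /wp-content/uploads/ path), a single directory rule is enough:

    User-agent: Googlebot-Image
    Disallow: /wp-content/uploads/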

    Stop crawling images using the server’s configuration file

    To include the X-Robots-Tag in a website’s HTTP responses, modify the configuration files of your site’s web server software. On Apache-based servers, for instance, you can use the .htaccess and httpd.conf files. Adding an X-Robots-Tag header to HTTP responses has the advantage of defining crawling rules that apply universally across a site, and regular expressions give you a high level of flexibility.

    You can use the X-Robots-Tag for non-HTML files, such as image files, where robots meta tags in HTML are not an option. Here’s an example of adding a noindex X-Robots-Tag rule for image files (.png, .jpeg, .jpg, .gif) across an entire site:

    Apache:
    <Files ~ "\.(png|jpe?g|gif)$">
        Header set X-Robots-Tag "noindex"
    </Files>

    Nginx:
    location ~* \.(png|jpe?g|gif)$ {
        add_header X-Robots-Tag "noindex";
    }
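
    Once the header is in place, you can check that it is actually being sent by requesting one of your images and looking at the response headers. The URL below is only a placeholder for one of your own uploads:

    curl -I https://example.com/wp-content/uploads/sample.jpg
    # Expect to see: X-Robots-Tag: noindex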

    Thread Starter nwevhajkan

    (@nwevhajkan)

    Thank you Clay.

  • The topic ‘Stop auto-generated URLs from being crawled’ is closed to new replies.