When analyzing robots.txt with
https://pagedart.com/tools/robots-txt-file-checker/#unknown-directive/
it tells me that line 1 has an error.
That would be your comment; any chance you can remove it?
#This virtual robots.txt file was created by the Virtual Robots.txt WordPress plugin: https://www.www.ads-software.com/plugins/pc-robotstxt/
Also, after saving, the following lines don't show up in robots.txt:
Disallow: /*/amp
Disallow: /*/amp/
With this plugin installed and enabled, Lighthouse still reports that it was not able to download robots.txt.
Hi Marios,
How about a next-level integration?
Why not have a button to update the robots.txt file?
It would need a static section and a bad-bots section.
Take a look here
https://github.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/blob/master/robots.txt/robots.txt
or the main page
https://github.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker
Your Virtual Robots.txt file should have an update button for the bad-bots section.
In the static section I have entries like the following:
User-agent: *
Crawl-delay: 3
Disallow: /lists/
Disallow: /wprm_print/
Disallow: /search/
Disallow: /trackback/
Disallow: /page/*/$
Disallow: /wp-login.php
Disallow: /wp-register.php
Disallow: /it/wprm_print/
…
Then add another button to check your robots.txt file,
for example
https://pagedart.com/tools/robots-txt-file-checker/
Nice plugin.
I’d like to mention a fault.
When the blog_public option is enabled (Admin :: Reading :: Search engine visibility), the plugin will not handle the robots.txt rewrite rule.
Is there a reason behind this behavior?
According to this post
https://stackoverflow.com/a/18316292
and the comment section below it, there could be circumstances where both the robots meta tag and the robots.txt file should be present, to disable both crawling and indexing.
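For context, a minimal reconstruction of the guard in question, assuming the plugin checks the blog_public option the way WordPress core does (this is not the actual plugin source):

// '1' = site visible to search engines; '0' = "Discourage search engines" is ticked.
if ( '1' === get_option( 'blog_public' ) ) {
	// Serve the virtual robots.txt content here.
}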
Hi, since I started using robots.txt, all my cron jobs have failed. Is there a line I need to add to allow cron jobs?
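For reference, robots.txt only affects well-behaved crawlers, so WP-Cron itself should be unaffected; but if an external cron or monitoring service honors robots.txt, explicitly allowing the cron endpoint would look like this (a sketch, not a confirmed fix):

User-agent: *
Allow: /wp-cron.php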
Hi, I have a website with images. I would like my clients to view the images only when they are on my website. However, when I search for my company's name on Google (under Images), all the images are shown. I want to remove all images linked to my website from Google image search. I have entered the following under the settings:
User-agent: Googlebot-Image
Disallow: /
However, the images still appear in Google's image search. Please advise if there is another way of resolving this. Thank you.
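One caveat worth noting: robots.txt only stops future crawling, and images Google has already indexed can remain in results. A possible additional step, assuming an Apache server with mod_headers enabled (the post doesn't say), is to send a noindex header for image files via .htaccess, then request removal in Google Search Console:

# Tell crawlers not to index image files on this site.
<FilesMatch "\.(png|jpe?g|gif|webp)$">
	Header set X-Robots-Tag "noindex"
</FilesMatch>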
Hello!
Does this plugin work on WP Multisite, and is it compatible with it?
Thanks!
I wanted to suggest modifying the plugin with these features. I am not a pro, so if you believe this is wrong, I would appreciate learning about it. It's just that I believe the plugin would be safer if you updated it to:
-insert a blank index.php file in the plugin folder;
-insert defined('ABSPATH') or die('Cannot access pages directly.'); at the top of each PHP file;
-modify the line update_option( 'pc_robotstxt', $options ); so that the options are sanitized before they are saved.
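A possible reading of that last suggestion (a sketch only; it assumes $options is an array with a user_agents key holding the textarea contents, and it uses sanitize_textarea_field rather than sanitize_text_field because the robots.txt rules need their line breaks preserved):

// Sanitize the submitted rules before persisting them, keeping newlines intact.
$options['user_agents'] = sanitize_textarea_field( $options['user_agents'] );
update_option( 'pc_robotstxt', $options );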
Big respect for your work, I really like the plugin.
Cheers
Hey there,
It took a while to notice, but after some investigating, it appears that when new sites are created in a multisite, those sites throw a 404 error for robots.txt when you try to access it, which they didn't before the last update. Sites launched before the latest update are unaffected and working properly. They are also unaffected if you make changes and re-save the robots.txt data.
I believe the changelog for that update indicated this was a fix for PHP7, and we have been running PHP7 this entire time and never had issues.
We already know it does not work while it has (what we consider) the “development” path of the site (networkdomain.com/site).
This problem has been occurring after we launch the site and map the domain for that site, including changing the site_url via the DB and doing a search-and-replace in the DB tables for what the launched URL will be, etc. (e.g. from networkdomain.com/site to site.com).
Once we launched the last couple of sites to their respective domains, trying site.com/robots.txt takes you to the 404 page.
Could you assist with this please?
Server Details:
Apache 2.4.29
PHP 7.1.21
WP 4.9.8
Plugin Version 1.9
Additional details:
Applicable DB records in options table mentioning plugin:
_transient_plugin_slugs: "pc-robotstxt/pc-robotstxt.php"
pc_robotstxt: `a:2:{s:11:"user_agents";s:272:"User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-includes/
Allow: /wp-includes/js/
Allow: /wp-includes/images/
Disallow: /trackback/
Disallow: /wp-login.php
Disallow: /wp-register.php
Sitemap: https://(omitted)/sitemap.xml";s:15:"remove_settings";b:0;}`
On sites where it is working, there is a record for robots.txt in the rewrite_rules row that contains: "robots\.txt$";s:18:"index.php?robots=1";
but this does not exist on any of the sites currently not working. I strongly believe this is the problem, since index.php?robots=1 shows what is supposed to be robots.txt.
I created a 301 redirect for this on engtech.services, which is why on that site /robots.txt takes you to /index.php?robots=1 (visible in the URL). That is not ideal; it needs to be /robots.txt.
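If the missing rewrite rule really is the cause, one workaround worth trying (an assumption on my part, not a confirmed fix) is to flush the rewrite rules on an affected site so WordPress regenerates the robots\.txt$ to index.php?robots=1 mapping:

// Run once on an affected site (e.g. in a temporary mu-plugin), then remove it.
add_action( 'init', function () {
	flush_rewrite_rules();
} );

Visiting Settings > Permalinks and clicking Save triggers the same flush.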
Websites currently experiencing this issue (that were launched after the latest update):
https://townofdish.com
https://engtech.services
Examples of a couple Websites on the same multisite that were launched pre-update that are working:
https://cityofmaypearl.org
https://www.texassteakcookoff.com
There are no errors in Apache error logs.
Access log shows status code returned as 404 for robots.txt
There are no errors in PHP error logs.
We follow the same process each time we launch a site, and it has not changed in over a year. Thanks for looking into this!
I entered the following into the robots.txt file:
User-agent: *
Disallow:
And the search engine still tells me “We would like to show you a description here but the site won’t allow us.”
I checked the General Settings and did not see the box that blocks search engines from indexing my site.
What else can I do to make the metadata visible?
Dear Marios,
Generally speaking, I like this plugin. However, in the past few days, Google AdSense has been throwing tons of errors at me saying that sitemap.html is marked “noindex” in the robots.txt file created by the plugin. I cannot find any such limitation in the file, nor any setting that would block the sitemap from being followed by the bot.
AdSense is also saying that robots.txt is blocking them from dozens of URLs that, in fact, do not exist and never have.
Any ideas? I have deactivated the plugin temporarily.
Despite checking the option, “When you tick this box all saved settings will be deleted when you deactivate this plugin”, the settings remained after deactivating the plugin. Even after uninstalling the plugin, they remained. Installing a different robots.txt editor didn’t help either.
What can I do to fix this?
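If it helps anyone in the same spot, the stored settings can be removed by hand (assuming pc_robotstxt is the only option the plugin writes, as the serialized record quoted earlier in this thread suggests):

// Delete the plugin's saved settings directly.
delete_option( 'pc_robotstxt' );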
When I try to view the robots.txt file that’s being created through this plugin, it displays an empty version of the XML sitemap that’s created through the Yoast SEO plugin.
Does anyone know what could be causing this and how to fix it?
I have had a huge problem for some time: a bad plugin set up a virtual robots.txt for me:
User-agent: *
Crawl-delay: 10
and I cannot get rid of it.
I put a physical robots.txt in place, but to no effect, because Google still does not index me.
I just tried your plugin (and another one before it) and nothing changes. I save your settings,
but my file still shows
User-agent: *
Crawl-delay: 10
What more can I do?
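One way to trace this (a debugging sketch; it assumes the stray rules are injected through WordPress's robots_txt filter, which is how virtual robots.txt files are usually built):

// Temporarily dump every callback hooked to the robots_txt filter
// to see which plugin is adding the Crawl-delay lines.
global $wp_filter;
var_dump( $wp_filter['robots_txt'] );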
After saving the settings, I click on “preview robots.txt” and get a 404 page-not-found error. I do not have a physical robots.txt in place. The URL in the preview window is https://domain/robots.txt
Can you help please?
extralargemarketing.com. Thanks!
Hi!
The description says, “By default, the Virtual Robots.txt plugin has a bunch of spam-bots disallowed, the Google bots specifically allowed, …”. Can you tell me where this is set?
Also, I am curious about some of the default settings; I'm not sure why one would want robots to crawl directories like Allow: /wp-includes/js/ or Allow: /wp-admin/admin-ajax.php.
Thank you!
Hi Support,
I just submitted a sitemap for a new site to Google and got this warning
“Sitemap contains urls which are blocked by robots.txt.”
(27 of them)
I haven’t touched my robots.txt file at all. Do you have any suggestions? Could your plugin help?
My sitemap is created by All in One SEO.
Many Thanks
Apparently my comment was deleted on your website, so I have come here.
I downloaded your plugin, and since then I can't rewrite my robots.txt. Your plugin doesn't work; I tried to edit the file, but that doesn't work either. No problem, I removed your plugin, BUT the robots.txt stayed. I have searched EVERYWHERE and I cannot edit this robots.txt. I tried installing other robots.txt plugins, but yours overrides them every time. I tried adding a physical file, and again your plugin overrides it. I tried deactivating plugins; the robots.txt is always the same. I don't know what is going on, but I have been searching for four hours.
Please tell me what I need to do.
This does not work with nginx and php7.0-fpm; I get only this text:
404 Not Found
nginx
But permalinks are working normally.
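For anyone hitting the same thing: a common cause (an assumption about this setup, not a confirmed diagnosis) is an nginx location block that answers robots.txt before PHP ever sees the request. Letting it fall through to WordPress would look like:

location = /robots.txt {
	try_files $uri /index.php?robots=1;
	access_log off;
}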
Hello.
Thank you for your plugin development.
The PHP Compatibility Checker plugin from the WordPress repository generated the following two errors when testing against PHP 7:
FILE: /plugins/pc-robotstxt/pc-robotstxt.php
———————————————————————————–
FOUND 1 ERROR AFFECTING 1 LINE
———————————————————————————–
35 | ERROR | Deprecated PHP4 style constructor are not supported since PHP7
———————————————————————————–
FILE: /plugins/pc-robotstxt/admin.php
—————————————————————————-
FOUND 1 ERROR AFFECTING 1 LINE
—————————————————————————-
4 | ERROR | Deprecated PHP4 style constructor are not supported since PHP7
—————————————————————————-
I'm sure you know of this, but for others researching the topic, see the PHP documentation page on removing PHP4-style constructors.
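For anyone patching this locally, the usual fix is renaming the PHP4-style constructor to __construct (the class name below is a placeholder, since the plugin source isn't quoted here):

class PC_RobotsTxt {
	// PHP4 style, deprecated since PHP 7:
	// function PC_RobotsTxt() { ... }

	// PHP5+ style:
	public function __construct() {
		// ...original constructor body...
	}
}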
Curt
When I edit the file in the plugin by adding or removing lines
(e.g. Disallow: /about-us/ ) or making other additions or deletions,
and then I preview the file, nothing is changed in the preview.
The plugin doesn’t appear to write to the file.
Any help appreciated
Bob P
I have other plugins that add entries to my robots.txt file, but those entries disappear when I activate your plugin.
Something in the fix for removing line breaks is removing the blank lines that were in my robots.txt file. This did not occur after updating to plugin version 1.6; only after updating to 1.7.
I believe that the stripslashes function, applied to the options inside the textarea field (in admin.php) and then possibly again in the do_robots function (in pc-robotstxt.php), may be causing the error where the backslashes of the \n line endings are stripped.
The end result is a problematic one-liner that must be separated back out by hand in the textarea box. Many WordPress users are not adept at this, because they don't understand the purpose of the code in the first place. It's easy to unwind this improperly, even with instructions.
Then they get errors from Google and are upset at my recommendation of this otherwise terrific plugin. I love it, other than for this problem.
Can you please fix it? Thank you!
–Fran
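A possible direction for a fix, sketched under the assumption that a doubled stripslashes pass is the culprit (the option and field names mirror the ones Fran mentions and the serialized record quoted earlier):

// admin.php: unslash the POSTed rules exactly once before saving.
$options['user_agents'] = wp_unslash( $_POST['user_agents'] );
update_option( 'pc_robotstxt', $options );

// pc-robotstxt.php (do_robots): output the stored value as-is,
// with no second stripslashes pass.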
Hello,
On lines 102 and 104 of pc-robotstxt.php you hardcode the http:// protocol for the auto-discovered sitemap file in the root of the website. However, more and more people are using the https:// protocol for security and privacy.
Please check is_ssl(), or use home_url(), or something else that is protocol-aware.
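A sketch of what the requested change might look like (the variable name is an assumption, not the plugin's actual code):

// Build the sitemap URL from the configured home URL, which already
// carries the site's actual http/https scheme.
$sitemap_url = home_url( '/sitemap.xml' );

// Equivalent manual check with is_ssl():
$scheme = is_ssl() ? 'https' : 'http';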
When I click on “preview your robots.txt file here” a new window does open, but I get a “The webpage cannot be found” message.
Searching through my files on my webhost, I do not find a robots.txt file.
Nor can I view the file at my URL: https://www.stbedesantafe.org/robots.txt
My theme is Preferential Lite.
Thank you so much for the plugin. We would like to see an update for the new WordPress versions if you have some time.
Thanks,
Imad
I wanted to add this plugin, but it doesn't show up at all in the plugin search. I even entered “PC Robots.txt” to locate it, and it still did not show up. Is this plugin no longer available? I tried downloading it and was not successful with the upload.