• Hi, we’re getting millions of spammy search-parameter URLs for our site picked up by Google, which is affecting our crawl budget. We’ve noindexed or 404’d most of the pages but are still seeing errors. Any advice on clearing these from GSC?

    We tried blocking the pages in robots.txt and also noindexing the search results pages, but we still appear to have quite a serious spam issue. We have also looked at blocking further bots through our CDN, Cloudflare, and installed a spam-blocking search plugin. Lastly, we have disavowed via GSC any potentially toxic backlinks flagged in SEMRush’s backlink audit report.

    It looks as though the crawling issue might be resolved now, after we took the following actions:
    Disavowing any potential spam links
    Putting further search spam protection in place through the Relevanssi plugin
    Putting further restrictions on bots through Cloudflare and our host
    Blocking all bots from scanning search results pages
    Blocking search pages in robots.txt
    Noindexing the search results pages

    It looks as though the spam was coming from four main websites:
    domain:879783.com
    domain:Bityard.com
    domain:k8vip99.cc
    domain:X1798.com

    Example URLs include:

    https://manofmany.com/search/Home/page/49&filter=popular%2Fpage%2F3%2Fpage%2F5%2Fpage%2F4%2Fpage%2F3%2Fpage%2F3%2Fpage%2F3%2Fpage%2F132%2Fpage%2F131%2Fpage%2F129%2Fpage%2F130%2Fpage%2F130%2Fpage%2F128%2Fpage%2F130%2Fpage%2F131%2Fpage%2F132%2Fpage%2F131%2Fpage%2F129%2Fpage%2F127%2Fpage%2F126%2Fpage%2F127%2Fpage%2F126%2Fpage%2F126%2Fpage%2F125%2Fpage%2F126%2Fpage%2F124%2Fpage%2F124%2Fpage%2F123%2Fpage%2F124%2Fpage%2F124%2Fpage%2F122%2Fpage%2F145%2Fpage%2F146%2Fpage%2F145%2Fpage%2F146%2Fpage%2F147%2Fpage%2F145%2Fpage%2F143%2Fpage%2F142%2Fpage%2F143%2Fpage%2F144%2Fpage%2F145%2Fpage%2F143%2Fpage%2F145%2Fpage%2F143%2Fpage%2F143%2Fpage%2F145%2Fpage%2F147/page/147
    
    https://manofmany.com/page/198?s=C?+vua+hy+v?ng【879783.com】N?n+t?ng+l?n+??+nh?n+phong+bì+??+m?i+tu?n】ph?i+t?i+v?+apple【879783.com】X?p+h?ng+tín+d?ng+cao+nh?t】ciu0f22z0
    
    https://manofmany.com/page/189?s=X?+s?+bóng+?á+4+seri+3+trong+3+trò+ch?i+cách+tính+ti?n【?i+vào+link∶879783.com】Sòng+b?c+tr?c+tuy?n+??u+tiên+trên+th?+gi?i】Phiên+b?n+máy+tính+b?c+th?y+cau+cá【879783.com】N?n+t?ng+trò+ch?i+???c+ch?+??nh+cho+World+Cup】p6zsj21wk
    
    https://manofmany.com/page/164?s=bitcoin+price+graph|Bityard.com

    Any further advice would be greatly appreciated.

    Kind regards,
    Scott

    • This topic was modified 10 months, 3 weeks ago by Yui. Reason: put links into code block
Viewing 1 replies (of 1 total)
  • Plugin Author Mikko Saari

    (@msaari)

    First, this isn’t really the right place to ask. I can’t provide Google Search Console support; I’m not Google, and I don’t know much about GSC.

    Second, these searches have nothing to do with Relevanssi. You would be getting this spam traffic, Relevanssi or not. I don’t think it’s terribly dangerous either because just about everybody’s getting this, and Google should be able to tell that this is spam traffic everybody’s getting.

    That said, there are things you can do. If the spam is coming from specific sources (a small range of IP addresses), you can block those IP addresses on the server level so that these bots won’t even boot up WordPress. That would be the best solution.
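
    To illustrate the IP-range check described above, here is a minimal Python sketch. It is not server configuration and not WordPress code; in practice the block would live in your web server, firewall, or CDN rules. The network ranges are placeholders from the reserved documentation ranges, and `BLOCKED_NETWORKS` and `is_blocked` are hypothetical names; the real ranges would come from your own access logs.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges standing in
# for the real offending networks found in your own access logs.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed address: this sketch treats it as not blocked.
        return False
    return any(addr in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("203.0.113.7", "192.0.2.1"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

    The point of doing this check before the application starts is that a rejected request costs almost nothing, whereas a request that reaches WordPress runs a full search query.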

    Relevanssi Premium has a spam block feature where you can set up keywords that will stop these spam queries from running and instead return a 410 Gone server response. This 410 response will eventually tell Google that these search results do not exist.

    Here’s how you can implement keyword-based spam blocking without Relevanssi Premium.
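
    As a rough illustration of the keyword-based idea, here is a Python sketch of the decision logic only. It is not Relevanssi's code and not a WordPress hook; a real site would implement the same check in PHP. The keyword list reuses the spam domains named earlier in this thread, `s` is the search parameter seen in the example URLs, and `BLOCKED_KEYWORDS` and `status_for_search` are hypothetical names.

```python
from urllib.parse import parse_qs, urlsplit

# Spam domains taken from this thread, lowercased for case-insensitive matching.
BLOCKED_KEYWORDS = ["879783.com", "bityard.com", "k8vip99.cc", "x1798.com"]


def status_for_search(url: str) -> int:
    """Return 410 if the search term contains a blocked keyword, else 200."""
    query = parse_qs(urlsplit(url).query)
    # "s" is the search parameter used in the example URLs above.
    term = " ".join(query.get("s", [])).lower()
    if any(keyword in term for keyword in BLOCKED_KEYWORDS):
        return 410  # Gone: tells crawlers this URL should be dropped
    return 200


if __name__ == "__main__":
    print(status_for_search(
        "https://manofmany.com/page/164?s=bitcoin+price+graph|Bityard.com"
    ))
```

    Matching on the spammer's domain rather than on generic words keeps the risk of blocking a genuine visitor's search low.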

    You can also set up robots.txt instructions to tell Googlebot not to crawl your search results pages at all. That would be another solution to clear these from your Google indexing.
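
    As a sketch of what such robots.txt rules could look like, the Python snippet below holds an assumed rule set and a simplified matcher for checking URLs against it. The two `Disallow` patterns are assumptions derived from the example URLs in this thread, not rules taken from any real site. The matcher handles only the `*` and `$` wildcards that Googlebot supports, and ignores user-agent groups and `Allow` rules, so treat it as a sanity check rather than a full parser.

```python
import re
from urllib.parse import urlsplit

# Assumed rules, based on the example URLs in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /*?s=
Disallow: /search/
"""


def disallow_patterns(robots_txt: str) -> list[str]:
    """Collect the non-empty Disallow values, ignoring user-agent groups."""
    patterns = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            patterns.append(value.strip())
    return patterns


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate '*' (any characters) and a trailing '$' (end of URL)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the URL's path and query match any Disallow pattern."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match anchors at the start, as robots.txt rules are prefix matches.
    return any(
        pattern_to_regex(p).match(target) for p in disallow_patterns(robots_txt)
    )
```

    Note that a URL blocked this way is no longer fetched, so Googlebot would not see a noindex tag or a 410 response on it.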

  • The topic ‘Millions of Spammy URLs in GSC’ is closed to new replies.