How to Clean/Remove Not Found Errors from Google Webmaster Tools Generated by Translated Versions
Categories: Troubleshooting
I installed a translator plugin on one of my WordPress blogs, but it wasn’t working properly, so I disabled it. Two days later I found that my Google Webmaster Tools account was reporting about 1100 ‘Not Found’ errors under the ‘Web crawl errors’ section. All the errors came from translated versions of my blog. I used the ‘robots.txt’ file to fix this issue.
If you don’t know what a ‘robots.txt’ file is, read the article titled ‘How to control access of the web crawlers or web robots to your site’.
Basically, add rules to your ‘robots.txt’ file that disallow every spider from crawling the translated versions of your pages. Depending on your situation you might need to block more languages: look in Google Webmaster Tools to see which languages are causing the errors, then add them to the Disallow rules. My ‘robots.txt’ file looks like the following:
User-Agent: *
# Language pages
Disallow: /ar/*
Disallow: /bg/*
Disallow: /zh-hant/*
Disallow: /ca/*
Disallow: /cs/*
Disallow: /da/*
Disallow: /de/*
Disallow: /el/*
Disallow: /es/*
Disallow: /fi/*
Disallow: /fr/*
Disallow: /he/*
Disallow: /hi/*
Disallow: /hr/*
Disallow: /id/*
Disallow: /it/*
Disallow: /iw/*
Disallow: /ja/*
Disallow: /ko/*
Disallow: /lt/*
Disallow: /lv/*
Disallow: /mr/*
Disallow: /nl/*
Disallow: /no/*
Disallow: /pl/*
Disallow: /pt-br/*
Disallow: /pt/*
Disallow: /ro/*
Disallow: /ru/*
Disallow: /sk/*
Disallow: /sl/*
Disallow: /sr/*
Disallow: /sv/*
Disallow: /tl/*
Disallow: /tr/*
Disallow: /uk/*
Disallow: /vi/*
Disallow: /zh-CN/*
Allow: /
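If the list of languages is long, the rules could be generated rather than typed by hand. The following is only a rough Python sketch, not part of the original fix. The helper names `language_prefixes` and `build_robots_txt` and the example.com URLs are made up for illustration. It assumes you have a list of the 'Not Found' URLs for the translated pages and that each one starts with a language folder such as /de/.

```python
from urllib.parse import urlparse


def language_prefixes(error_urls):
    """Collect the first path segment of each URL, in first-seen order.

    Assumption: every URL passed in is a translated page whose path
    starts with a language folder, e.g. /de/some-post/.
    """
    seen = []
    for url in error_urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        if segments and segments[0] not in seen:
            seen.append(segments[0])
    return seen


def build_robots_txt(prefixes):
    """Build robots.txt text in the same shape as the file shown above."""
    lines = ["User-Agent: *", "# Language pages"]
    lines += [f"Disallow: /{prefix}/*" for prefix in prefixes]
    lines.append("Allow: /")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical URLs standing in for the reported errors.
    urls = [
        "http://example.com/de/some-post/",
        "http://example.com/fr/another-post/",
        "http://example.com/de/third-post/",
    ]
    print(build_robots_txt(language_prefixes(urls)), end="")
```

Only feed it the URLs of translated pages. It takes the first folder of every URL it sees, so a normal page mixed into the list would produce a rule that blocks real content.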
As far as I know, Google penalizes duplicate content. A translated version of your page may be considered duplicate content, so for SEO it is best to use this method to block crawler access to the translated versions of your pages.
It took about two weeks for all the errors to disappear from my Google Webmaster Tools account, but the number of errors started to go down as soon as I updated my robots.txt file to block the spiders from crawling the translated versions of the site. Hope this helps.
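Before waiting two weeks, it may be worth checking that the rules block what you expect and nothing else. Below is a small Python sketch of such a check. The function names `pattern_to_regex` and `is_allowed` are hypothetical. It assumes the matching behaviour as I understand Google describes it for its crawler: '*' matches any run of characters, the longest matching rule wins, and Allow wins a tie. Other crawlers may behave differently.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumes '*' matches any run of characters and a trailing '$'
    anchors the end of the path; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, pattern) pairs such as
    ("Disallow", "/ar/*"). Assumes the longest matching pattern wins
    and that Allow wins when two matching patterns have equal length.
    """
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


if __name__ == "__main__":
    rules = [("Disallow", "/ar/*"), ("Disallow", "/pt-br/*"), ("Allow", "/")]
    # Hypothetical paths: two translated pages and two normal ones.
    for path in ["/ar/some-post/", "/pt-br/some-post/", "/some-post/", "/archive/"]:
        print(path, "allowed" if is_allowed(path, rules) else "blocked")
```

Note that a normal folder such as /archive/ is not caught by the /ar/* rule, because the rule requires a slash right after the language code.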
Tags: Google, google webmaster tools, Troubleshooting, Web development, web masters, Web Robots, Wordpress








#1 by Ubalin WebBlog on November 3, 2009 - 1:22 pm
Thanks for the tips, it is very useful
#2 by Altis Lo (Beaulife) on December 25, 2009 - 1:15 am
Thank you for sharing this. To me it is great information for improving my blog.
#3 by turisuna on January 10, 2010 - 10:43 am
Hmmm, it sounds a little bit complicated. Sometimes I find some error reports in Google Webmaster Tools, but not too many, so I usually fix them manually and it doesn’t take too much time. But thanks for this information
#4 by Forum Indonesia on January 11, 2010 - 2:14 pm
Thanks a lot for sharing the tip. It is certainly very useful. Most of the time I used to fix it manually, but now I will use this tip
#5 by Raul Gonzelous on January 30, 2010 - 5:20 am
Thanks for the great article, it is really useful. You should also deny access to all inside folders
#6 by girisim on February 5, 2010 - 12:12 pm
I have read all the articles. Very useful information was written. Thanks