How to Clean/Remove Not Found Errors from Google Webmaster Tools Generated by Translated Versions

Categories: Troubleshooting

I installed a translator plugin on one of my WordPress blogs, but it wasn’t working properly, so I disabled it. Two days later I found that my Google Webmaster Tools account was reporting about 1100 ‘Not Found’ errors under the ‘Web crawl errors’ section. All the errors came from translated versions of my blog. I used the ‘robots.txt’ file to fix this issue.

If you don’t know what a ‘robots.txt’ file is, read the article titled ‘How to control access of the web crawlers or web robots to your site’.

Basically, add rules to your ‘robots.txt’ file that disallow any spider from crawling the translated versions of your pages. Depending on your situation, you might need to block more languages than I did: look in Google Webmaster Tools to see which languages are causing the errors, then add a Disallow rule for each of them. My ‘robots.txt’ file looks like the following:

User-Agent: *
# Language pages
Disallow: /ar/*
Disallow: /bg/*
Disallow: /zh-hant/*
Disallow: /ca/*
Disallow: /cs/*
Disallow: /da/*
Disallow: /de/*
Disallow: /el/*
Disallow: /es/*
Disallow: /fi/*
Disallow: /fr/*
Disallow: /he/*
Disallow: /hi/*
Disallow: /hr/*
Disallow: /id/*
Disallow: /it/*
Disallow: /iw/*
Disallow: /ja/*
Disallow: /ko/*
Disallow: /lt/*
Disallow: /lv/*
Disallow: /mr/*
Disallow: /nl/*
Disallow: /no/*
Disallow: /pl/*
Disallow: /pt-br/*
Disallow: /pt/*
Disallow: /ro/*
Disallow: /ru/*
Disallow: /sk/*
Disallow: /sl/*
Disallow: /sr/*
Disallow: /sv/*
Disallow: /tl/*
Disallow: /tr/*
Disallow: /uk/*
Disallow: /vi/*
Disallow: /zh-CN/*
Allow: /
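
If the list of 'Not Found' URLs is long, the language prefixes can be pulled out with a small script instead of by eye. The following is only a sketch under a few assumptions: the function names `language_prefixes` and `disallow_rules` are made up for this example, the sample URLs are placeholders, and the pattern used to recognise a language code (two letters with an optional suffix such as `-br` or `-hant`) is a guess that may also match ordinary two-letter folders on your site, so check the output before pasting it into robots.txt.

```python
import re
from urllib.parse import urlparse

# Assumption: a language folder looks like "de", "pt-br", "zh-CN" or "zh-hant".
LANG_SEGMENT = re.compile(r"^[a-z]{2}(-[A-Za-z]{2,4})?$")


def language_prefixes(urls):
    """Return the sorted, de-duplicated first path segments that look like language codes."""
    found = set()
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        if segments and LANG_SEGMENT.match(segments[0]):
            found.add(segments[0])
    return sorted(found)


def disallow_rules(prefixes):
    """Build robots.txt text in the same layout as the file shown above."""
    lines = ["User-Agent: *", "# Language pages"]
    lines += [f"Disallow: /{prefix}/*" for prefix in prefixes]
    lines.append("Allow: /")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder URLs standing in for the 'Not Found' list from Webmaster Tools.
    not_found = [
        "http://example.com/de/some-post/",
        "http://example.com/pt-br/another-post/",
        "http://example.com/2009/10/an-english-post/",
    ]
    print(disallow_rules(language_prefixes(not_found)))
```

Dated paths such as `/2009/10/...` are skipped because their first segment does not look like a language code.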

As far as I know, Google penalizes duplicate content. A translated version of your page is considered duplicate content, so for SEO benefit it is best to use this method to block access to the translated versions of a web page.

It took about two weeks for all the errors to disappear from my Google Webmaster Tools account, but the number of errors started to go down as soon as I updated my robots.txt file to block spiders from crawling the translated versions of the site. Hope this helps.
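
Since the errors take a while to clear, it can be worth checking beforehand that a rule really covers a given URL. Here is a rough sketch of such a check. It assumes the matching behaviour Google documents for its own crawler (`*` matches any run of characters, a trailing `$` anchors the end, the longest matching pattern wins, and Allow wins a tie); other spiders may behave differently. The helper names `_to_regex` and `is_blocked` are invented for this example.

```python
import re


def _to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape each literal piece and let '*' match any run of characters.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_blocked(path, rules):
    """Return True if `path` is disallowed.

    `rules` is a list of (directive, pattern) pairs, where directive is
    'allow' or 'disallow'. A path matched by no rule is treated as allowed.
    """
    best = None  # (pattern length, is_allow) of the winning rule so far
    for directive, pattern in rules:
        if _to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            if best is None or candidate > best:
                best = candidate
    return best is not None and not best[1]


if __name__ == "__main__":
    rules = [("disallow", "/de/*"), ("disallow", "/pt-br/*"), ("allow", "/")]
    for path in ["/de/some-post/", "/pt-br/another-post/", "/about/"]:
        print(path, "blocked" if is_blocked(path, rules) else "allowed")
```

With the rules shown, the two translated paths come out as blocked while `/about/` stays allowed, because `Allow: /` is the only pattern that matches it.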

Tags: Google, google webmaster tools, Troubleshooting, Web development, web masters, Web Robots, Wordpress

6 Comments

  • #1 by Ubalin WebBlog on November 3, 2009 - 1:22 pm

    Thanks for the tips, it is very useful

  • #2 by Altis Lo (Beaulife) on December 25, 2009 - 1:15 am

    Thank you for your great sharing, to me this is an awesome information to enhance my blog.

  • #3 by turisuna on January 10, 2010 - 10:43 am

    Hmmm it sounds little bit complicated. Sometimes I found some error reports from google webmaster tools but not too much, so I usually fix it manually and doesn’t take too much time. But thanks for this information :)

  • #4 by Forum Indonesia on January 11, 2010 - 2:14 pm

    Thanks a lot for sharing the tip. It is certainly a lot useful. Most of the times I used to fix it manually but now I will use this tip

  • #5 by Raul Gonzelous on January 30, 2010 - 5:20 am

    Thanks for the great article it is really useful you should also deny access to all inside folders

  • #6 by girisim on February 5, 2010 - 12:12 pm

    I have read all the articles. Very useful information was written. Thanks
