Not many people know that a WordPress blog generates duplicate content by default, and search engines dislike duplicate content. If you ignore the problem and your blog keeps producing duplicates, search engines may eventually treat it as a spam blog and drop it from their results. This small oversight can cost you heavily later, so why not tackle the problem today?

According to Google's webmaster quality guidelines, two sites can carry the same content, but the same content should not appear on two pages within one domain. The duplicate content problem comes in several forms: the same page reachable through many URLs, different pages with the same content, comment pages, admin pages, two sites within one site, and so on. Let me explain these one by one and give a solution for each.


1. Same Page with Many Accessible URLs.

By default your main page can be reached through different URLs, such as www.example.com and www.example.com/index.html. Likewise, www.example.com/category/ and www.example.com/category/index.php are two ways to reach the same page, and the same applies to every subdirectory. So if your blog has 100 directories, there are 200 URLs leading to those 100 directories, and search engines treat the extras as duplicate content. This is an interesting situation because it doesn't look like a duplicate content issue, but since the URLs differ, search engines assume they are different pages and index all of them.

a. Solution for Websites.

If you have a simple .html or .php home page and your blog lives in another folder, such as www.example.com/blog/, then you need to add some code to the .htaccess file in the root directory of your website. Take the code given below, replace example.com with your own domain, and upload the file to the root directory, creating a new .htaccess file if one is not already there. Please note that this code should go at the top of the file, above any existing code.

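RewriteEngine On
RewriteBase /

# Send every host other than www.example.com to www.example.com
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# Send /index.htm and /index.html to the bare root URL.
# The THE_REQUEST check limits this to URLs the visitor actually requested,
# which avoids a redirect loop when the server maps / to index.html internally.
RewriteCond %{THE_REQUEST} \s/index\.html?[\s?] [NC]
RewriteRule ^index\.html?$ / [NC,R=301,L]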
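
To make the intent of the two rules easier to follow, here is a minimal Python sketch of the same decisions. It is only an illustration, not something to install on your server. The function name redirect_target and the PREFERRED_HOST constant are made up for this example, and www.example.com is the same placeholder used above.

```python
import re

# Assumption: the same placeholder host used in the .htaccess rules.
PREFERRED_HOST = "www.example.com"


def redirect_target(host, path):
    """Return the URL a request would be redirected to, or None if it is served as-is."""
    # Rule 1: any host other than the preferred one goes to the preferred host,
    # keeping the requested path unchanged.
    if host.lower() != PREFERRED_HOST:
        return "http://" + PREFERRED_HOST + path
    # Rule 2: /index.htm or /index.html on the preferred host goes to the bare root.
    if re.fullmatch(r"/index\.html?", path, flags=re.IGNORECASE):
        return "http://" + PREFERRED_HOST + "/"
    # Everything else is already on its single, preferred URL.
    return None
```

For example, redirect_target("example.com", "/about/") gives http://www.example.com/about/, while redirect_target("www.example.com", "/about/") gives None because that URL is already the preferred one.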

b. Solution for WordPress blogs.

For WordPress blogs the solution is simple and fast. To deal with the attachment URL issue you can use the WordPress Robots Meta plugin. What this plugin does is give every page a canonical URL, so that each attachment URL points to its parent post URL, which is good for your readers as well as for search engines. Simply install this plugin[link] and select the option shown in the picture below.

[Image: WordPress canonical problem]

2. Different Pages with same Content.

This issue is very common in WordPress blogs and causes problems in the long term if ignored. For example, once you have written more than 10 articles, your blog automatically creates another page to accommodate further articles, and this goes on as you keep writing. Other examples are tag URLs, category URLs, author pages, archives, etc. The result is many pages with duplicate content. By default all of these pages are indexed by search engines, and that creates the problem.

The solution is to add a robots meta tag with the noindex value to the header of each of these pages. Whenever a search engine bot crawls your blog, this tag stops it from adding the page to the search engine's index. Again, you can use the WordPress Robots Meta plugin to noindex all these pages. Simply check the options shown in the image below and you are done.

[Image: add noindex attribute]
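
If you are curious what the tag itself looks like, the Python sketch below shows the idea: archive-style pages get a robots meta tag with noindex, ordinary posts get nothing. This is only a rough illustration and not the plugin's actual code. The function name robots_meta and the list of path prefixes are assumptions based on common WordPress permalinks, so adjust them to your own setup.

```python
# Assumption: typical WordPress archive-style URL prefixes; yours may differ.
ARCHIVE_PREFIXES = ("/category/", "/tag/", "/author/", "/page/")


def robots_meta(path):
    """Return the robots meta tag for a page, or an empty string for normal pages."""
    if path.startswith(ARCHIVE_PREFIXES):
        # noindex keeps the page out of the index; follow still lets the bot
        # reach the individual posts that the archive page links to.
        return '<meta name="robots" content="noindex,follow">'
    # Regular posts and pages carry no tag, so they are indexed as usual.
    return ""
```

A category page such as /category/seo/ would carry the noindex tag, while a post such as /my-first-post/ would not.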

3. Configure Robots.txt.

Whenever a search engine bot crawls a blog or website, it looks for the robots.txt file first and, if it finds one, follows the rules defined in it. The robots.txt file is so powerful that it can keep your entire website out of search engines. We use it to block only those pages and directories that are useless or contain duplicate content. For example, when someone comments on your blog, WordPress creates another URL for the page, commonly known as a replytocom page.

Suppose one page of your WordPress blog has 100 comments: it will generate 101 pages, all with the same content. So we have to tell search engines not to index these pages because they contain duplicate content. The same goes for directories that are useless to search engines, although we still allow directories like the images folder, which holds our website's images. Simply copy the code given below and paste it into your blog's robots.txt file. If you don't have one, create it: open Notepad, paste the code in, save the file as robots.txt (all lowercase) and upload it to the root directory.

Sitemap: http://www.internet-khazana.com/blog/sitemap.xml

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
Disallow: /archives/
Disallow: /go/
Disallow: /*?*
Disallow: /category
Disallow: /*?replytocom
Disallow: /wp-*
Disallow: /author
Disallow: /comments/feed/
Disallow: /tags

User-agent: Mediapartners-Google*
Allow: /

User-agent: Googlebot-Image
Allow: /blog/wp-content/uploads/

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Mobile
Allow: /
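
Before uploading the file, you may want to check a few URLs against your rules. The sketch below uses Python's standard urllib.robotparser module for that. Note the assumptions: the standard parser only does simple prefix matching, so the wildcard lines (the ones containing *) are left out on purpose, the RULES text is a shortened subset of the file above, and is_allowed is a helper name made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a shortened, prefix-only subset of the rules shown above.
RULES = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /author
Disallow: /comments/feed/
"""


def is_allowed(url, agent="*"):
    """Return True if the given user agent may crawl the URL under RULES."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(RULES.splitlines())
    return parser.can_fetch(agent, url)
```

With these rules, is_allowed("http://www.example.com/my-first-post/") is True, while the same call for http://www.example.com/wp-admin/options.php is False.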

4. Two sites within one Site.

If your blog is new, then all of its pages can be reached through two different URLs. For example, www.example.com and example.com open the same page, and the same is true of all your article and page URLs. This is very unprofessional and search engines dislike it. First decide which form suits your blog better, then apply it so that your blog is reachable in only one way (one URL).

The solution is very simple: open your WordPress blog's admin page and sign in. Click the Settings tab and then the General link, and a new page will open. Enter the full address of your blog, either www.example.com or example.com, and click the save button. Your blog's home page will now be reachable through only one URL, and anyone who types the other form will be redirected to the address you entered in the settings.

[Image: correct double site issue]

5. Customize Images Settings.

You may have seen many blogs that show lots of thumbnail images on one page. To see a picture in full size you have to click on it, and after clicking the thumbnail you are taken to another page with the full-size image on it. The ugly thing about this is that the full-size image page contains only the image and no other content. This may not seem like a problem, but it looks unprofessional. So whenever you upload an image to an article, either display the full-size image or, if you want to link to it, link only to the image file itself rather than creating another post that contains the image. Always select the options shown below.

[Image: which image URL to choose]