Avoid duplicate content penalties and improve SEO

  • Knowledge needed: Intermediate understanding of HTML and Google Webmaster Tools
  • Requires: Text editor, browser
  • Project time: Dependent on complexity of site but usually no longer than an hour

Sebastian Cowie covers some of the fundamentals surrounding duplicate content and shares tips and tricks to make sure your site doesn’t suffer from these issues.

Unless you’ve been living under a rock for the past year and a half, you’ll probably have heard of the devastation that Google Panda has left in its wake. Affiliate marketers and SEO agencies participating in less than kosher tactics have seen their organic visibility plummet, resulting in lost revenue, lost custom and, in extreme cases, the loss of their livelihood.

What is Google Panda?

Google Panda (originally known as Google Farmer) was unleashed upon the world back in February 2011. The prime objective of this update was to address the ever-increasing level of spam within Google’s index and improve the user experience. We’ve recently been subjected to another data refresh of the Panda update, on 20 August, making this latest release Panda 3.9.1.

As a result of this update, there were two major changes to the fundamentals of SEO and website design:

Off page

Removal of low-quality link building strategies from article and content farm sites.

  • Article sites, including Ezine Articles, GoArticles and the like, took a substantial hit.
  • How-to sites, low-quality wikis, and unmoderated blogs and forums were also hit particularly hard.

Consequently, if the majority of your link profile originated from these sites, you would have seen a decrease in organic visibility. Your site may not have been directly hit by Panda, but because of the drop in authority of these sites you may have been caught in the fallout.

On page

Sites must have unique content that is informative and engaging for users (read: quality content).

  • Crackdown on sites with duplicate content, content-thin sites, and MFA (made for AdSense) sites. Affiliate sites and ecommerce platforms were hit particularly hard.
  • Crackdown on sites with an abnormally high ad-to-content ratio.
  • Restrictions on automatically generated pages (auto-blogs, aggregators and, again, ecommerce platforms).

If you were hit by Panda, as a considerably large proportion of webmasters were (11.8 per cent of queries in the United States were affected by Panda 1.0), then read on and discover how you can improve your on-site SEO.

Developer tips and tricks

Design and development agencies and SEO agencies were often seen as two entirely separate entities, with neither holding much regard for the other because of the changes in process that each party required. However, as SEO and optimised platforms are quickly becoming the norm, most agencies now factor in some, if not all, elements of the design, development and optimisation process when creating a new build.

Owing to the uproar surrounding the Panda update, Google published a guide on “building high quality sites”. It is aimed primarily at content quality, but it also raises a number of points that designers and developers need to take into consideration:

Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?

We’ve all known for some time that duplicate content is the enemy, but further scrutiny is now in place for sites that duplicate elements of content across multiple pages without providing additional information or enhancing the user experience.

Ecommerce example and tips

Ecommerce platforms are particularly vulnerable to Panda because, more often than not, the platform you’re using will have multiple paths or query strings that lead to the same product and, in some instances, to the same product with a minor modification.

Let’s pretend you’re running an online shoe shop. Your shop has multiple variables for each shoe, including colour, size and even shoe lace type. Your URL structure and the number of possible duplicate pages depend on how your ecommerce platform handles these variables.

Most modern ecommerce platforms have the capacity to handle the above scenario correctly, although you may have to purchase or install 'SEO' plug-ins. In my experience, sites on dated platforms, or on platforms that have been poorly configured and installed, may find the same product indexed under a number of URLs:

  http://www.mywebsiteaboutshoes.com/thebestshoesever/3UK/blue/
  http://www.mywebsiteaboutshoes.com/thebestshoesever/blue/3UK/

In certain instances, if you were to start your search from shoe size and colour and then add laces, you may find the following or a similar URL structure:

  http://www.mywebsiteaboutshoes.com/3UK/blue/thebestshoesever/2inchlaces/

How is this relevant to Panda?

While Google is getting better at differentiating between query strings and parameters that offer the user something new and those that merely track actions, it's still not perfect, so any of the URLs above may be indexed as a separate page with duplicate content. Help is at hand, however.

There is a range of potential ways to resolve duplication within your site and help Google along its merry way.

Implementation of canonical tags

The canonical tag sits in the head of a page and names the preferred URL for that content. While it is only a hint and not a directive, most major search engines will attempt to make use of the data within this tag.
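As a rough illustration of the idea, the sketch below maps the three shoe-shop URL variants from the ecommerce example onto a single preferred URL and prints the canonical tag each variant page would carry in its head. It is written in Python purely for illustration. The KNOWN_PRODUCTS set, the function names and the decision to treat the bare product URL as the canonical one are all assumptions, not the behaviour of any particular ecommerce platform.

```python
from html import escape
from urllib.parse import urlsplit

# Assumption: the shop knows its product slugs; every other path segment
# (size, colour, lace type) is treated as a variant option.
KNOWN_PRODUCTS = {"thebestshoesever"}


def canonical_url(url):
    """Return the preferred URL for a product page, or None if no product is found."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    for segment in segments:
        if segment in KNOWN_PRODUCTS:
            return f"{parts.scheme}://{parts.netloc}/{segment}/"
    return None


def canonical_tag(url):
    """Build the <link rel="canonical"> element for the page served at `url`."""
    target = canonical_url(url) or url  # fall back to the page's own URL
    return f'<link rel="canonical" href="{escape(target, quote=True)}">'


variants = [
    "http://www.mywebsiteaboutshoes.com/thebestshoesever/3UK/blue/",
    "http://www.mywebsiteaboutshoes.com/thebestshoesever/blue/3UK/",
    "http://www.mywebsiteaboutshoes.com/3UK/blue/thebestshoesever/2inchlaces/",
]
for variant in variants:
    print(canonical_tag(variant))
# Each of the three lines printed is:
# <link rel="canonical" href="http://www.mywebsiteaboutshoes.com/thebestshoesever/">
```

Whether the variants should really collapse onto one URL depends on whether they offer the user anything distinct; if a colour page has its own unique content, it may deserve to be its own canonical.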

“Noindex, Follow” meta implementation or Robots.txt

To help ensure a healthy flow of PageRank (PR) through the site, use the “NOINDEX, FOLLOW” meta tag in preference to robots.txt wherever possible.

  <meta name="robots" content="noindex,follow">

This may not be possible in all instances, in which case an instruction within robots.txt is a suitable alternative.
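A minimal sketch of the robots.txt route follows, assuming you want to keep crawlers away from the size-first URLs in the shoe-shop example. The Disallow rule is hypothetical and chosen only to match those example URLs. Python's standard urllib.robotparser module is used here simply to check which URLs the rule would block; note that it matches plain path prefixes, which is all this rule needs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep crawlers out of URLs that start with a size
# segment, leaving the product-first URLs crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /3UK/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "http://www.mywebsiteaboutshoes.com"
urls = [
    BASE + "/thebestshoesever/3UK/blue/",
    BASE + "/3UK/blue/thebestshoesever/2inchlaces/",
]
# Map each URL to True (crawlable) or False (blocked by the rule above).
crawlable = {url: parser.can_fetch("Googlebot", url) for url in urls}
for url, allowed in crawlable.items():
    print("allowed" if allowed else "blocked", url)
```

Bear in mind that a crawler which is blocked from a page never fetches it, so it cannot see a “NOINDEX, FOLLOW” tag on that page or follow its links. That is one reason to treat robots.txt as the fallback rather than the first choice.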

Specifying URL parameters within GWT

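A quick way to deal with duplicate content is to use the URL parameters feature within Google Webmaster Tools (GWT). Selecting 'No' tells Google that the parameter does not change the page content, so URLs that differ only by that parameter are duplicates. Selecting 'Yes' tells Google that the parameter does change the content, so it should index the pages under that parameter.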
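To decide which parameters deserve a 'No', it can help to see which of your URLs collapse into the same page once the tracking-only parameters are ignored. The sketch below does that with Python's standard urllib.parse module. The query-string URLs and the parameter names sessionid, utm_source and colour are illustrative assumptions, not taken from a real site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters that only track visitors and never change the page.
# These are the ones you would mark as not changing the content.
TRACKING_PARAMS = {"sessionid", "utm_source"}


def strip_tracking(url):
    """Drop tracking-only parameters and sort the rest so equal pages compare equal."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    )
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


urls = [
    "http://www.mywebsiteaboutshoes.com/thebestshoesever/?colour=blue&sessionid=abc123",
    "http://www.mywebsiteaboutshoes.com/thebestshoesever/?utm_source=newsletter&colour=blue",
    "http://www.mywebsiteaboutshoes.com/thebestshoesever/?colour=red",
]
unique_pages = sorted({strip_tracking(url) for url in urls})
for page in unique_pages:
    print(page)
# http://www.mywebsiteaboutshoes.com/thebestshoesever/?colour=blue
# http://www.mywebsiteaboutshoes.com/thebestshoesever/?colour=red
```

Here three crawled URLs turn out to be only two distinct pages: the colour parameter changes what the user sees, while the other two do not.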

GWT is a developer’s best friend.

Looking at the duplicate title / content feature:

Design and content tips to increase trustworthiness

Grammar and spelling

Although not a direct design issue, poor spelling and grammar could account for millions of pounds of lost sales around the globe, according to this BBC News article. Ask yourself, would you purchase something from a site with poor spelling and grammar? Or would you associate it with a possible scam or offshore site, thereby reducing the trust factor of the site substantially?

Google Panda misconceptions

'Create good quality content and you’ll rank well in SERPs'

SEO evangelists are often ringing the ‘build good quality content’ bell and the theory behind generating high quality content that engages users is sound. However, the philosophical question 'If a tree falls in a forest and no one is around to hear it, does it make a sound?' rings true in this instance.

If you neglect to promote your content through social or traditional (read: link building) mediums and have no existing user base, then creating good quality content simply isn’t enough.

'Panda looks at the amount of content above the fold'

It’s all too easy to bundle the Panda update with the 'Page layout algorithm' update because they focus on content and user-experience; however, they were separate updates.

Conclusion

Have you been affected by either duplicate content or the latest rollout of Panda? Were you hit by a previous release and are still struggling to recover, or have you perhaps seen a recovery over the past few days? If so, I’d love to hear what you’ve tried so far to revive your site from any penalty it may have been suffering from.

11 comments

Comment: 1

I've always thought it a little harsh to penalise e-commerce businesses because the platform they're using is ill-suited to the job when it comes to SEO.

The great thing is that developers can't just produce software without considering the future of the businesses who will use their platform - at least not without losing customers.

Great round-up - may use this as a reference for clients who sometimes struggle to understand why their system may not be ideally suited for SEO.

Comment: 2

Great article. Just producing good quality content without any promotion won't work either. You have to get good links and give informative and unique content to readers as well. My website has not been directly affected by the latest Panda update, but with the changing scenario, maintaining website ranking is getting tougher. Article directories will suffer just as directory submission did. Quality links + genuine readers + good links is the way to keep Panda happy.

Comment: 3

A friend who specialises in SEO displays content that he knows to be duplicate (e.g. product descriptions from the manufacturer's website) using JavaScript, so Google doesn't pick it up. If you browse his site with JS turned off, you get told to turn it ON in order to see the description.

The reviews (original content) are always there.

He seems to think this is acceptable, as it doesn't target Google specifically. Is this technique acceptable? And should we all be using it? It sounds like a good idea...

Alix

Comment: 4

This article about SEO helped me.
Thank you!

Comment: 5

Plagiarism checker tool is helpful in preventing the unethical practices of content theft and copying the coursework of others.

Comment: 6

SEO is getting more difficult day by day because of Google's latest updates. It's a really good, educational article and I've learned some good lessons about SEO improvement from reading this post. Thanks for providing quality information; I'll keep checking your blog for more lessons about duplicate content penalties and improved SEO.

Comment: 7

As an affiliate marketer I must say the latest Google update has most definitely made it harder to rank a site... and even harder to keep it there!

Comment: 8

Can anybody tell me: if I move a site from an old address to a new address, and have informed Google of it in Webmaster Tools, does it still count as duplicate content?

Comment: 10

Nice organisation of the most important aspects needed in making sure a site complies with Google's latest policies. We found that a site of ours, OneDirection.net, was hit last year, and we spent approx 6 months attempting to clean the site's structure and content. We had never received any link penalties or anything, but found that our forum and WordPress pages contained a lot of duplicate content.

Addressing the issues both through better WordPress setup and also via robots.txt seemed to help us.

Later last year we also launched a custom mobile version of the site which used a different WordPress theme to make it easier to manage. Same content levels, but slightly reduced via an uncluttered interface for the mobile version. However, if you're thinking of following the same mobile approach, make sure you check that you don't essentially have duplicate mobile content. Use canonicals to avoid getting penalised for having two versions of each page (ours is differentiated via query strings, so at first we were worried about duplicate pages).

There's a big technical gap in most websites these days. Yes, content is always key and always should be, but don't overlook the technical side of any website. Page speed also comes into its own here, so make sure your server is optimally configured as well as fixing the structure issues mentioned above.

Chris.

Comment: 11

I think Google Panda is an algorithm that changes search engine ranking results, and Google Penguin is an algorithm that penalises sites for low-quality content.