The Truth About Duplicate Content and Its Impact on SEO

3 Mar

(Last Updated On: February 21, 2020)

It’s not unusual for content to appear in multiple places on a website, on other websites, and on social media. A well-crafted blog post, company story, or product description is sometimes so hard to create that when you do come up with one, it’s tempting to use it everywhere.

Here’s some good news: duplicate content is OK. It won’t negatively impact your SEO.

But there’s a right approach to content duplication and one that could result in less value for your site visitors.

In this article, we’ll explain the impact that duplicate content can have on a website. We will start with a practical definition of duplicate content, go over some common misconceptions about the topic, delve into internal and external content duplication, and close with some advice on how to handle duplicate content.

What Is Duplicate Content?

Duplicate content consists of two or more instances of the same content in multiple places on the internet.

It can exist because a site owner has created it on purpose, it can be the result of plagiarism, or it can emerge as a side effect of website mismanagement.

When you intentionally reuse text in multiple places on your site and social media, you’re creating duplicate content. In the case of plagiarism, a site owner often finds that a competing website has published an exact copy of their content.

Those are both examples of external duplicate content, meaning copies of content appear outside the site where the original, or canonical, version is published.

Internal content duplication involves copies of content that appear on your website. In some cases, it may be intentional—a site owner can reuse a carefully written value proposition, word-for-word, in multiple places on the site.

In other cases, internal content duplication is something that happens as a result of using boilerplate text. Take an online store, for example. To seed a hundred product pages with text, a template or an automated process will likely place boilerplate copy on every page.

The problem comes when that content is not sufficiently edited on each page, resulting in a large number of pages that—aside from a unique product blurb—contain exactly the same text.

Image credit: Keystonecopy

This sort of redundancy can extend to a site’s metadata tags and URLs, which can confuse search engines and cause the wrong pages to be returned in search results.

We’ll go a little further into internal content duplication later, but first, let’s look at some common misunderstandings about content duplication in general.

Misconceptions About Duplicated Content

Duplicate content has long been a confusing topic for website owners. This is partly because duplicate content can take many forms and only some of them are intentional.

In this section, we’ll attempt to dispel the common myth that duplicate content has a direct, negative impact on SEO. We’ll also point out some aspects of content duplication that have a very real downside.

Does Google Penalize Sites with Duplicate Content?

Some site owners worry that duplicate content is in violation of Google’s guidelines and that, if their content is duplicated, their website will be penalized.

In reality, there’s nothing to fear.

There are many ways that duplicate content can be used with positive intent, such as in eCommerce product listings, canned postings for discussion forums, or a printer-friendly version of a web page. Neither that kind of content duplication nor duplication due to structural problems and overuse of boilerplate text is seen as a negative in terms of SEO.

In extreme cases, Google will respond to what it perceives as malicious content duplication that was done to manipulate rankings and deceive users. The search engine will lower the ranking of the sites involved.

When it comes to external content duplication, Google cares about which site published the content first. The search engine crawls active sites frequently, so it has usually seen your original before a copy appears somewhere else, and your SEO will remain intact. Google will rank the site with the canonical version higher in search results than a site with a copy of the content.

Internal content duplication also carries no penalty, but it can significantly hamper your ability to control which of your pages gets linked to in search results, as we’ll discuss in the next section.

Image credit: E-Trend Talk

What Impact Does Duplicate Content Have on Search Rankings?

With malicious content duplication, as in cases of plagiarism, there is no actual SEO penalty, but in many cases, a site that unethically publishes your content will violate more significant search engine guidelines that aren’t related to content duplication.

Imagine a pop-up Amazon affiliate site that shamelessly duplicated content from your site and several others in your niche. What pushes a site like that to the bottom of search rankings is the fact that it has zero authority and offers no value to users. One way to look at it is: duplicate content, while bad, doesn’t even make the list of reasons why Google penalizes valueless websites.

Oddly, it’s internal duplicate content that can have the greatest impact on how your site appears in search results. It won’t affect ranking, but it can make search engines link to the wrong page for a given keyword search.

When Google crawls your site, finding the same text on many pages can confuse it. Repetitive metadata tags and boilerplate text on every page, plus a redundant URL structure spanning many categories, can all contribute to Google picking the wrong page to return in search results.

When it comes down to it, sending visitors to the wrong page of your site is probably the biggest downside to having mismanaged duplicate content. We’ll elaborate on that in the next section.

What Harm Can Duplicate Content Cause?

External duplicate content, if created intentionally, can’t cause any harm, but you should identify which version of your content is the original, as that will be the version that gets indexed.

Internal content duplication can cause Google to link to the wrong page on your site. 

The search engine uses complex algorithms to determine which pages to index and which pages will be returned for a given keyword. If a site contains many pages with the same metadata and boilerplate text, or if there is a complex, repetitive website structure, the analysis that goes into that determination can be thrown off.

When Google encounters pages with duplicate content, it will pick one to index, but it might not be the one you want to be indexed.
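One way to spot the repeated metadata described above is to group your pages by their title and description and look for groups with more than one URL. The Python sketch below is only an illustration: the function name, the shape of the `pages` dictionary, and the sample URLs are all assumptions made for this example, and it says nothing about how Google itself compares pages.

```python
from collections import defaultdict


def find_duplicate_metadata(pages):
    """Group page URLs that share the same (title, description) pair.

    `pages` is a hypothetical mapping of URL -> {"title": ..., "description": ...}.
    Only groups containing more than one URL are returned.
    """
    groups = defaultdict(list)
    for url, meta in pages.items():
        # Normalize case and whitespace so trivial differences do not hide duplicates.
        key = (
            " ".join(meta.get("title", "").lower().split()),
            " ".join(meta.get("description", "").lower().split()),
        )
        groups[key].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Made-up sample data: two product pages share the same metadata.
pages = {
    "/widgets/red": {"title": "Widgets | Example Store", "description": "Shop our widgets."},
    "/widgets/blue": {"title": "Widgets | Example Store", "description": "Shop our  widgets."},
    "/about": {"title": "About Us", "description": "Our company story."},
}

print(find_duplicate_metadata(pages))
```

Any group this returns is a candidate for a unique title and description, or for one of the fixes covered in the best practices section.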

Image credit: indiamart

Internal and External Duplicate Content Issues

Let’s dig a little bit deeper into some of the problems you may encounter with content duplication.

Internal Content Duplication

Some content duplication can be tied to website management factors like the existence of multiple versions of a site, the hierarchical site organization needed to present a large amount of information, and unnecessary duplication of boilerplate text.

Regardless of why it exists, you should try to eliminate internally duplicated content wherever you can.

Let’s say your shipping policy or warranty statement appears on every product page. Those blocks of text should be replaced with links to appropriate detail pages. There are simple solutions like that for most internal content duplication issues. We’ll go over those in the last section when we get into best practices.

There’s another kind of internal content duplication that has nothing to do with the challenges of managing numerous pages. It’s duplication that comes from repurposing your messages in too many places on your site. 

When you come up with a precisely worded version of your value proposition, there are a lot of places that text should go, but it should not be plastered all over your site. 

Repetitive content is not pleasant for the user. It adds no value. Even though you won’t be penalized by search engines for having it on your site, it will detract from the user’s experience. That’s reason enough to avoid reusing your content in the wrong way.

External Content Duplication

When multiple versions of your content appear around the web, it can be because you wanted it that way, or because someone has stolen your content. In the case of the former, there are some guidelines you should keep in mind. If the latter occurs, you may have to take legal action to correct the problem.

Let’s look closer at these two scenarios.

Intentional Content Duplication

Your website is like the mothership for your content, but to create awareness among your audience, that content has to go out into the world. Guest blog posts, Medium, your social media accounts—these are all high-visibility channels for your content and taking advantage of every channel is smart marketing.

The opportunity to disseminate your value proposition to a wide group of people is highly valuable, deserving your most finely tuned messaging, even if that text already exists on your site.

If you can make each form of outreach somewhat unique, your reused messages will not seem redundant, and after following a link to your site, the user will find a reassuring uniformity in your communication. 

Plagiarism and How to Respond

Finding out that your content has been stolen is no fun. You’re happily doing keyword research using Google Search Console and up comes an unfamiliar link to your latest blog post. You click in and sure enough, it’s a word-for-word copy, with no attribution.

There are two ways you can react to plagiarism: sue, or let it go.

The severity of the infraction should dictate which you choose.

If your entire site is cloned, for example, you will want to take legal action. Likewise, if a close competitor has published barely edited versions of your content, you will want to put a stop to that.

However, if a fly-by-night website has grabbed a portion of your content, you may simply choose to ignore it. That site will rank lower than yours, probably for multiple reasons, but the main one is that yours is the canonical version of the content.

Duplicate Content Best Practices

Managing issues that arise from internal content duplication can be accomplished with solutions like not indexing certain pages, using 301 redirects, and applying canonical URL tags.

In many cases, however, you may realize that removing the duplicate content is a better solution. For example, a page that sits behind a 301 redirect becomes invisible to search engines, so you should ask yourself whether the page actually has a useful purpose.

The canonical tag is a better way to hold onto valuable duplicate content but not allow it to interfere with the accuracy of search results. When you identify the original version of the content, a search engine will always know which one to index.
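In practice, the canonical tag is a `<link rel="canonical">` element placed in the `<head>` of each duplicate page, with its `href` pointing at the original. As a rough sketch, a site template could generate it with a small helper like the one below. The function name and the example.com URL are placeholders for illustration, not part of any particular CMS.

```python
from html import escape


def canonical_link_tag(canonical_url):
    """Return a <link> element that marks `canonical_url` as the original version."""
    # quote=True also escapes double quotes, keeping the URL safe inside the attribute.
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url, quote=True))


# Hypothetical URL of the original blog post.
print(canonical_link_tag("https://www.example.com/blog/original-post"))
```

Every duplicate of the page, including a printer-friendly version, would carry the same tag pointing at the one URL you want indexed.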

The best way to keep search engines from getting confused over duplicated content is to eliminate it by writing unique texts for every web page. You may be tempted to create spun versions of content where only slight modifications are made, but that won’t cut it. Google will view spun content as duplicate content. 
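To see why lightly spun text still looks like a copy, you can compare two passages by the short word sequences they share. The sketch below uses three-word "shingles" and Jaccard similarity, a common textbook technique for near-duplicate detection. It is an assumption chosen for illustration and not a description of Google’s actual method; the function names and sample sentences are made up.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping `size`-word sequences in `text`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a, b, size=3):
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


original = "Our handmade widgets ship free anywhere in Canada within two business days"
spun = "Our handmade widgets ship free anywhere in Canada within three business days"
unrelated = "Read the story of how our founders started the company in a garage"

print(jaccard_similarity(original, spun))
print(jaccard_similarity(original, unrelated))
```

Changing a single word leaves most of the shingles intact, so the spun version scores far closer to the original than genuinely different copy does.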

Best practice recommendations for external content duplication are straightforward: if you’re not intentionally spreading duplicate content around the internet, you should be.

Using multiple marketing platforms is a proven way to build awareness, and there’s no reason to create customized content for every channel. Nevertheless, there are some pitfalls in this approach.

Image credit: Ryte Wiki

For example, having one of your blog posts published as-is on another website will likely drive traffic to your site, but there’s also a chance that the guest-post version of the content will begin to rank higher than the version on your site. That’s an unintended consequence of external content duplication, but it can be resolved by employing the canonical tag to identify the original version. 

When it comes to duplicating content on social media, go nuts! You can post your best content on all social media outlets, reaching out to more people on more channels, and doing no harm to your SEO.

Search engines index social media content differently from web content, so, even if there were a downside to benign duplicate content, a word-for-word copy of the content on LinkedIn, Twitter, and Facebook would not be viewed as a negative for your website.

Manage Duplicate Content for Your Users, Not for Search Engines

Duplicate content can be a confusing issue, but, in a nutshell, it can be summed up like this:

Internal duplicate content should be minimized; external duplicate content is good, provided it’s intentional.

Oh… and neither will hurt your SEO.

Outside your website, taking advantage of legitimate content duplication is a practical way to spread your information to a wider audience.

Keeping internal duplicate content to a minimum will not only help search engines index your site the way you want, but it will also improve the users’ experience.

While Google penalizes blatant content duplication that’s geared toward gaming the system, no typical form of content duplication conflicts with search engine guidelines.

The real reason you should avoid unnecessary content duplication and take advantage of legitimate external duplication is for the sake of your users. Getting unique content in front of more people is the ultimate goal and maintaining the right approach to duplicate content can help you achieve it.

María Bustillos

María is an enthusiast of cinema, literature and digital communication. As Content Coordinator at HostPapa, she focuses on the publication of content for the blog and social networks, organizing the translations, as well as writing and editing articles for the KB.
