The key to ranking well can be summed up in two words: quality and, above all, originality. As you know, the algorithms of the various search engines, Google in particular, analyze ever more closely the editorial content published on your sites, blogs, or e-commerce platforms. In their sights: duplicate texts which, when detected on your site, risk harming your position in the SERPs. How do you identify them? Here is our advice.
What is duplicate content?
Duplicate content is identical content, a “copy-pasted” text for example, that can be found in several places on the web. To be considered truly duplicated, each copy must be accessible from a different URL (even if the difference is minimal).
Google defines precisely what it considers duplicate content: “Duplicate content usually refers to substantial blocks of content, within a domain or across different domains, which are either completely identical or very largely similar”.
In practice, as the definition of the famous search engine points out, there are two types of duplicate content:
- Internal duplicate content, i.e., present on the same site. It can, for example, be two identical product sheets on an e-commerce site.
- External duplicate content, which is of greater concern. The identical content is accessible on two pages belonging to different domains. It may be plagiarism pure and simple, in an attempt to deceive the engines. It can also be an unintentional error, for example forgetting to mark up a quotation correctly (with the blockquote tag for long quotations), or the result of a poorly managed link-building strategy (particularly through directories).
The publication of duplicate content is not always the fault of the publishers: it can have technical causes. For instance, the poor configuration of certain Content Management Systems (CMS), WordPress in particular, can make an article available under several URLs. Other causes are the double indexing of a site (with or without www, HTTP or HTTPS), the existence of several URLs for the home page, or the preservation of an old URL without a redirect after it has been “rewritten”.
What are the SEO risks associated with duplicate content?
Since Google introduced the Panda algorithm in 2011, an algorithm which aimed to fight content farms, there is a real risk of seeing your SEO penalized if your site accumulates too many duplications.
An impact on the ranking in the SERPs
Search engines were not born yesterday: they know how to identify duplicate content. When an engine spots duplicate content, whether internal or external, it tries to determine where the content originated. The direct consequence: once they have found the “source” page of the original content, the engines will downgrade, or even de-index, the pages which merely copied it. Ranked lower or removed from the index, and therefore invisible on the SERPs, your pages will no longer bring the expected traffic to your site.
In addition to the risk of making your pages invisible, external duplicate content can also expose you to legal consequences: we are talking here about plagiarism, which constitutes copyright infringement.
Why do engines dislike duplicate content?
Why so much hate? For two simple reasons. The first is that indexing duplicate content means extra work for the engines. Their robots already have plenty to do indexing the immensity of the web without having to analyze identical content as well. For the engines, it is simply a waste of time, since they then have to identify the source of this content.
The other reason has to do with the philosophy of the engines. Google, in particular, emphasizes the experience of its users: the content it indexes must provide quality information with high added value to be considered worthy of appearing in the SERPs. This is not the case with duplicate content.
How do you detect duplicate content?
As mentioned above, most duplicate content is not published intentionally; it often results from poor technical settings. To make sure you’re not posting duplicate content, there are a few simple tests to perform.
First, you can use a free tool, Mozart (only compatible with Chrome), to check that your site is not indexed under several addresses. To detect the different URLs through which it is accessible, you just need to type the different URL versions (with or without www, HTTP or HTTPS) in Mozart. If you can access your site through several URLs without being redirected, it is duplicated.
Besides Mozart, other tools focus on the content you have published:
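The check described above (trying each URL version and seeing whether it answers without a redirect) can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `url_variants`, `head` and `unredirected` are made up for this example, and `example.com` is a placeholder domain. It uses the standard `http.client` module, which does not follow redirects on its own.

```python
import http.client
from urllib.parse import urlsplit


def url_variants(host):
    """Return the four usual address versions of a site (hypothetical helper)."""
    bare = host[4:] if host.startswith("www.") else host
    return [
        f"{scheme}://{prefix}{bare}/"
        for scheme in ("http", "https")
        for prefix in ("", "www.")
    ]


def head(url, timeout=10):
    """Send one HEAD request and return (status, Location header).

    http.client does not follow redirects, so a 301 is reported as a 301.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def unredirected(results):
    """Given {url: (status, location)}, list the URLs that answer 200 directly."""
    return [url for url, (status, _location) in results.items() if status == 200]


if __name__ == "__main__":
    # Placeholder domain: replace with your own site before running.
    results = {url: head(url) for url in url_variants("example.com")}
    direct = unredirected(results)
    if len(direct) > 1:
        print("Reachable without redirect under several addresses:", direct)
```

If more than one version answers with status 200, the site is reachable under several addresses, which is the situation the manual test is meant to reveal.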
- Google Search Console can help you identify identical content on your site. If you use this platform, go to the “Search Appearance” menu, then to “HTML Improvements”. You will have access to a report that shows you the duplicate content.
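To compare the texts themselves, a rough similarity score can be computed locally. The sketch below is not how Google or any other engine measures duplication; it is a simple, commonly used approximation (Jaccard similarity over word "shingles"), and the names `shingles`, `similarity` and `near_duplicates`, as well as the 0.8 threshold, are assumptions chosen for illustration.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Split a text into overlapping groups of `size` consecutive words."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets: 0.0 (disjoint) to 1.0 (identical)."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a or not set_b:
        return 0.0  # an empty text is never reported as a duplicate
    return len(set_a & set_b) / len(set_a | set_b)


def near_duplicates(pages, threshold=0.8):
    """Given {url: text}, return the URL pairs whose texts look largely similar.

    The threshold is an arbitrary illustrative value, not an official one.
    """
    return [
        (url_a, url_b)
        for (url_a, text_a), (url_b, text_b) in combinations(pages.items(), 2)
        if similarity(text_a, text_b) >= threshold
    ]
```

Run over the product pages of a site, for instance, `near_duplicates` would list the pairs worth reviewing by hand.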
How do you avoid duplicate content?
The first solution to avoid having to eliminate duplicate content on your site is to “hide” it from the eyes of the engines. At the technical level, some effective solutions can be implemented:
- Configure your CMS correctly so as not to make your content accessible under several addresses. In addition, some plugins, such as Yoast SEO for WordPress, allow you to tell the engines which pages should not be indexed: make sure to list those that offer duplicate content.
- Work on the redirects on your site. In particular, use a “301” (permanent) redirect to send the page that publishes duplicate content to the page holding the original content. By guiding the indexing robots in this way, you make their work easier and, consequently, improve the positioning of this page on the SERPs.
- Choose a canonical URL: if some of your pages are accessible through several URLs, tell the engine where the original content is. Inserted in the source code, the rel=canonical tag tells search engines which single URL to consider among all the pages with identical content. Using this tag is also a good way to counter the negative effects of duplicate content.
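The three measures above (a noindex instruction, a 301 redirect, and a rel=canonical tag) can be illustrated with a short Python sketch. It rests on assumptions that are not in this article: the preferred address is taken to be the HTTPS version without www, `index.html` is treated as an alias of the directory URL, and the function names are invented for the example. Your CMS or web server would normally do this work through its own settings.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

PREFERRED_SCHEME = "https"  # assumption: the HTTPS, non-www address is the official one

# Standard robots meta tag asking engines not to index the page that carries it.
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def preferred_url(url):
    """Map any address version of a page to the single preferred URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    # The fragment is dropped: it is never sent to the server.
    return urlunsplit((PREFERRED_SCHEME, host, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) when the URL is not the preferred one, else None."""
    target = preferred_url(url)
    if target == url:
        return None
    return 301, target


def canonical_tag(url):
    """Build the rel=canonical element to place in the page's <head>."""
    href = escape(preferred_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'
```

For example, `redirect_for("http://www.example.com/index.html")` yields a 301 towards `https://example.com/`, while the preferred address itself yields no redirect at all.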
Another way to avoid duplicate content is simply not to write it, by developing a real editorial strategy based on original content. We know that, especially when you don’t have a team dedicated to communication, it can be tempting to cut corners on editorial work by copying and pasting certain content (product sheets, meta descriptions, seasonal articles, etc.).
In the long term, this lack of editorial investment may have a lasting impact on your SEO. In short, if writing does not come easily to you, or if you don’t have time to focus on your content, why not get support? Rest assured, a web editorial agency can meet all your needs.