If you’re someone who’s struggling to increase your website’s search traffic, you first need to find and fix duplicate content issues on your website.
Duplicate (or copied) content is content that appears in more than one place on the internet. If you find and fix such issues on your site, you give yourself a better chance at higher rankings and a better user experience.
So in this post, let’s talk about what it is, how you can find that problematic content within (and outside) your website and how to easily fix those content issues. Are you curious to find out? Let’s jump into the details.
Table of Contents
- Duplicate Content Issues: How to Find And Fix Them in 2020
- How to Find Similar Content On Your Website?
- How to Fix Duplicate or Similar Content Issues Easily
Duplicate Content Issues: How to Find And Fix Them in 2020
What is duplicate content?
Duplicate content is similar (or identical) content that appears on multiple pages. It can be found within your website (due to technical issues on your site) or outside your website (when others copy your content).
There’s no point in keeping such problematic content on your website as it doesn’t add any value to your website audience or search engine crawlers.
When multiple pages or websites carry almost exactly the same text, it can confuse Google’s crawlers, which pick only one of the duplicates to rank.
Here’s where you can use canonical URLs to prevent problems caused by identical or “duplicate” content appearing on multiple URLs (more on this canonical tag later in the same post).
To put it simply, always keep an eye on duplicate content issues on your website if you want to increase your search rankings and provide a better experience to the readers.
How does identical content occur?
There are two main reasons such content appears on your website.
- Technical causes
- Manually copied content
Let’s briefly talk about the above two reasons for such content so you can understand it better.
1. Technical mishaps: Even if you never copy and paste content from other websites and write only original content for your blog or website, duplicate content issues can still arise.
Yes, that’s true. It’s due to technical mishaps within your website. If you’re wondering about what they are and how they can arise, read on.
So let’s talk about some of the technical issues that can lead to content issues on your website.
- HTTP and HTTPS (make sure all your site pages are loading on https version, this issue happens when you don’t properly install SSL certificates)
- www and non-www (make sure all your site contents are loading either on www or non-www)
- Parameters and faceted navigation (faceted navigation can be useful for users, but it can hurt your website’s SEO by creating many near-identical URLs and wasting crawl budget)
- Session IDs
- Pagination (rel=prev and rel=next tags were the traditional way to mark up these pages, although Google said in 2019 that it no longer uses them as an indexing signal; check out this post from Search Engine Journal to learn more about handling pagination on your website)
- Scrapers (a scraper site is simply a website that copies content from other websites using web scraping, avoid such things at all costs)
- Different language versions (if your website offers content in more than one language, make sure to use hreflang properly)
Avoid the technical mishaps above and you’ll be safe from most of these content issues.
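To make the technical causes above more concrete, here is a minimal Python sketch that collapses the HTTP/HTTPS, www/non-www and session ID variants of a URL into one form, so you can spot which URLs in a crawl export are really the same page. The function names and the parameter names in `IGNORED_PARAMS` are assumptions for illustration; swap in whatever your own site uses.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names: adjust to the ones your site really uses.
IGNORED_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}


def normalize_url(url, prefer_www=True):
    """Collapse common technical variants of a URL into a single form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare = host[4:] if host.startswith("www.") else host
    host = "www." + bare if prefer_www else bare
    # Drop session IDs and tracking parameters, keep the rest in sorted order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    # Always emit the https version and drop any #fragment.
    return urlunsplit(("https", host, path, urlencode(query), ""))


def group_duplicates(urls):
    """Return normalized URL -> original URLs, for groups with more than one variant."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize_url(url), []).append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}
```

Feed `group_duplicates` the list of URLs from your crawler or server logs. Every group it returns is a set of addresses that most likely serve the same content.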
2. Manually copied content: The other big reason is manual copying: either you’re copying and pasting others’ content, or other websites are copying your content and publishing it as their own.
So keep an eye on manually copied content, and don’t use other people’s content, as it doesn’t add any value for your audience. Similarly, whenever you find someone copying your content, contact them (through their website, by email or on social media) and ask them to take it down.
Otherwise, you can file a DMCA complaint (more on how to do this later in this post).
Is it bad for SEO?
Whether you know it or not, there’s no such thing as a “duplicate content penalty.”
Did you know that 29% of pages across the internet had duplicate content issues?
According to a study by Raven Tools, here are some interesting stats about duplicate content.
- 29% of pages had duplicate website content
- 22% of page titles were duplicate
- 20% of pages had low word counts
- 17% of meta descriptions were duplicate
So duplicate content is common, and on its own it doesn’t cause your site to be penalized in Google search results.
Why, you may ask?
The reason is simple: Google tries to determine the original source of the content and displays that version in search results instead of the duplicate or copied ones.
But that doesn’t mean you should copy paste articles from other websites.
Here are a few reasons why you should never use such content, especially from other websites.
- Other blog owners can easily find out who’s copying their content by using tools like Copyscape or by searching for some of their text in Google. Once someone finds that you’re copying their content, they’ll ask you to remove it. If you don’t respond, they can get it taken down with a DMCA notice. So you’re not going to get away with copying other people’s work.
- Copying others’ content doesn’t add any value for your readers. If you’re not adding value for your audience, you’ll struggle to succeed.
- Scraping others’ content is unethical. If you’re serious about making money from blogging, avoid such practices as they can directly hurt your authority online.
- Above all, as already discussed, Google tries to identify the original source of the content, so it tends to rank the original source above websites that copy it.
How to Find Similar Content On Your Website?
So far we’ve talked about what duplicate content is, how it occurs and why you should avoid it. Let’s now talk about the most important thing: how to actually find duplicate content on your website.
Again, you can find such content in 2 ways.
- One is finding identical content within your own website (which happens mostly due to technical mishaps)
- The other is finding duplicate or copied content outside your website
Let’s talk about how you can find such content in both cases.
Finding identical content within your website
Finding similar content within your website should be your primary focus, as it mostly happens due to the technical mishaps discussed above, such as moving to HTTPS while still loading some pages over HTTP, or mixing www and non-www versions.
Apart from those technical issues, here are a few more ways to find duplicate content within your website.
Crawl your website for duplicate titles and meta descriptions
The old version of Google Search Console had an “HTML Improvements” report that helped you easily find duplicate titles and meta descriptions. The new version of Google Search Console no longer has this feature.
But here’s the thing: there’s another incredible tool called Visual SEO that you can use to crawl your entire website to easily find all the issues with your page titles, meta descriptions and H1 tags.
This tool helps you find a ton of things on your website, including:
- Pages with missing title tags
- Duplicate title tags
- Pages with missing meta descriptions
- Duplicate meta descriptions
- Duplicate H1 tags
- Short title tags
- Long title tags
- Short or long meta descriptions and so on
This will show you a general overview of your website and any significant duplicate problems which you can fix easily to avoid such content issues within your site.
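If you’d like to script a rough version of the duplicate title check yourself, here is a minimal Python sketch using only the standard library. It assumes you already have each page’s HTML in memory (for example, from your own crawl). The names `extract_title` and `find_duplicate_titles` are made up for illustration and have nothing to do with how any particular crawler tool works internally.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the page title with runs of whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    return " ".join(parser.title.split())


def find_duplicate_titles(pages):
    """pages maps URL -> HTML source; returns title -> URLs for repeated titles."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        by_title[extract_title(html).lower()].append(url)
    # Skip pages with no title at all; those are a separate problem.
    return {title: urls for title, urls in by_title.items() if title and len(urls) > 1}
```

The same pattern works for meta descriptions and H1 tags: collect the value per page, group by value, and report every group with more than one URL.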
Manual content checking using Google search
The easiest way to find similar content is to do a manual Google search.
First, pick a post or page that you would like to check for plagiarism.
Now, copy a text snippet or paragraph from that page or blog post (one that you think others might copy) and paste it into Google search inside double quotes (“).
Google lists every page that contains that exact snippet. If you see no results other than your own page, no identical content was found for that text snippet.
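If you check many snippets, building the search links by hand gets tedious. Here is a small Python sketch that wraps a snippet in double quotes and turns it into a Google search URL you can open in your browser. The function name is hypothetical, and the sketch only builds the link; it doesn’t fetch or scrape any results.

```python
from urllib.parse import quote_plus


def exact_match_search_url(snippet):
    """Build a Google search URL for an exact-phrase ("quoted") query."""
    # Collapse line breaks and repeated spaces picked up while copying.
    phrase = " ".join(snippet.split())
    return "https://www.google.com/search?q=" + quote_plus(f'"{phrase}"')
```

Shorter snippets (a sentence or two) tend to work better than whole paragraphs, since search engines limit how long a query can be.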
Finding identical content outside your website
In the above section, we’ve talked about how you can find similar content within your website. Let’s now talk about how you can find duplicate content outside your website, which means looking for copies of your website’s content.
Here’s where plagiarism checker tools come in, as you can’t always check for copied content manually in Google.
That being said, here are the top 3 tools to easily find out whether other websites are copying your content.
1. Copyscape
There are a ton of content checkers out there, but Copyscape is one of the best tools for checking duplicate content.
You just need to visit their website and enter your website URL. It will search the web for pages with content similar to yours, and show you how much of the text is copied, with the copied text highlighted.
2. Grammarly plagiarism checker
Grammarly is one of the most popular grammar editing tools and it can also be used as a plagiarism checker (you can do it even with their free version).
Grammarly can detect plagiarism because it checks your text against ProQuest databases and over 16 billion web pages.
Just visit this page and either paste some blocks of text from your website or upload a file to see whether it has duplicates elsewhere on the web.
The good thing about this tool is that it highlights passages that require citations and gives you the resources you need to properly credit your sources.
The third tool is another free plagiarism checker that finds duplicate and scraped content. The best part is that it supports over 190 languages.
All you need to do is copy and paste some content from your website, select your preferred search engine (either Google or Bing) and click “check duplicate or copied content”. The tool then starts searching for copied articles with the same text.
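To get a feel for how two pieces of text can be compared for copying, here is a minimal Python sketch that splits each text into overlapping three-word chunks (“shingles”) and measures how many chunks they share. This is a generic textbook technique (Jaccard similarity) and only an illustration; it is not how Copyscape, Grammarly or any other tool mentioned above actually works.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts are treated as a single chunk.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=3):
    """Jaccard similarity between two texts: 0.0 (nothing shared) to 1.0 (identical)."""
    first, second = shingles(a, size), shingles(b, size)
    if not first or not second:
        return 0.0
    return len(first & second) / len(first | second)
```

A score close to 1.0 suggests one text is a near copy of the other. Where you draw the line is a judgment call, so test it on a few of your own pages first.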
How to Fix Duplicate or Similar Content Issues Easily
So far we’ve covered how you can find identical content both within your website and outside your website.
Let’s now talk about how you can easily fix such content issues.
Removing copied content from Google
The best way to remove duplicated content from Google search is to submit a legal request to Google.
Google offers a tool where you can submit a legal request to remove duplicated (or copyrighted) content from Google search.
The tool lists several Google services (choose the one where your content appears) for which you can submit a removal request. These services include:
- YouTube videos (use this option if someone is using your videos without any credits)
- Image search (using your images without giving you any credit)
- Google My Business
- Web Search (use this option to take down copied or copyrighted content from Google search results)
- Blogger platform and so on
You can also specifically use the copyright removal form from Google.
Just visit this link where you can file a DMCA (Digital Millennium Copyright Act) notice.
In the form, you provide the exact URL(s) where an example of the copyrighted work can be viewed. Google’s team uses this to verify that the work appears on the pages you are asking them to remove.
You also need to provide the URL(s) of the allegedly infringing material that you are asking them to remove.
That’s it, you’re done. The copied content is usually removed from Google search within around 10 days.
A few simple ways to fix duplicate or similar content issues
Here are a few of the simplest yet most effective ways to fix duplicate or copied content issues on your website in 2020 and beyond.
Use 301 redirects
One of the simplest yet most effective ways to deal with copied content (or even thin pages) on your website is to use 301 redirects.
A 301 redirect simply tells search engines like Google that a particular URL has permanently moved to a new location (new URL). The redirect response includes the URL to which the resource has moved.
There are a ton of plugins available for WordPress. You can use a free and simple plugin like Simple 301 Redirects to redirect duplicate or thin URLs on your site to other relevant, high quality pages on your website. You can also use the Yoast SEO or Rank Math plugins for these redirections.
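If you manage redirects yourself instead of through a plugin, the idea boils down to a lookup table from old URL to new URL. Here is a minimal Python sketch of that table, assuming made-up function names and paths. It follows chains (old to newer to newest) so that each duplicate URL redirects straight to its final destination, and it refuses to continue if two URLs redirect to each other.

```python
def resolve_redirect(path, redirects, max_hops=10):
    """Follow the redirect map from `path` and return the final destination.

    Returns None when `path` has no redirect. Raises ValueError on a loop
    or on a chain longer than `max_hops`.
    """
    if path not in redirects:
        return None
    seen = {path}
    target = redirects[path]
    while target in redirects:
        if target in seen or len(seen) > max_hops:
            raise ValueError(f"redirect loop or chain too long at {target}")
        seen.add(target)
        target = redirects[target]
    return target


def redirect_response(path, redirects):
    """Return (status code, headers) as a web server would send them."""
    target = resolve_redirect(path, redirects)
    if target is None:
        return 200, {}
    # 301 means "moved permanently"; Location carries the new URL.
    return 301, {"Location": target}
```

For example, with `{"/old-post": "/draft-post", "/draft-post": "/new-post"}`, a request for `/old-post` is answered with a single 301 pointing at `/new-post` instead of two hops.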
Use canonical tag
A canonical tag (also referred to as “rel=canonical”) is simply a way of telling search engines like Google that a specific URL on your website represents the master copy of a page. That way, Google will generally rank that page even if it finds other pages with similar content on your website.
If you cannot get rid of all those duplicate URLs, you always have the option of pointing them to a single URL.
You need to add an extra tag in the head area of the duplicate page so that search engines like Google consolidate ranking signals on the main article.
To put it simply, a canonical URL helps prevent duplicate or copied content issues for any particular piece of content on your website.
Setting a canonical tag is extremely easy when you’re using the Yoast SEO plugin for WordPress.
The Yoast SEO plugin helps you easily change the canonical URL of several page types in the plugin settings.
Quick note: Use the canonical field in the Yoast SEO plugin only if you want to change the canonical to something different from the current page’s URL.
Simply go to the settings from the Yoast SEO plugin and enter the canonical URL that your page should point to. You can also leave that field empty to default to the permalink.
Tip: Make sure to check out this detailed tutorial on using the canonical tag from the Yoast website, where you’ll find all the details about its usage.
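In the page source, a canonical tag is a `link` element with `rel="canonical"` and an `href` pointing at the master URL. To double check that a page really carries the canonical you expect, here is a minimal Python sketch that reads the tag from a page’s HTML. It assumes you already have the HTML as a string, and the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, in any letter case.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical
```

Run it over each duplicate page and compare the result with the URL you consider the master copy. A page that returns `None`, or a different URL than expected, is worth a closer look.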
Be consistent with your internal links
We all know how important internal links are. If you want to increase your website’s crawlability, improve link depth, pass link juice to other pages on your site or get better rankings, internal links can really help you.
But here’s the thing. You must be consistent with your internal linking practices to avoid copied content issues.
For example, don’t mix links to http://www.example.com/page/, http://www.example.com/page and http://www.example.com/page/index.htm. Pick one form and use it everywhere.
The old version of Google Search Console let you set a preferred domain (for example, http://www.example.com or http://example.com). Google retired that setting in 2019.
So decide whether you want your website indexed with www or without it, and make your redirects, canonical tags and internal links all point to that version.
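Here is a minimal Python sketch of how you might audit a list of internal links for this kind of inconsistency. It treats a trailing slash and a trailing index file as the same page and reports every page that is linked in more than one form. The list of index file names is an assumption; adjust it to match your server.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Assumed default document names; change these to match your server setup.
INDEX_FILES = ("index.htm", "index.html", "index.php")


def link_key(href):
    """Reduce a link to (host, path) without trailing slash or index file."""
    parts = urlsplit(href)
    path = parts.path
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    return parts.netloc.lower(), path.rstrip("/") or "/"


def inconsistent_links(hrefs):
    """Return page key -> sorted link variants, for pages linked in more than one way."""
    variants = defaultdict(set)
    for href in hrefs:
        variants[link_key(href)].add(href)
    return {key: sorted(found) for key, found in variants.items() if len(found) > 1}
```

With the three example links above as input, the function returns a single entry listing all three variants, which tells you they should be rewritten to one form.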
Use SEMrush site audit
SEMrush is one of the best SEO tools out there and can help you with everything from keyword research to backlink analysis. But the main reason we’re mentioning SEMrush here is that it offers a feature called “site audit”, which helps you find and fix technical and SEO issues on your website. With it, you can:
- Easily optimize your internal and external links
- Add meta tags wherever they are missing (including title tags, meta description, image alt tags)
- Easily find duplicate content pages
- Find and fix hreflang issues and the list goes on
If you’re looking for a free trial of SEMrush, use the link below.
Click this link to grab a free SEMrush Pro trial account (worth $99.95); check the sign-up page for the current trial length.
Quick note: You can also check out this detailed guide on performing site audits with SEMrush.
Use different summaries
As bloggers, we often depend on a wide range of platforms to promote our latest blog posts including;
- Social media sites
- Blog submission sites
- Blog directories etc
The key here is NOT to use the same summary of your blog post across all those platforms. Instead, write a unique entry or summary wherever you promote your blog post to avoid such content issues.
Also make sure to avoid empty pages on your website. For example, don’t publish pages for which you don’t yet have any content. If you do create such pages, make sure to use the noindex tag to prevent such pages from being indexed in Google search results.
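The noindex tag is a `meta` element with `name="robots"` and a `content` value that includes `noindex`. Here is a minimal Python sketch, with made-up names, that checks whether a page’s HTML carries it, assuming you already have the HTML as a string. It only looks at the meta tag, not at HTTP headers or robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated, for example "noindex, follow".
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def has_noindex(html):
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Run it over your placeholder pages to confirm each one is marked, and over your real articles to make sure none of them carries the tag by accident.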
What Is NOT Considered As Duplicate Or Plagiarized Content?
There are some instances where the same copy (exact text) is available in more than one place on the web but is NOT considered duplicate content. Here are a few of them.
Mobile version content
There are a ton of sites that serve a mobile version of their content. Having the same content (articles, pages, products and so on) on both your main website and its mobile version does not count as copied content.
Google is smart enough to tell apart the two versions (desktop and mobile) of the same website, so it doesn’t treat them as plagiarized content. You can safely create a mobile version of your website without any issues. The same applies to AMP pages.
Translated content
Some websites translate their content into multiple languages, and translated content is NOT considered duplicate content (although the meaning is the same).
Why? Let’s look at how Google defines duplicate content: “substantive blocks of content within or across domains that either completely matches other content or are appreciably similar”.
That means translated content is NOT duplicate content, because the text doesn’t match the other version.
FAQs About Dealing With Duplicate Or Identical Content Issues In 2020
Here’s a list of a few important questions you might have about duplicate content issues on your website in 2020 and beyond.
1. Is there a duplicate content penalty?
No, there’s no such thing as a duplicate content penalty.
If you’re curious, here’s what Google has to say about it.
Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from such content issues, and you don’t follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.
We highly recommend that you find and fix such content issues, because search engines like Google don’t know which page to rank if you have duplicate content within your website (due to the technical issues mentioned above in this post).
That’s why it’s so important to find and fix all such content issues within your website if you want to improve your organic rankings.
2. How to check for plagiarised content online?
There are multiple plagiarism checker tools available online that you can use to easily find out whether someone has copied content from your website.
The tools covered earlier in this post are free to use (some also have premium versions with higher limits and faster checking), so use them whenever you suspect someone is copying your work.
3. Do duplicate page titles affect SEO?
Certainly yes. You should avoid creating duplicate page titles at all costs because your page titles (meta titles) matter a lot when it comes to ranking your page in organic search results.
Make sure to do a quick Google search for the title you’re going to use for your blog posts or pages. That way, you can avoid repeating or using the same page titles used by other websites. Use headline generator tools like Portent to easily come up with a ton of headline ideas.
Also make sure to create a unique and original meta description for every blog post and page you publish and index in Google search. Use plugins like Yoast SEO to create unique page titles and meta descriptions instead of letting Google pick random snippets of text from your posts.
4. Can duplicate content rank in Google search?
Gone are the days when a few authority sites could get higher rankings by republishing content from other websites. Now, Google gives the lowest priority to such duplicate content websites.
Let us cite the Google Search Quality Evaluator Guidelines from March 2017:
The lowest rating is appropriate if all or almost all of the MC (main content) on the page is copied with little or no time, effort, expertise, manual curation, or added value for users. Such pages should be rated lowest, even if the page assigns credit for the content to another source.
So as you can see, duplicate content is given the lowest priority in ranking. Focus on creating original, high quality and unique content to get higher rankings.
5. How does Google determine the primary version of duplicate content?
That’s an interesting question.
According to Dan Petrovic, a highly regarded SEO event speaker, “If there are multiple instances of the same document on the web, the highest authority URL becomes the canonical version. The rest are considered duplicates.”
So there you go! You don’t have to worry about whether Google ranks your content as long as you’re not copying others’ content.
A popular myth is that “Google penalizes sites with duplicate or copied content”. That’s not entirely true, but such content can degrade your website’s user experience, and you never know when Google might start penalizing sites with duplicate content problems.
As they say “prevention is better than cure”, so it’s always better to solve those issues and we’ve talked about some of the best practices to find and fix such content issues on your website above.
Try to find and fix those issues on your website as early as possible and make sure to always keep an eye on duplicate or similar content for better search and user experience.
Do you have any more questions? Share your thoughts in the comments.