This guest post is by SEO expert Joe Williams.
I can’t tell you how many times I’ve seen this mistake. You might be making it without realising.
Sometimes it can cost you a couple of days’ traffic, sometimes a couple of weeks.
I even saw it cost one unfortunate business owner two months of free traffic, and several thousand pounds in revenue loss.
I’ve been an SEO consultant for seven years. I’ve SEO’d big brands
and small one-man shops, and this little (but really harmful) mistake
happens to the best of them. Thankfully, it’s very simple to fix.
When It Happens and How to Avoid It
Websites are most vulnerable to this traffic killer just after the release of a new design.
That’s because the web designer will want to please the client by
showing progress and getting feedback on different iterations.
Often, the web designer will create a subdomain for the new website
like: newdesign.example.com. This creates a bit of an SEO problem.
newdesign.example.com may get indexed by search engines, and this
creates duplicate content which isn’t any good for SEO.
So, if the web designer is savvy, they'll block search-engine access to
newdesign.example.com by adding a robots.txt file. It's a two-minute
job, and it prevents search engines from crawling the new subdomain.
It’s a regular plain text file and will look like this:
User-agent: *
Disallow: /
So far, so good.
Google’s web crawler is known as Googlebot, and its job is to discover
and index pages. It’s known as a user-agent. Before it crawls any page
on a site, it checks the site's robots.txt file to learn which areas it
may and may not crawl. It follows those instructions to the letter.
In User-agent: *, the * acts as a wildcard, meaning the rule below it applies to all user-agents (including Googlebot).
In this case, the forward slash in Disallow: / indicates that none of the content on the new subdomain should be crawled or indexed.
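You can watch these two lines do their work with Python's standard-library robots.txt parser (the subdomain name here is just the example used above):

```python
from urllib.robotparser import RobotFileParser

# The robots.txt placed on the development subdomain
rules = RobotFileParser()
rules.parse([
    "User-agent: *",   # applies to every crawler, including Googlebot
    "Disallow: /",     # every path on the site is off-limits
])

# Googlebot is refused everything, from the homepage down
print(rules.can_fetch("Googlebot", "https://newdesign.example.com/"))       # False
print(rules.can_fetch("Googlebot", "https://newdesign.example.com/about"))  # False
```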
Now for the Little Mistake that has Big Consequences
When new designs get signed off, they're often behind schedule. So it's
usually a rush getting the new design live on the main website (e.g.
example.com).
The designer will then copy all the files from the development
subdomain (e.g. newdesign.example.com), and typically this includes the
robots.txt file:
User-agent: *
Disallow: /
If that robots.txt file goes live unchanged on the main site, it's like
a traffic workman holding up a big red stop sign. During this time no
(SEO) traffic will be allowed through. The stop sign only changes to
green and welcomes Google back when the robots.txt returns to normal
and the forward slash is removed, like this:
User-agent: *
Disallow:
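If you'd rather not eyeball the two files, a few lines of Python can compare the broken version with the corrected one (blocks_googlebot is a hypothetical helper written for this post, not part of any library):

```python
from urllib.robotparser import RobotFileParser

def blocks_googlebot(robots_txt: str, url: str) -> bool:
    """Return True if this robots.txt refuses Googlebot access to url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("Googlebot", url)

# The broken file that went live by accident:
bad = "User-agent: *\nDisallow: /"
# The corrected file (empty Disallow means nothing is blocked):
good = "User-agent: *\nDisallow:"

print(blocks_googlebot(bad, "https://example.com/"))   # True  - traffic stopped
print(blocks_googlebot(good, "https://example.com/"))  # False - Googlebot welcome
```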
Really easy to fix but a really easy mistake to make too, wouldn’t you say?
In my experience this happens on 5 to 10% of redesigns – but it usually
isn't a big problem because it's discovered within a day or so.
But for one client of mine, this had been going on for two months,
meaning they missed out on bucket loads of cash. Needless to say, they
were pretty mad with their web designer!
Why don’t you check your site right now? The robots.txt file always lives at the root of your website (e.g. example.com/robots.txt).
The DailyBlogTips robots.txt disallows search engines access to two
folders. That's intentional: it stops search engines from finding and
indexing the technical pages inside those folders.
And always remember, if you have a new website and you’ve used a
designer or developer to help you, check the robots.txt file when the
new design goes live.