
The ideal robots.txt file for WordPress

21st June 2018

A few years ago I wrote what is easily my most popular SEO blog post ever, and it’s just as relevant now as it was then. Here it is, updated, in all its glory.

In 2015, Google started sending out emails to untold millions of website owners via their Search Console, saying “Googlebot cannot access CSS and JS files”. The full text of the error on Search Console is:

Google systems have recently detected an issue with your homepage that affects how well our algorithms render and index your content. Specifically, Googlebot cannot access your JavaScript and/or CSS files because of restrictions in your robots.txt file. These files help Google understand that your website works properly so blocking access to these assets can result in suboptimal rankings.

In plain English, it’s saying Google cares a lot more than it used to about how your site looks. For a long time, the system used to index sites was akin to a text-based browser. Sure, it took screenshots, but the underlying scan was very simple.

Now Googlebot looks at your site much more like a normal person would. This means your site’s appearance is more important than ever, as is making sure your pages are well structured and laid out.

Google’s spiders become more like humans every day.

Fortunately there’s a simple fix. The solution is to update your robots.txt file – that’s the small file that gives instructions to search engines about what to look at, and what not to.

Here’s how mine looks now for my WordPress website:

# Googlebot
User-agent: Googlebot
Allow: *.css*
Allow: *.js*

# Global
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Allow: /wp-includes/js/
Allow: /wp-content/plugins/
Allow: /wp-content/themes/
Allow: /wp-content/cache/
Disallow: /xmlrpc.php

As a general rule, you could copy that to your own WordPress site and it should work fine. There will always be themes and custom setups that require a bit more investigation and finessing, but for 95% of WordPress sites, that will be perfect.

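Here’s a bit more explanation of what it does. The first section tells Google specifically that it can index all CSS and JavaScript (JS) files. The second section tells all other search engine robots that they can access the folders where we expect to find those files in WordPress anyway, while also blocking the bits those robots and spiders never need to see. (A crawler obeys only the most specific group that names it, so Googlebot follows the first section and other robots follow the second.)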
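If you’d like to see how those rules play out for a given URL, here’s a rough Python sketch. It is a simplified model of the matching behaviour Google documents (a crawler uses the most specific group that names it, the longest matching rule wins, and `*` and `$` act as wildcards), not Google’s actual parser. The function names (`parse_groups`, `rules_for`, `pattern_to_regex`, `is_allowed`), the sample paths, and the use of Bingbot as the example crawler are my own assumptions for illustration.

```python
import re

ROBOTS_TXT = """\
#Googlebot
User-agent: Googlebot
Allow: *.css*
Allow: *.js*

# Global
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Allow: /wp-includes/js/
Allow: /wp-content/plugins/
Allow: /wp-content/themes/
Allow: /wp-content/cache/
Disallow: /xmlrpc.php
"""


def parse_groups(text):
    """Return {user_agent_token: [(is_allow, pattern), ...]}."""
    groups, agents, in_rules = {}, [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            if value:  # an empty pattern matches nothing
                for agent in agents:
                    groups[agent].append((field == "allow", value))
    return groups


def rules_for(groups, crawler):
    """Pick the most specific group naming this crawler, else the '*' group."""
    name = crawler.lower()
    matches = [agent for agent in groups if agent != "*" and agent in name]
    if matches:
        return groups[max(matches, key=len)]
    return groups.get("*", [])


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, '$' end anchor)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(groups, crawler, path):
    """Longest matching pattern wins; on a tie, Allow beats Disallow."""
    best_len, allowed = -1, True  # no matching rule means allowed
    for is_allow, pattern in rules_for(groups, crawler):
        if pattern_to_regex(pattern).match(path):
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


groups = parse_groups(ROBOTS_TXT)
for crawler, path in [
    ("Googlebot", "/wp-content/themes/demo/style.css"),
    ("Bingbot", "/wp-includes/js/jquery/jquery.js"),
    ("Bingbot", "/wp-admin/options.php"),
    ("Bingbot", "/xmlrpc.php"),
]:
    verdict = "allowed" if is_allowed(groups, crawler, path) else "blocked"
    print(f"{crawler:10} {path:40} {verdict}")
```

In this sketch, `/wp-includes/js/jquery/jquery.js` comes out as allowed for a generic crawler because the longer `Allow: /wp-includes/js/` rule beats the shorter `Disallow: /wp-includes/`, while `/wp-admin/options.php` and `/xmlrpc.php` come out as blocked. Treat it as a way to reason about the file, and use Search Console for the authoritative answer.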

As a case study, one client of mine who got this message was Big Ideas Machine. I used Google’s Search Console (formerly called Webmaster Tools) to fetch and render their homepage. It showed me a list of files that Google wanted to scan but couldn’t, along with these screenshots, which clearly demonstrate that Google couldn’t index the site fully:

[Screenshot: fetch-and-render result before the robots.txt update]

And then after I updated and resubmitted their robots.txt:

[Screenshot: fetch-and-render result after the robots.txt update]

Et voilà.

Oh yes, and for those who want this information straight from the horse’s mouth, here’s Google’s own post about it: Updating our technical Webmaster Guidelines

Peter
