How to handle website maintenance for SEO correctly • Yoast

Joost de Valk

Joost de Valk is the founder and Chief Product Officer of Yoast. He is an internet entrepreneur who, in addition to founding Yoast, has invested in and advised several startups. His core competencies are open source software development and digital marketing.

Sometimes your website needs some downtime so you can fix things or update plugins. Most of the time this is a relatively short period in which Google is unlikely to crawl your website. However, if the fix takes longer, the chances are much higher that Googlebot will come by and be presented with a broken website. So how do we keep Google from deranking your website?

HTTP status codes and you

For those unfamiliar with HTTP status codes, here is a quick summary of the ones that matter when doing site maintenance:

  • 200 OK. This status code indicates that the server successfully returned a response.
  • 301 Moved Permanently. This tells the browser that this URL is no longer valid and redirects it to the correct page.
  • 302 Found / 307 Temporary Redirect. There’s a bit of history behind these two HTTP status codes, but the takeaway is that you temporarily redirect the browser to another page and that the current URL will eventually be restored to its previous state.
  • 404 Not Found. This status code means that the page you tried to navigate to could not be found.
  • 410 Gone (content deleted). Use this when you’ve purposely deleted your content and there won’t be a replacement. Learn more about how to properly delete pages.
  • 503 Service Unavailable. This is the one you want to return to Google during website maintenance. It tells Google that you are working on this page or that something else went wrong. Google knows that when it receives this status code, it should check the page again later. More on this below.

Please note that Google treats pages that return a 200 status code while showing an error (or very little content) as a “soft 404” in Google Search Console.

Read more: HTTP status codes »

Telling Google you’re busy

If Google encounters a 404 error while crawling your website, it will usually remove that page from search results until the next time it comes back to see if the page is available again. However, if Google repeatedly encounters a 404 error on that particular page, the re-crawling might be postponed, which means it will take more time for the page to show up in search results.

To avoid this potentially prolonged loss of rankings, you need to return a 503 status code whenever you are working on a particular page. The original definition of the 503 status code, in RFC 2616, is:

The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is temporary condition which will be alleviated after some delay. If known, the length of the delay MAY be indicated in a Retry-After header. If no Retry-After is given, the client SHOULD handle the response as it would for a 500 response.

In practice, this means returning a 503 together with a Retry-After header that tells Google how long to wait before coming back. This does not mean Google will crawl again after exactly that delay, but it does tell Google not to come back any sooner.

Adding the header

If you want to implement the header, you have a few options.

Use the WordPress default settings

By default, WordPress already returns a 503 when updating plugins or WordPress core. WordPress lets you override the default maintenance page by adding a maintenance.php file to your wp-content/ directory. Note that you are then responsible for returning the 503 header correctly yourself. Planning database maintenance? You have to take care of that too: add a db-error.php file to your wp-content/ directory and make sure it correctly returns a 503 header as well.
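As an illustration of the custom maintenance page described above, here is a minimal sketch of what a wp-content/maintenance.php could look like. The function name maintenance_headers(), the one-hour delay and the page markup are assumptions made for this example, not WordPress' own template. The sketch also assumes WordPress includes the file and stops afterwards, so it does not call exit itself.

```php
<?php
// Hypothetical wp-content/maintenance.php (a sketch, not WordPress' default template).

/**
 * Builds the header lines for a maintenance response.
 * Hypothetical helper: name and signature are assumptions for this example.
 */
function maintenance_headers( string $server_protocol, int $retry_after_seconds ): array {
    // Echo back HTTP/1.1 when the request used it; fall back to HTTP/1.0 otherwise.
    $protocol = ( $server_protocol === 'HTTP/1.1' ) ? 'HTTP/1.1' : 'HTTP/1.0';

    return array(
        $protocol . ' 503 Service Unavailable',
        'Retry-After: ' . $retry_after_seconds,
        'Content-Type: text/html; charset=utf-8',
    );
}

// Only send anything when served over the web, so the helper can be tested from the CLI.
if ( PHP_SAPI !== 'cli' ) {
    $lines = maintenance_headers( $_SERVER['SERVER_PROTOCOL'] ?? 'HTTP/1.0', 3600 );
    foreach ( $lines as $line ) {
        // The third argument forces the 503 response code on every header call.
        header( $line, true, 503 );
    }
    echo '<!DOCTYPE html><title>Maintenance</title>';
    echo '<h1>Briefly unavailable for scheduled maintenance. Please check back soon.</h1>';
}
```

The same header logic could be reused in a db-error.php; only the message shown to visitors would need to change.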

If you want to add something a little fancier to your WordPress website, check out WP Maintenance Mode. This plugin also adds a lot of extra features besides what we mentioned in the previous section.

If you’re writing your own code and want a solution that is easy to implement, you can add the following snippet to your code base and call it from the code that determines whether you are in maintenance mode:

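    function set_503_header() {
        // Default to HTTP/1.0 unless the request was made over HTTP/1.1.
        $protocol = 'HTTP/1.0';
        if ( ( $_SERVER['SERVER_PROTOCOL'] ?? '' ) === 'HTTP/1.1' ) {
            $protocol = 'HTTP/1.1';
        }

        // Send the 503 status line, then ask clients to retry after 3600 seconds.
        header( $protocol . ' 503 Service Unavailable', true, 503 );
        header( 'Retry-After: 3600' );
    }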

Note that the 3600 in the code snippet specifies the delay time in seconds. That means the example above tells GoogleBot to come back after an hour. It is also possible to add a specific date and time in Retry-After, but you need to be careful with what you add here as adding an incorrect date can lead to unexpected results.
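For the date-and-time variant mentioned above, the value has to be an HTTP-date in GMT. The following is a small sketch of one way to build it in PHP. The helper name retry_after_http_date() and the two-hour window are assumptions for illustration only.

```php
<?php
/**
 * Formats a Unix timestamp as an HTTP-date (always GMT) for use in Retry-After.
 * Hypothetical helper: the name is an assumption for this example.
 */
function retry_after_http_date( int $timestamp ): string {
    // gmdate() formats in UTC regardless of the server's configured timezone.
    return gmdate( 'D, d M Y H:i:s', $timestamp ) . ' GMT';
}

// Assumed example: maintenance is expected to end two hours from now.
$maintenance_ends   = time() + 2 * 3600;
$retry_after_header = 'Retry-After: ' . retry_after_http_date( $maintenance_ends );

// In a web request, $retry_after_header would be passed to header()
// right after the 503 status line has been sent.
```

Using gmdate() rather than date() is one way to avoid the incorrect-date pitfall, since a local-time value labelled as GMT would point clients at the wrong moment.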

Pro tips

Caching

There are a few things to keep in mind when working with maintenance pages and returning 503 status codes. If you are actively using caching, the cache may not properly pass the 503 status. So please test this properly before actively using this on the live version of your website.

Robots.txt

Did you know that it is also possible to return a 503 status code for your robots.txt file? Google states in its robots.txt documentation that you can temporarily suspend crawling by returning a 503 for your robots.txt file. The biggest advantage here is the lower server load during maintenance.
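One possible way to do this in PHP is sketched below. It assumes your web server routes requests for /robots.txt to this script (for example through a rewrite rule) and that maintenance mode is signalled by a flag file. The file name maintenance.flag and the helper is_robots_request() are hypothetical.

```php
<?php
/**
 * Returns true when the request URI points at the site's root robots.txt.
 * Hypothetical helper: the name is an assumption for this example.
 */
function is_robots_request( string $request_uri ): bool {
    // Strip any query string so "/robots.txt?x=1" still matches.
    return parse_url( $request_uri, PHP_URL_PATH ) === '/robots.txt';
}

// Only act when served over the web, so the helper can be tested from the CLI.
if ( PHP_SAPI !== 'cli' ) {
    // Assumption: a "maintenance.flag" file next to this script marks maintenance mode.
    $in_maintenance = file_exists( __DIR__ . '/maintenance.flag' );

    if ( $in_maintenance && is_robots_request( $_SERVER['REQUEST_URI'] ?? '' ) ) {
        header( 'HTTP/1.1 503 Service Unavailable', true, 503 );
        header( 'Retry-After: 3600' );
        exit;
    }
}
```

Removing the flag file once the work is done lets robots.txt be served normally again.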

Take good care of your maintenance!

As we’ve seen, you can avoid losing ranking by adding a 503 when maintaining the website to let Google know that it can crawl your website again later. There are several possibilities for this. Choose what works best for you and you’ll get a well-maintained website without the risk of losing rankings. Good luck!

Read on: Which redirect should I use? »
