About Online Matters

PostHeaderIcon Technical SEO: Introduction to Site Load Times and Natural Search Rankings

It is one of those nights.  Those pesky technicolor dreams woke me up at 2:30 and wouldn’t let me go back to sleep.  But under the heading “turning lemons into lemonade,” at least I have some extra time to write my blog even as I am piled high with the end of month deadlines. 

Today’s topic is part of my Technical SEO series (I just named it that – now I have to go back and change all my titles and meta tags…sigh) – site load times and whether or not they effect how you rank in the SERPs.  It is another one of those topics that came out of SMX East.  In this case it was Maile Ohye, Senior Support Engineer at Google, who spoke to this issue.  Maile is a wonderfully knowledgable evangelist for Google.  I have seen her speak at many shows. Her presentations are always clear and contain good, actionable techniques for improving your rankings in Google’s SERPs. I am not alone in thinking her knowledgable.   Stephan Spencer, one of the guys I most look up to in SEO,  thought enough of Maile to interview her in August of 2007, and she was also recently interviewed by SEOMoz, another leading light in the industry (and if you haven’t used their pro tools, then you are one arrow short of a full quiver for your SEO work).   

So when Maile says “stuff,” I listen.  In her talk at SMX East, she made note that poor site load times (we are talking something between good and absolutely horrible) could harm your rankings in Google search results. Let me define the problem, then try to explain what Maile was referring to, and finally my take on all this.

Basic Concepts of Site Loading Times for Getting Indexed

One the one hand, that site loading times effect search rankings isn’t news.  Let’s take some time to lay a bit of foundation, because the how of site speeds effecting search rankings didn’t really hit me until Maile’s talk.  It’s one of those things that is obvious once you think about it, but it doesn’t really come top of mind when you are focused on specific tasks in an SEO project.  It’s a “given” in the background of your work.  Unless the site is so horribly slow that it is obviously impacting the user experience, you really don’t think about load times when you are focusing on keywords and meta tags.  The site works, move on. 

But that’s not really true from the perspective of the search bots.   Google and the other engines have to crawl billions of pages on the web on a regular basis, bring that information back, and then index it.  Some pages can be crawled infrequently, but as more of the web moves to more real-time information due to social media, the bots have to crawl more sites in real time in order to provide good results.  But there are only so many bots and so much time to crawl these billions of pages.  So if you are Google, you write your bots with algorithms that allocate this scarce resource most efficiently and, hopefully, fairly. 

How would you or I do this?  Well, if I were writing a bot, the first thing I would give it is a time limit based on the size of the site.  That’s only fair.  If you have the ability to create more content, bravo.  I want to encourage that, because it is beneficial to the community of searchers.  So all other factors being equal (e.g. site loading time), I want to allocate time to ensure all your pages get into the index.  There is also the issue of search precision and relevance: I want all that content indexed so I can present the best results to searchers.   

Of course, I can’t just set a time limit based on the number of pages.  What if one site has long pages and another one short, pithy pages (clearly not mine!)?  What if one site has lots of images or other embedded content while another does not?  My algorithm has to be pretty sophisticated to determine these factors on the fly and adapt its baseline timeout settings to new information about a site as it crawls it.

The next algorithm I would include would have to do with the frequency at which you update your data.  The more often you update, the more often I need to have my bot come back and crawl the changed pages on your site. 

Another set of algorithms would have to do with spam.  From the perspective of my limited resource and search precision, I don’t want to include pages in my index that are clearly designed only for the search engines, that are link spammers, or that may only contain PPC ads and have no relevant information for the searcher. 

You get the picture.  I only have a limited window of time to capture continually changing data from the web in order for the data in my index to be reasonably fresh.  Therefore I’ve got to move mountains (of data) in a very short period of time but only so many processing cycles to apply.  And the number of variables I have to control for in my algorithms are numerous and, in many cases, not black and white.

This is where site load times come in.  If a site is large but slow, should it be allocated as much time as it needs to be indexed?  Do I have enough processing cycles to put up with the fact it takes three times as long as a similar site to be crawled?  Is it fair given a scarce resource to allocate time to slow site if it means I can’t index five other better performing sites in my current window of opportunity?  Does it optimize search precision and the relevance of results I can show to searchers?  And last but not least, as one of the guardians of the Web, is poor site performance something I want to encourage from the perspective of user experience and making the Web useful for as many people as possible?  Let’s face it, if the web is really slow, people won’t use it, and the eyeballs that will be available to view an ad from which I stand to make money will be less. 

Hello?  Are you there?  Can you say “zero tolerance?”  And from the perspective of the universal search engines, there is also my favorite radio station – “WIFM.”  What’s In it For Me?  Answer: nothing good.  That is why Google has made page load times a factor in Adwords Quality Score, as an example.

So, in the extreme case (let’s say a page takes 30 seconds to load), the bots won’t crawl most, if any, of the site.  The engines can’t afford the time and don’t want to encourage a poor user experience.  So you are ignored – which means you never get into the indexes.

When Is a Page’s or Site’s Loading Time Considered Slow?

What is an “extreme case?”  I have looked that up and the answer is not a fixed number.  Instead, for Google, the concept of “slow loading” is relative. 

The threshold for a ‘slow-loading’ landing page is the regional average plus three seconds.

The regional average is based on the location of the server hosting your website. If your website is hosted on a server in India, for example, your landing page’s load time will be compared to the average load time in that region of India. This is true even if your website is intended for an audience in the United States.

Two things to note about how we determined the threshold: 

  • We currently calculate load time as the time it takes to download the HTML content of your landing page. HTML load time is typically 10% to 30% of a page’s total load time. A three-second difference from the regional average, therefore, likely indicates a much larger disparity.
  • We measure load time from a very fast internet connection, so most users will experience a slower load time than we do.

Moreover, Google has a sliding scale with which is grades a site.  The following quote applies to Adwords and landing pages, but my guess is similar algorithms and grading are used in determining how often and long a site is crawled:

A keyword’s load time grade is based on the average load time of the landing pages in the ad group and of any landing pages in the rest of the account with the same domain. If multiple ad groups have landing pages with the same domain, therefore, the keywords in all these ad groups will have identical load time grades.

Two things to note:

  • When determining load time grade, the AdWords system follows destination URLs at both the ad and keyword level and evaluates the final landing page.
  • If your ad group contains landing pages with different domains, the keywords’ load time grades will be based on the domain with the slowest load time. All the keywords in an ad group will always have the same load time grade.

We’ll stop here for today.  Next time, we’ll talk about happens in the nether regions between fast and clearly slow.


One Response to “Technical SEO: Introduction to Site Load Times and Natural Search Rankings”

Posts By Date
October 2009
« Sep   Nov »