About Online Matters

Archive for the ‘Google’ Category

Notes from First Day of SMX Advanced 2010

Back from SMX Advanced London, where I got a chance to speak on “SEO, Search, and Reputation Management,” and from SMX Advanced 2010 in Seattle, where I got to relax and just take in the knowledge.

So here, for all who could not attend, is a summary of three of the sessions I attended on the first day of SMX Advanced 2010.  I only get so much time to blog…working guy, you know.  I’ll do my best to post the rest, but no promises.

SEO for Google versus Bing

Janet Miller, Searchmojo

  • From heatmap studies, it appears people “see” Bing and Google SERPs in pretty much the same way.  The “hotspots” are pretty similar.
  • Not surprising: average pages/visit and time on site are higher for Bing than for Google – but that has always been true from my perspective.
  • Bing does not currently accept video or news sitemaps.
  • On Google you can edit sitelinks in Webmaster Tools; in Bing you cannot.
  • Geolocation results show pretty much the same in both sets of results.
  • One major difference:  Google shopping is free for ecommerce sites to submit; Bing only has a paid option for now.
  • Bing lets you share results (social sharing) on Facebook, Twitter, and email; Google does not.  But the sharing links point back to the images on Bing, not to the original images on your site.  You also have to grant Bing access on Facebook.
  • Bing allows “document preview” when you roll over an entry.  It will also play videos in preview mode – but only those on YouTube.  If you look at the behavior, information from the page shows up.  To optimize the presentation of that information, Bing takes information in this order:
    • H1 tag first – if title tag and h1 tag don’t match, it takes the H1 tag
    • First paragraphs of information
    • To surface contact info, add that information to the page itself.  Bing is really good about recognizing contact information that is on a page:
      • Address
      • Phone
      • Email
      • To disable “document preview,” do one of the following:
        • Add this meta tag to the page: <meta name="msnbot" content="nopreview">
        • Or have your server send this HTTP response header (it is a header, not a robots.txt line): X-Robots-Tag: nopreview
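For what it’s worth, here is a minimal sketch of sending that header site-wide instead of editing every template. The Python/Flask stack is my choice of example, not anything Bing or the speaker specified, and the “nopreview” value comes from the session notes above – verify it against Bing’s current documentation before relying on it.

```python
# Minimal sketch, assuming a Flask app; attaches the X-Robots-Tag header
# to every response the app serves.
from flask import Flask

app = Flask(__name__)

@app.after_request
def add_nopreview_header(response):
    # "nopreview" is the directive from the session notes above.
    response.headers["X-Robots-Tag"] = "nopreview"
    return response

@app.route("/")
def index():
    return "<h1>Contact us</h1><p>555-0100, 123 Main St., info@example.com</p>"

if __name__ == "__main__":
    app.run()
```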

      Rand Fishkin: Ranking Factor Correlations: Google versus Bing

      As usual, Rand brought his array of statistical knowledge to bear to compare how Bing and Google react to different ranking signals.  Here are the takeaways:

      Overall Summary of Correlations with Ranking, in Order of Importance

      Bing:
      1. Number of linking root domains
      2. An exact match of .com domain name with desired keyword
      3. Linking domains with an exact match in the TLD name
      4. Any exact match of the domain name with the desired keyword
      5. Number of inbound links

      Google:
      1. An exact match of .com domain name with desired keyword
      2. Linking domains with an exact match in the TLD name
      3. Number of linking root domains
      4. Any exact match of the domain name with the desired keyword
      5. Number of inbound links
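For anyone who wants to poke at this kind of study themselves: as I understand it, these figures are rank correlations (Spearman) between ranking position and each candidate factor, averaged over many keywords. A toy sketch for a single keyword – with invented numbers, and my reading of the methodology rather than Rand’s actual code – would look something like this:

```python
# Toy rank-correlation check for one keyword; numbers are invented.
from scipy.stats import spearmanr

positions = list(range(1, 11))  # SERP positions 1..10
linking_root_domains = [420, 380, 150, 510, 90, 75, 60, 30, 22, 12]

rho, p_value = spearmanr(positions, linking_root_domains)
# rho < 0 means the factor tends to be larger at better (lower-numbered)
# positions, i.e. it correlates with ranking well.
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
```

In the real studies this computation would be repeated across thousands of keywords and the coefficients averaged, which is what makes the orderings above comparable between engines.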

      Domain Names as Ranking Factors

      • Exact match domains remain powerful ranking signals in both engines (anchor text could be a factor, too).
      • Hyphenated versions of domain names are less powerful, though when they do appear they appear more frequently (more times on a page) in Bing than in Google (G: 271 vs. B: 890).
      • Just having keywords in the domain name has substantial positive correlation with high rankings.
      • If you really want to rank on a keyword, make sure you get exactmatchname.com as your domain.
      • Other exact match domains may still help, but don’t have as high correlation.
      • Keywords in subdomains are not nearly as powerful as keywords in the root domain name (no surprise).
      • Bing may be rewarding subdomain keywords less than before (though G: 673 vs. B: 1394).
      • On alternate TLD extensions:
        • Bing appears to give substantially more weight to these than Google.
        • Matt Cutts’ claim that Google does not differentiate between .gov, .info and .edu appears accurate.
        • The .org TLD has a surprisingly high correlation with high rankings, but you can attribute this to elements of those sites’ authority – more links, more non-commercial links, less spam.
        • Don’t forget the exact-match data: .com is still probably a very good thing (at the very least, own it).
        • Shorter URLs are likely a good best practice (especially on Bing).
        • Long domains may not be ideal, but aren’t awful.

      On-Page Keyword Usage

      • Google rankings seem to be much more highly correlated with on-page keyword usage than Bing’s are.
      • The alt attribute of images shows significant correlation as an on-page ranking factor. (I always thought so, and it’s one of the elements most SEO newbies miss – a quick audit sketch follows this list.)
      • Putting keywords in URLs is likely a best practice.
      • Everyone optimizes titles (G: 11,115 vs. B: 11,143).  Differentiating here is hard.
      • (Simplistic) on-page optimization isn’t a huge factor.
      • Raw content length (length of page and number of times the keyword is mentioned on the page) seems to have only a marginal correlation with rankings.
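As promised in the alt-attribute point above, here is a quick audit sketch for finding images that are missing alt text on a page. The tooling (requests plus BeautifulSoup) and the example.com URL are my choices, not anything from the talk.

```python
# Simple audit sketch: list images on a page with empty or missing alt text.
import requests
from bs4 import BeautifulSoup

def images_missing_alt(url):
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return [img.get("src", "(no src)")
            for img in soup.find_all("img")
            if not (img.get("alt") or "").strip()]

for src in images_missing_alt("https://example.com/"):
    print("missing alt:", src)
```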

      Link Counts and Link Diversity

      • Links are likely still a major part of the algorithms, with Bing having a slightly higher correlation.
      • Bing may be slightly more naïve in their usage of link data than Google, but better than before.
      • Diversity of link sources remains more important than raw link quantity (a quick sketch of this distinction follows this list).
      • Many anchor text links from the same domain likely don’t add much value.
      • Anchor text links from diverse domains, however, appear highly correlated.
      • Bing seems more Google-like than in the past in handling exact match anchor links (this is a surprise!).
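To make the diversity-versus-quantity distinction concrete, here is the sketch mentioned above: counting unique linking root domains in a backlink list. The URLs are made up, and the host parsing is deliberately naive (it ignores multi-part suffixes like .co.uk).

```python
# Naive sketch: total inbound links vs. unique linking root domains.
from urllib.parse import urlparse

backlinks = [
    "http://blog.example.org/post-1",
    "http://www.example.org/resources",
    "http://another-site.com/links.html",
    "http://another-site.com/blogroll",
]

def root_domain(url):
    host = urlparse(url).netloc.lower()
    parts = host.split(".")
    # Naive: keeps the last two labels; breaks on suffixes like .co.uk.
    return ".".join(parts[-2:]) if len(parts) >= 2 else host

print("total inbound links:", len(backlinks))                            # 4
print("linking root domains:", len({root_domain(u) for u in backlinks})) # 2
```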

      Home Pages

      • Bing’s stereotype holds true: homepages are more favored in top results vs. Google.

      Twitter, Real-Time Search, and Real-Time SEO

      Steve Langville – Mint.com

      Steve had a lot of interesting points, and I thought his approach to real-time was one of the most sophisticated I had heard.

      1. One element of his strategy is what I like to call “Merchandising Real-Time Search.”  Basically, someone at Mint keeps a merchandising calendar of important dates/topics in consumers’ financial lives (e.g. tax time) and also watches for hot topics that could impact a consumer’s sense of money (e.g. new credit card legislation).  Mint then has a team that can create new content on that topic that is likely to generate word-of-mouth.  At that point, they push the content out and then energize their communities on Facebook, Twitter, etc. by promoting the content to them.  This generates buzz and visits back to mint.com.
      2. Mint has also created Mint Answers, its own Yahoo Answers-like site where people ask and answer questions on financial topics.  The result is a lot of user-generated content on Mint.com on critical keywords that yields high rankings in the SERPs.
      3. Mint also developed a Twitter aggregator widget around personal finance and put it in a section on their site.  Twitter’s community managers then retweeted these folks, who then signed up for @mint and began retweeting @mint tweets.  According to Steve, the amplification effect was huge.

      Danny Sullivan

      As always, Danny had some really interesting insights to add about real-time search.  I will honestly say that many times I still think Danny, like many search marketers, thinks “transactionally” about search, as compared to consumer marketers who think about having an on-going “conversation” with a customer.  (More on that notion later.)  But in this case, Danny really showed why he is known as an industry visionary:

      • Search marketing means being visible wherever someone has overtly expressed a need or desire.  It is more than web; more than keywords.  An example is mobile apps – search by another name – so I guess he agrees with Steve Jobs on that one.
      • This was uniquely insightful. Whereas normal search is a many-to-many platform where anonymous individuals post  content whose authority grows based on “good” links that are added over time, real-time search is a one-to-one platform where clearly identified people post questions or comments  and get responses.  Authority comes from the level of active engagement, not links.  I had never heard real-time described this way, and it is a succinct but very sophisticated definition of real-time search.
      • You can use conversations to identify folks interested in what you offer.  Not a new concept, but good to repeat.  So if you sell vacuum cleaners, search for “anyone know vacuum cleaners” and the folks who have an interest are now identified and you can respond to them.
      • Get a gift by giving a gift. That’s the fundamental currency of social media. Danny answered 42 questions from people who didn’t know him, didn’t follow him.  He got no complaints and 10 thank yous.
      • Recency versus Relevancy. Anyone doing real-time gets this – that authority can come from having high-quality information or having reasonably high quality information in a very short time frame – in other words, sometimes the recency of news makes it more worthy of attention than something older but more thought out.  Danny believes that as Twitter matures (and maybe the entire real-time search business – that wasn’t clear), relevancy is going to get a higher relative weighting, so that relevant results will get more hang time in the SERPs.

      Chris Silver-Smith

      I have trouble summarizing all of Chris’s talk – and it was a very good talk – because so much of what he talked about was covered in my notes from other speakers.  So here are the unique points from his chat:

      • You have to decide how you resource Twitter and other sites.  Questions to ask for your strategy:
        • Consumers First: What are consumers saying about your site/company already? How might they use your Twitter content? Develop representative Personas of consumers who would engage with you on Twitter.
        • Time/Investment: How much time do you have to devote to Twittering? Can you devote someone to spend time daily reading and responding to tweets?
        • Goals: What are some advantageous things you could accomplish by interacting with consumers in real-time?
        • Strategy will decide whether you hire a full-time person, part-time person, or use automation.
        • Use OAuth for API integration, as it appends the application the visitor used as a data point (see the sketch after this list).
        • Convert your Google News feeds to RSS to make them easier for members of your community to subscribe to.
        • A great tool for small-business social media management is www.closely.com, which auto-creates a social action page for every offer a company makes on Twitter and Facebook.
        • Be brief but really clear about the main point of your tweets, and include a call to action – such tweets are retweeted at a much higher rate.
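Here is the sketch referred to in the OAuth point above. It uses the tweepy library (a third-party Python client and v3-era API – my choice, not Chris’s) to read the `source` field Twitter attaches to each tweet, which is where the posting application shows up. The keys and the @mint handle are placeholders.

```python
# Sketch only: read the "source" (posting application) of recent tweets via tweepy.
# CONSUMER_KEY etc. are placeholders, not real credentials.
import tweepy

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
api = tweepy.API(auth)

for status in api.user_timeline(screen_name="mint", count=10):
    # `source` names the application that published the tweet -- the appended
    # data point an OAuth-integrated app gets on everything it posts.
    print(status.source, "-", status.text[:60])
```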

      John Shehata – Advanced Internet

      I loved John’s presentation because it confirmed many of the same conclusions I had reached about real-time search and reported on at SMX Advanced in London.  Key points:

      • The ranking factors for real-time search are very different. They include:
        • User (author) authority (My comment:  not just one site but across every site  on which the author publishes).
        • How fresh that author’s content continues to be.
        • Number of followers.
        • The quality of follows and how they act on the author’s content (is it retweeted often?  Is it stumbled?  Does someone flow it into their RSS feed?  How often?  How quickly?).
        • URL real-time resolution.
        • It is not about how many followers you have but how reputable (authoritative) your followers are.  (This is what I call Authorank and like PageRank it is passed from authoritative follower to those they follow.)
        • You earn reputation, and then you give reputation. If lots of people follow you, and then you follow someone–then even though this [new person] does not have lots of followers, his tweet is deemed valuable because his followers are themselves followed widely.
        • Other possible ranking factors:
          • Recent activity: Google may pay more attention to accounts with more activity.
          • User name: keywords in your user name might also help.
          • Age: since age plays a big role in Google search engine ranking, it’s possible that more established Twitter accounts will outrank the newer ones.
          • External links: links to your @account from (reputable) non-social media sites should boost reputation as far as Google is concerned.
          • Tweet Quantity: the more you tweet, the better chance you’ve got to be seen in Google real-time search results.
          • Ratio of followers to following: a suspiciously close ratio between the two can raise a red flag.
          • Lists: it might also matter in how many lists you appear.

      Tactics to follow:

      • Encourage retweets by tweeting content of 120 characters or less, so you save room for the “RT @username: ” prefix that is added when someone passes your message along to their followers (a quick check is sketched after this list).
      • Tools to identify hot trends: Google Hot Trends, Google Insights, Google News, Bing xRank, Surchur, Crowdeye, Oneriot.
      • Same advice as Steve Langville – plan for seasonal keyword trends.
      • Don’t update multiple accounts, reTweet instead.
      • Connect your social profiles.
      • Attract reputable, topically-related followers.
      • Write keyword-rich tweets whenever possible, without sounding spammy:
        • Do not create content with multiple buzzing terms.
        • Do not abuse shortening services for spam links.
        • Do not go overboard using Twitter #hashtags – Search Engines will eliminate your tweet from search if you use too many because it “looks bad.”
        • Spammy looking tweet streams will be eliminated from search.
        • Don’t use same IP address for different twitter accounts.
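And the quick check promised in the 120-character tip above: a trivial helper (mine, not John’s) that verifies a draft tweet leaves room for the “RT @username: ” prefix under the 140-character limit.

```python
# Does a draft tweet leave room for "RT @username: " under the 140-char limit?
def leaves_retweet_room(tweet, username, limit=140):
    overhead = len("RT @{}: ".format(username))
    return len(tweet) <= limit - overhead

draft = "Five things the new credit card rules mean for your wallet http://bit.ly/xxxx"
print(leaves_retweet_room(draft, "mint"))  # True if followers can retweet without trimming
```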

      Show Me The Links

      This was a great session with a HUGE number of ideas for getting new links.  And each person talked about a very different philosophy towards link building and their tactics reflected those philosophies.  Let’s see if I can capture them:

      Chris Bennett

      • Philosophy centers on using easily created and highly valued visual or viral content:
        • Creating Infographics – they work very well.  An example – a “where does the money go from the 2008 stimulus bill” infographic generated 29,000 links.
        • Writing guest blog posts whose content is highly viral for others.  Embed a link to your site as the source.  You give the gift of traffic to them, you get links as a gift in return.

      Arnie Kuenn

      • More traditional link building
        • 50% is content development and promotion.  The big example he used was the Google April Fools’ Day prank about Google opening an SEO shop.  It got picked up as a “real” story by Newswire 27 days after the post, went viral, and generated 800 backlinks.
        • 20% is blog post and article placement.
        • 10% is basic link development.
        • 20% is targeted link requests to those few critical high-value sites. There are NO magic bullets here – it takes creativity and just good old-fashioned hard work and persistence.  But the rewards can be substantial.

      Gil Reich

      • Use badges with your URL embedded that benefit the person who puts them on their site (e.g. a “gold star” validation).
      • Write testimonials for other folks.
      • Write on sites that want good content and can deliver an audience.
      • Answer questions on answer sites where you have the expertise.
      • Make it easy to link to you by providing the information to potential linkers.

      Roger Montti

      Focused on B2B link building tactics:

      • Backlink trolling from competitors – but also look for sites that your competitors aren’t on; you want your own authoritative link network.
      • Don’t ignore the .us TLD.  There are lots of good possible link sites with decent authority there.
      • Look at associations that provide ways to link to their members.  Search for member lists, restrict your search to .org and add in relevant keyword phrases to filter for your related groups.
      • Look at dead sites with broken links – see who is linking to them.  Once you have identified a dead internet page do a linkdomain: search on Yahoo to identify sites still linking to the dead site.
      • Free links from resources, directories, or “where to buy” sites.
      • Bloggers: cultivate alliances and relationships with other sites and blogs – particularly bloggers who like to do interviews.

      Debra Masteler

      • You have all this content that you generate as a normal part of your business.  Use it.
        • Use dapper.net to create RSS feeds of your blog content
        • Joost de Valk has a WordPress plugin at http://yoast.com/wordpress/rss-footer/ which lets you add an extra line of content to articles in your feed, defaulting to “Post from” and then a link back to your blog, with your blog’s name as its anchor text.
        • Use RSS feeds from news sources to identify media leads to speak with as part of your PR work.
        • Content syndication: podcasts, white papers, living stories, news streams and user-generated content (e.g. guest blogging) are still hot.  Infographics, short articles, individual blogs, and Wikipedia are not.
        • Widget bait: basic widgets that you can build on Widgetbox are getting somewhat passé but still have some value.  You need to do more advanced versions – information aggregation widgets seem to work very well right now.  Make people come to you to download them.
        • Microsites: the old link wheels are worthless at this point – the engines have figured those out and treat them like link-spam sites.  Microsites with good content – e.g. real blogs – still work.  One option is to buy an established site and then rebrand it.

Google TV: TV Advertisers Should Be Mad as Hell

The big announcement at Google I/O last week was the release of details about Google TV.  And it should make TV advertisers of all stripes angry and concerned.  VERY concerned.  So much so, in fact, that they should be actively seeking technologies and business models to deflect/prevent what is effectively an advertising power play by Google.

Google TV is the latest attempt to merge the television experience with a web-based TV (also called IPTV) experience on the television set (as compared to bringing TV to the web, as, say, a SlingBox does).  There have been numerous attempts to bring the Web to the television, going back all the way to 1996, when Steve Perlman, Bruce Leak and Phil Goldman brought to market the WebTV set-top box, marketed by both Sony and Philips.  (Find a list of TV/Internet hybrids in the next post.)  None of these has been particularly successful, for numerous reasons:

  • Most require an extra set-top box that is expensive (Google is no different.  As an example of technology that uses the consumer’s  computer or laptop as the interface to the TV, see Kylo).
  • The experience doesn’t truly integrate.  You either watch the web-based offerings or Live TV, but not both at the same time.  In many cases, the box is meant for the delivery of movies or TV shows on-demand, as compared to being broadcast in real-time.  The Roku/Netflix platform is an example of this.  PopBox is another example, but they also deliver more content – websites, social media experiences from Facebook and Twitter, images, YouTube videos, games, and music from sources like Photobucket and Pandora.
  • The interface requires a separate remote control, which adds another layer of complexity to the consumer experience.

None of these really impacts the effectiveness of a “single” broadcast TV advertisement in any meaningful way.  They are separate experiences from broadcast television and, as a rule, they do not take away from live TV viewership.  Some amount of consumers’ time is given to the Internet and movies on-demand nowadays.  Whether I interface with that experience through my computer screen or TV screen doesn’t change the amount of time I spend in an “online” mode versus a TV viewing mode, and it does not impact my current behavior around the TV ads themselves.

Google TV has come up with a different approach which, at least during an initial search, overlays the Internet on top of the television experience (see first image).  When it overlays, the interface is transparent so you can see your TV behind the browser interface that lets you search for the shows and information you want.

Google TV transparent interface

There are other times when the interface switches completely and the TV experience is put on hold while the viewer interacts with Web content (see second image), which is more like the experiences of the current generation of web-to-TV offerings.  But the difference here is how easy and seamless Google TV makes it to switch between all three of these user experiences – live TV, TV in the background, and Internet-only.  The other difference, and one of critical import for this article, is that Google intends to sell advertising within the Google TV platform.  Where, how, and how much are still to be determined.

Google TV solid menu interface

Google has definitely come up with something unique that I believe will be very compelling to television viewers as it now truly integrates the television and web experiences for the first time.

But if the consumer loves this, traditional TV advertisers should hate it.

Today, television advertising is a $70B market versus $25B for Internet and mobile search advertising.

US Advertising Market Revenues 2009

In 2010, TV advertising is projected to grow 4.3%, or $3 billion, on a base of $70.2 billion.  This compares to non-search online advertising, which is projected to grow 12.9%, or $1.6 billion, on a base of $12.2 billion during this period.  So even though Internet advertising is growing at a faster rate, television advertising’s real-dollar growth is roughly twice that of Internet advertising.
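A quick back-of-the-envelope check of those figures (nothing beyond the numbers quoted above):

```python
# Sanity-check the growth figures quoted above (values in billions of dollars).
tv_base, tv_growth = 70.2, 0.043           # TV advertising, projected 2010 growth
online_base, online_growth = 12.2, 0.129   # non-search online advertising

tv_dollars = tv_base * tv_growth               # ~3.0
online_dollars = online_base * online_growth   # ~1.6
print(f"TV: ${tv_dollars:.1f}B vs. online: ${online_dollars:.1f}B "
      f"(~{tv_dollars / online_dollars:.1f}x)")
```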

Google knows this.  Rishi Chandra, the  product manager for Google TV, mentioned the $70B factoid at Google’s I/O conference last week.  Moreover, Google also gets that despite the fact consumers are spending an increasing number of hours per day online,  television viewership is at an all-time high, with 180mm US consumers watching TV for over 5 hours/day on average.  Rishi mentioned this, as well. The guys at the Googleplex are no dummies. As the old saying goes, they can see a mountain in time. In this case, the mountain they want to cash in on is TV advertising.

What both Google and the advertisers also know is that television advertising is broken.  In 1987 an advertiser could reach 80% of viewers by airing a 30-second spot only three times. Today, that same commercial would have to air 150 times to reach 80% of viewers.[1] The rapid decline of TV ad viewership is due to the “TiVo” effect and today’s viewers’ multi-tasking habits – texting, phoning, emailing and web surfing while watching TV.  Brands are urgently seeking a solution to reengage viewers of their TV commercials as brand expenditures on television dwarf what they spend on all other ad mediums.

Now Google would argue that it has found the solution, and from its perspective I truly think they believe this.  The Google culture is driven by data and metrics.  Current television advertising with its lack of performance measurement is anathema to a Googler’s mindset.   If you are a fanatic about data-driven marketing, Google TV “solves” this problem because of its ability to bring CPC and other easily measurable formats into the web-based part of the new integrated television experience.

Interesting, and correct as far as it goes.  But wrong – and I mean dead wrong – from the perspective of television advertisers, who spend three times as much on television as on online advertising because it is still the most potent means of getting a message to the consumer.  Moreover, it is a power play by Google to disconnect the brand advertisers from their traditional advertising providers and drive them, willingly and like lemmings, onto the Google platform(s), thus giving Google an even stronger power position relative to advertisers.

Let’s think about this.  Television advertising is already much less effective than it used to be.  Now along comes Google TV with its overlay and ability to seamlessly move away from the live television experience.  Let’s say you are a viewer watching Lost, that you are using Google TV, and you have left your laptop in the other room because – heck – you don’t need a two-screen solution to access the Internet during live television now that you have Google TV.  Something on the show triggers you to want to look up some factoid on the web at a Lost fan site.  You plan to type in “Lost fan site.”

When are you going to type this in?  During the time the episode is airing?  Absolutely not.  You’re not going to want to miss one minute because Hurley is about to tell Jack his real name.    Or take another example – a sports case.  Are you going to put the potential touchdown play in background mode while you look up Brett Favre’s completion percentage in third down and long situations?  Absolutely, positively not, to the extent that the sports fan is thinking “don’t you dare touch the remote or there will be one less thumb in this family.”

No.  You are going to switch to the Internet experience when the television ads come on and you can safely move away from the live broadcast to find what you need before your show comes back on.

There is another interesting fact that only makes this seem a more likely behavior on the part of cross-platform TV viewers, at least the early adopters of Google TV.  In a recent study of US Online TV Viewership by Comscore involving 1,800 subjects, a majority (67 percent) of cross-platform (TV and online) viewers preferred online TV viewing because it has less interference from commercials[2].  Since these folks are the likely early adopters of Google TV, the tendency to move away from live TV during commercials will be very strong.

So what does Google TV do?  It makes the television ad spend of the major brands even less effective than it is today.  Because Google TV still provides an interruptive experience, it actually encourages cross-platform viewers who wish to increase the “information content” of their viewing experience from web-based channels to do so at the exact time that advertisers least want them to.

There is an even bigger implication of this for brand advertisers.  In order to keep the cross-platform consumer’s attention as they move away from viewing television ads, the brand advertisers will be forced to place their ads on the web-based portion of the Google TV interface.  And to a certain extent this makes sense going back to our previous point about measurability of TV advertising.  The Google platform is measurable and consumers more and more are becoming habituated to interacting with web-based CPC or banner advertising.  So the TV advertiser keeps the attention of the cross-platform viewer during the commercial break in the show and gets better metrics.  It’s an obvious win-win for both Google and the advertiser, and a very seductive business proposition to marketing executives looking for better measurability around TV advertising.

But for TV advertisers, Google TV is the equivalent of the poison apple given to Snow White.  As Google TV penetrates households, more and more TV viewers will become habituated to the dual-use experience and will spend more and more time on the Google platform during broadcast television advertising pods.  And despite the fact I haven’t said much about mobile in this article until now, Google TV will also move onto the mobile platform and will provide an even more integrated experience for the consumer across the two screens, with a whole host of implications for the two-screen experience that I won’t discuss here.  Given the timing of historical consumer behavior transitions in the television market, this could take ten years.  But over that time, Google will take a larger and larger share of the currently $70B TV advertising and $2.7B mobile advertising markets.  This means that as much as an advertiser is currently dependent on Google for web advertising, they will become even more dependent on the single provider that is Google because of its reach in these other channels.

If you as an advertiser aren’t concerned about the implications of this for your business – a world where Google can effectively set monopoly prices for ads across every major advertising platform you use – you should be.  You should be very concerned, and mad as hell at this attempt to pull your advertising dollars even further into the maw of the machine that Google has become.

If I were a brand advertiser right now, I would be talking to my peers and looking for a second-platform solution from someone that can constrain this power play by Google before it becomes a fait accompli.  If I were Yahoo or Microsoft, I’d be developing or investing in a prototype of something I could show to brand advertisers today and get them to invest strategically in order to prevent Google from locking up this market before it is too late.


[1] “Advertising is Dead, Long Live Advertising” Himpe, 2008

[2] Yuki, Tania “Comscore Study of US Online TV Viewership.” http://www.comscore.com/Press_Events/Press_Releases/2010/4/Viewers_Indicate_Higher_Tolerance_for_Advertising_Messaging_while_Watching_Online_TV_Episodes


Technical SEO: Site Loading Times and SEO Rankings Part 2

In my last post, I discussed the underlying issues regarding site loading times and SEO rankings.  What I tried to do was help the reader understand why site loading times are important from the perspective of someone designing a search engine that has to crawl billions of pages.  The post also outlines a few of the structures that they would have to put in place to accurately and effectively crawl all the pages they need in a limited time with limited processing power.  I also tried to show that a search engine like Google has a political and economic agenda in ensuring fast sites, not just a technical agenda.  Google wants as many people/eyeballs on the web as possible, so it is to their advantage to ensure that web sites provide a good user experience.  As a result, they feel quite justified in penalizing sites that do not have good speed/performance characteristics.

As you would expect, the conclusion is that if your site is hugely slow, you will not get indexed and will not rank in the SERPs.  What is “hugely slow”?  Google has indicated that slow is a relative notion, determined by the loading times typical of sites in your geographical region.  Having said that, relative or not, from an SEO perspective I wouldn’t want a site where pages take more than 10 seconds on average to load.  We have found from the sites we have tested and built that average load times higher than approximately 10 seconds to completely load a page have a significant impact on being indexed.  From a UE perspective, there is some interesting data that the limit on visitors’ patience is about 6-8 seconds.  Google has studied this data, so it would probably prefer to set its threshold in that region.  But I doubt it can.  Many small sites are not that sophisticated, do not know these kinds of rules, and do not know how to check or evaluate their site loading times.  Besides this, there are often problems with hosts that cause servers to run slowly at times.  Google has to take that into account as well.  So I believe the timeout has to be substantially higher than 6-8 seconds; 10 seconds as a crawl limit is my guess.
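If you want a rough sense of where your own site sits relative to that guess, a crude timing script like the one below will do. Note that it measures the HTML fetch only, not full page rendering, and it is emphatically not how Googlebot measures anything; the URLs are placeholders.

```python
# Crude average fetch-time check against the ~10-second figure discussed above.
import time
import requests

def average_load_time(urls):
    total = 0.0
    for url in urls:
        start = time.monotonic()
        requests.get(url, timeout=30)   # HTML fetch only, no assets or rendering
        total += time.monotonic() - start
    return total / len(urls)

pages = ["https://example.com/", "https://example.com/about"]  # placeholder URLs
avg = average_load_time(pages)
print(f"average load time: {avg:.1f}s" + (" -- worryingly slow" if avg > 10 else ""))
```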

I have yet to see a definitive statement by anyone as to what the absolute limit is for site speed before indexing ceases altogether (if you have a reference, please post it in the comments).  I’m sure that if a bot comes to a first page and it exceeds the bot’s timeout threshold in the algorithm, your site won’t get spidered at all.  But once the bot gets by the first page, it has to do an on-going computation of average page loading times for the site to determine if the average exceeds the built-in threshold, so at least a few pages would have to be crawled in that case. 

Now here’s where it gets interesting.  What happens between fast (let’s say 1-2 second loading times – actually still fairly slow, but a number Matt Cutts indicates is OK in the video below) and the timeout limit?  And how important is site speed as a ranking signal?  Let’s answer one question at a time.

When a site is slow but not slow enough to hit any built-in timeout limits (not tied to the number of pages), a couple of things can happen.   We do know that Google allocates bot time by the number of pages on the site and the number of pages it has to index/re-index.  So for a small site that performs poorly, it is likely that most of the pages will get indexed.  Likely, but not a guarantee.  It all depends on the cumulative time lag versus the average that a site creates. If a site is large, then you can almost guarantee that some pages will not be indexed, as the cumulative time lag will ultimately hit the threshold set by the bots for a site of that number of pages. By definition, some of your content will not get ranked and you will not get the benefit of that content in your rankings.
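To make the cumulative-lag argument concrete, here is a toy model – my numbers and my simplification, not Google’s actual crawler logic: with a fixed crawl-time budget for a site, slower average page loads simply mean fewer pages crawled.

```python
# Toy model: pages crawled within a fixed crawl-time budget.
def pages_crawled(page_load_times, crawl_budget_seconds):
    elapsed, crawled = 0.0, 0
    for t in page_load_times:
        if elapsed + t > crawl_budget_seconds:
            break
        elapsed += t
        crawled += 1
    return crawled

fast_site = [0.8] * 500   # 500 pages at ~0.8 s each
slow_site = [6.0] * 500   # the same pages at ~6 s each
budget = 600              # hypothetical 10-minute budget for this site

print("fast site:", pages_crawled(fast_site, budget), "pages crawled")  # 500
print("slow site:", pages_crawled(slow_site, budget), "pages crawled")  # 100
```

Whatever the real budget numbers are, the shape of the outcome is the same: on a large, slow site some pages simply never get their turn.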

As an aside, there has been a lot of confusion around the revisit-after meta tag.  It takes this form: <meta name="revisit-after" content="5 days">.
This tag supposedly tells the bots how often to come back and reindex the specific page (in this case, every 5 days).  The idea is that you can improve the crawlability of your site by telling the bots not to index certain pages all the time, but only some of the time.  I became aware of this tag at SMX East, when one of the “authorities” on SEO mentioned it as usable for this purpose.  The trouble is that, from everything I have read, the tag is completely unsupported by any of the major engines, and was only ever supported by one tiny search engine (SearchBC) many years ago.

But let’s say you are one of the lucky sites where the site runs slowly but all the pages do get indexed.  Do Google or any of the other major search engines use the site’s performance as a ranking signal?  In other words, all my pages are in the index, so you would expect that they would be ranked based on the quality of their content and their authority derived from inbound links, site visits, time-on-site, and other typical ranking signals.  Performance, you would assume, is not a likely candidate for a ranking signal and isn’t important.

If you thought that, then you were wrong. Historically, Google has said, and Matt Cutts reiterates this in the video below, that site load times do not influence search rankings.  But while that may be true now, it may not be in the near future.  And this is where Maile’s comments took me by surprise.  In a small group session at SMX East 2009, Maile was asked about site performance and rankings.  She indicated that for the “middle ground” sites that are indexing but loading slowly, site performance may already be used to influence rankings.  Who is right, I can’t say.  These are both highly respected professionals who choose their words carefully. 

[Embedded video: Matt Cutts on site speed and search rankings]

Whatever is true, Google is sending us signals that this change is coming.  Senior experts like Matt and Maile don’t say these things lightly.  Their statements are well considered and probably approved positions that they are asked to take.  This is Google’s way of preventing us from getting mad when the change occurs.  Google has the fallback of saying “we warned you this could happen.”  Which, from today’s viewpoint, means it will happen.

Conclusion: Start working on your site performance now, as it will be important for SEO rankings later. 

Oh and, by the way, your user experience will just happen to be better, which is clearly the real reason to fix site performance. 

And it isn’t only Google that may make this change.  Engineers from Yahoo! recently filed a patent with the title “Web Document User Experience Characterization Methods and Systems” which bears on this topic.  Let me quote paragraph 21:

With so many websites and web pages being available and with varying hardware and software configurations, it may be beneficial to identify which web documents may lead to a desired user experience and which may not lead to a desired user experience. By way of example but not limitation, in certain situations it may be beneficial to determine (e.g., classify, rank, characterize) which web documents may not meet performance or other user experience expectations if selected by the user. Such performance may, for example, be affected by server, network, client, file, and/or like processes and/or the software, firmware, and/or hardware resources associated therewith. Once web documents are identified in this manner the resulting user experience information may, for example, be considered when generating the search results.

It does not appear Yahoo! has implemented any aspect of this patent yet, and who knows what the Bing agreement will mean for site performance and search.  But clearly this is a “problem” that the search engine muftis have set their eyes on, and I would expect that if Google does implement it, others will follow.


Matt Cutts, Nofollow, and the Consistently Inconsistent

I have avoided (like the plague) weighing in on the tempest Matt Cutts unleashed at SMX Advanced in June regarding Google’s change to the handling of the nofollow attribute for PageRank sculpting.  I have avoided it for two reasons:

  1. In my mind, more has been made of it than its true impact on people’s rankings.

  2. As far as I’m concerned, in general (and note those two words) the use of nofollow is a last resort and a crutch for less-than-optimal internal cross-linking around thematic clusters.  When internal cross-linking is done right, I don’t believe the use of nofollow is that impactful.

Bruce Clay had a great show on Webmaster Radio on the subject of the <nofollow> controversy, and basically he was of the same opinion as me. There are also many more heavyweights who have weighed in than I care to name.  So adding my comments to the mix isn’t all that helpful to my readers or the SEO community generally.

But I was searching today for some help on undoing 301 redirects when I found this section on the SEOmoz blog (click here for the whole article) from 2007 that provides some historical context for these conversations – so I thought I’d share it here.  My compliments to Rand Fishkin of SEOmoz for reproduction of this content:

“2. Does Google recommend the use of nofollow internally as a positive method for controlling the flow of internal link love?

A) Yes – webmasters can feel free to use nofollow internally to help tell Googlebot which pages they want to receive link juice from other pages

(Matt’s precise words were: The nofollow attribute is just a mechanism that gives webmasters the ability to modify PageRank flow at link-level granularity. Plenty of other mechanisms would also work (e.g. a link through a page that is robot.txt’ed out), but nofollow on individual links is simpler for some folks to use. There’s no stigma to using nofollow, even on your own internal links; for Google, nofollow’ed links are dropped out of our link graph; we don’t even use such links for discovery. By the way, the nofollow meta tag does that same thing, but at a page level.)

B) Sometimes – we don’t generally encourage this behavior, but if you’re linking to user-generated content pages on your site who’s content you may not trust, nofollow is a way to tell us that.

C) No – nofollow is intended to say “I don’t editorially vouch for the source of this link.” If you’re placing un-trustworthy content on your site, that can hurt you whether you use nofollow to link to those pages or not.”

Just some interesting background as you consider the current debate.


Google's Orion and Vincent

Well, even as I really want to write about Twitter and the English language, along comes Google with a new update.  Given the nature of social media, timeliness comes before etymological Godliness (when will you ever see those two words combined in a blog… I think I deserve an award for that one).  Therefore like any young techie in Spring, I turn my thoghtes (thank you Mr. Chaucer) and my feet towards a pilgrimage to the Googleplex. 

But of course, I don’t want to ignore the previous Vincent update – as that was the connection to post #1.

Orion first.  Actually, Google did not announce “Orion” – which is a search technology it purchased in 2006, along with its college-student developer Ori Allon.  But my guess is that, thanks to Greg Sterling’s new article containing that title, the term “Orion Release” will stick.  Here’s how Danny Sullivan described the technology back in April 2006:

It sounds like Allon mainly developed an algorithm useful in pulling out better summaries of web pages. In other words, if you did a search, you’d be likely to get back extracted sections of pages most relevant to your query.

Ori himself wrote the following in his press release:

Orion finds pages where the content is about a topic strongly related to the key word. It then returns a section of the page, and lists other topics related to the key word so the user can pick the most relevant.

Google actually announced two changes:

Longer Snippets.  When users input queries of more than three words, the Google results will now contain more lines of text in order to provide more information and context.  As a reminder, a snippet is a search result that starts with a dark blue title and is followed by a few lines of text.  Google’s research must have shown that regular-length snippets were not giving searchers enough information to form a clear preference for a result based on their longer search term – as Google’s stated intent is to provide enhanced information that will improve the searcher’s ability to determine the relevance of items listed in the SERPs.

Having said this, I don’t see any difference.  My slav…. I mean my 12-yo son (who has been doing keyword analysis since he was 10, so no slouch at this) ran ten tests on Google to see if we could find a difference (I won’t detail all the one- and two- vs 3+ word combinations we tried – if you want to have the list, leave a comment or send a twitter to arthurofsun and I will forward it to you).  But shown below are the results for France Travel vs France Travel Guides for Northern France:

Comparison of Two-Word and 3+ Word Search in Google Orion Release

 

As you can see, there is absolutely no difference in snippet length for the two searches – and this was universally true across all the searches we ran.  So I’m not sure what changed – I wonder if Ori Allon, who wrote the post, could help us out on this one.

Also, I am somewhat confused.  If you type in more keywords, the search engine has more information by which to determine the relevance of a result.  So why would I need more information?  Where I need more information is in the situation of a three-or-fewer keyword search, which will return a broad set of results that I will need to filter based on the information contained in a longer snippet.

Enhanced Search Associations.  The bigger enhancement – and the one that seems most likely to derive from the original Orion technology – is enhanced associations between keywords.  Basically, if you type in a keyword – Ori uses the example “principles of physics” – then the new algorithms understand that there are other ideas related to this I may be interested in, like “Big Bang” or “Special Relativity.”  The way Google has implemented this is to put a set of related keywords at the bottom of the first SERP, which you may click on.  When you click, it returns a new set of search results based on the keyword you clicked.  Why at the bottom of the first SERP?  My hypothesis would be that if the searcher has gone to the bottom of the page, it means that they haven’t found what they are looking for.  So this is the right place in the user experience to prompt them with related keywords that they may find more relevant to the content they are seeking.
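Purely to illustrate the idea of keyword associations – this is a toy, not Google’s algorithm – even naive co-occurrence counting over a set of documents surfaces terms related to a query:

```python
# Toy illustration of keyword association via co-occurrence counts.
from collections import Counter
from itertools import combinations

docs = [
    "principles of physics big bang cosmology",
    "principles of physics special relativity",
    "special relativity and the big bang",
]

pair_counts = Counter()
for doc in docs:
    words = set(doc.split())
    pair_counts.update(combinations(sorted(words), 2))

query_term = "physics"
related = Counter({
    (a if b == query_term else b): n
    for (a, b), n in pair_counts.items()
    if query_term in (a, b)
})
print(related.most_common(5))  # terms most often co-occurring with the query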

From my perspective, this feels like the “People who liked this item also bought…” widget on most comparison shopping sites (which I know something about, having been the head of marketing for SHOP.COM.)  I’m not saying there is anything wrong with this – I’m just trying to make an analogy to the type of user experience Google is trying to create.

Shown below is an example of enhanced search associations from a search on the broad term “credit derivatives in the USA”:

End of First SERP Showing Google's New Enhanced Association Results

As I expected, the term “credit default swaps” – which is the major form of credit derivative – shows as an associated keyword.  What I did not see in the list – and was surprised by – was any reference to the International Swaps and Derivatives Association (ISDA), which is the organization that has developed the standards and rules by which most derivatives are created.  It does, however, show up for the search on the keyword “credit default swap.”  I’d be curious to understand just exactly how the algorithm has been tuned to make trade-offs between broad concepts (i.e., credit derivatives, which is a category) and very focused concepts (i.e., credit default swap, which is a specific product).  Maybe I can get Ori to opine on that as well, but most likely that comes under the category of secret sauce.

Anyway, fascinating and it certainly shows that Google continues to evolve the state of IR. 

Well, I’ll just have to leave the Vincent release until tomorrow.  Something else happened this morning I need to do a quick entry about.  Sigh…..
