About Online Matters


Why Search Engine Optimization Matters

Yesterday, a reasonably well-known blogger, Derek Powazek, let out a rant against the entire SEO industry.  (His article gets a link here, against my strongest desire to give it any further validation in the search engine rankings, where it now ranks #10, because at the end of the day the Web is about transparency and I truly believe that any argument must win out in the realm of ideas.)  The article, and the responses both on his website and on SearchEngineLand, upset me hugely for a number of reasons:

  1. The tone was so angry and demeaning.  As I get older (and I hope wiser), I want to speak in a way that bridges differences and heals breaches, not stokes the fire of discord.
  2. I believe the tone was angry in order to evoke strong responses, which build links, which in turn raise rankings in the search engines.  Linkbuilding is a tried-and-true, legitimate SEO practice, and using it here undercuts Derek’s entire argument that understanding and implementing a well thought-out SEO program is so much flim-flam. Even more important to me: do we need to communicate in angry rants in order to get attention in this information- and message-overwhelmed universe?  Is that what we’ve come to?  I sure hope not.
  3. The article’s advice about user experience coming first was right (and has my 100% agreement).  But its assumptions about SEO, and therefore its conclusions, were incorrect.
  4. The article’s erroneous conclusions will hurt a number of people who could benefit from good SEO advice.  THAT is probably the thing that saddens me most – it will send people off in a direction that will hurt them and their businesses substantially.  Good SEO is not a game.  It has business implications and by giving bad advice, Derek is potentially costing a lot of good people money that they need to feed their families in these tough times.
  5. The number of responses in agreement with his blog was overwhelming relative to the number that did not agree.  That also bothered me – that the perception of our industry is such that so many people feel our work does not serve a legitimate purpose.
  6. The comments on Danny Sullivan’s response to Derek were few, but they were also pro-SEO (of course).  Which means that the two communities represented in these articles aren’t talking to each other in any meaningful way.  You agree with Derek, you comment to him.  You agree with Danny, you comment there.  Like attracts like, but it doesn’t ultimately lead to the two communities bridging their differences.

I, too, started to make comments on both sites.  But my comments rambled (another one of those prerogatives I maintain in this 140-character world), and so it became apparent that I would need to create a blog entry to respond to the article – which I truly did not want to do because, frankly, I don’t want to "raise the volume" of this disagreement between SEO believers and SEO heretics.  But I have some things to say that no one else is saying, and they go to the heart of the debate on why SEO IS important and is absolutely not the same thing as good user experience or web development.

So to Danny, to Derek, and to all the folks who have entered this debate, I  hope you find my comments below useful and, if not, my humble apologies for wasting your valuable time.

Good site design is about the user experience. I started my career in online and software UE design when that term was an oxymoron.  My first consulting company, started in 1992, was inspired by David Kelley, my advisor at Stanford, CEO of IDEO (one of the top design firms in the world), and now founder and head of the Stanford School of Design.  I was complaining to David about the horrible state of user interfaces in software and saying that we needed an industry initiative to wake people up.  His response was "If it’s that bad, go start a company to fix it."  Which I did.  That company built several products that won awards for their innovative user experience. 

That history, I hope, gives credibility to my next statement: I have always believed, and will always believe, that good site experience trumps anything else you do.  Design the site for your customer first.  Create a "natural" conversation with them as they flow through the site and you will keep loyal customers.

Having said that, universal search engines do not "think" like human beings.  They are neither as fast nor as capable of understanding loosely organized data.  They work according to algorithms that attempt to mimic how we think, but they are a long way from actually achieving it.  These algorithms, as well as the underlying structures used to make them effective, also must run in an environment of limited processing power (even with all of Google’s server farms) relative to the volume of information, so they have also made trade-offs between accuracy and speed.  Examples of these structures are biword indices and positional indices.  I could go into the whole theory of information retrieval, but suffice it to say that a universal search engine needs help in interpreting content in order to determine relevance. 
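To make one of those structures concrete, here is a toy sketch in Python of a positional index, which records not just which documents contain a term but where it occurs, so phrase queries can be answered without re-reading every page.  This is purely illustrative; no real engine stores its index this way.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} -- the structure that
    lets an engine answer phrase and proximity queries."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index

docs = {
    "a": "search engines index the web",
    "b": "the web needs search engine optimization",
}
index = build_positional_index(docs)
# "search" occurs at position 0 in doc "a" and position 3 in doc "b"
print(index["search"])  # {'a': [0], 'b': [3]}
```

The trade-off the engines make is exactly this kind of thing: a positional index answers richer queries than a plain term index, but it costs far more space and time to build and scan.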

Meta data is one area that has evolved to help the engines do this.  So, first and foremost, by expecting this information, the search engines expect and need us to include data especially for them – data that has nothing to do with the end user experience and everything to do with being found relevant and precise.  This is the simplest form of SEO.  There are two points here:

  1. Who is going to decide what content goes into these tags? Those responsible for the user experience?  I think not.  The web developers? Absolutely positively not.  It is marketing and those who position the business who make these decisions.
  2. But how does marketing know how a search engine thinks?  Most do not.  And there are real questions of expertise here, albeit for this simple example small ones that marketers can (and are) learning.  What words should I use for the search engines to consider a page relevant, and which of them go into the meta data?  For each meta data field, what is the best structure for the information?  How many marketers, for example, know that a title tag should only be 65 characters long, that a description tag needs to be limited to 150 characters, that the words in anchor text are a critical signaling factor to the search engines, or that alt text on an image can help a search engine understand the relevance of a page to a specific keyword/search?  How many know the data from the SEOmoz Survey of SEO Ranking Factors showing that the best place to put a keyword in a title tag for search engine relevance is in first position, and that the relevance drops off exponentially the further back in the title the keyword sits?  On this last point, there isn’t one client who hasn’t asked me for advice.  They don’t and can’t track the industry and changes in the algorithms closely enough to follow this.  They need SEO experts to help them – the trained and experienced professionals of the SEO industry – and this is just the simplest of SEO issues.
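Those length guidelines are mechanical enough to check automatically.  A minimal sketch in Python – the function name is my own invention, and the 65/150 numbers are the guidelines discussed above, not hard limits enforced by any engine:

```python
# Hypothetical helper: flag meta fields that exceed the length
# guidelines discussed above (65 chars for titles, 150 for
# descriptions). These are display/weighting guidelines, not
# limits published by the search engines.
LIMITS = {"title": 65, "description": 150}

def check_meta(tag, text):
    limit = LIMITS[tag]
    if len(text) > limit:
        return f"{tag}: {len(text)} chars, over the {limit}-char guideline"
    return f"{tag}: OK ({len(text)} chars)"

print(check_meta("title", "Why Search Engine Optimization Matters"))
print(check_meta("description", "A response to Derek Powazek's rant "
                 "against the SEO industry, and why good SEO is not "
                 "the same thing as good user experience or web "
                 "development, from a working SEO professional."))
```

A marketer can run something like this over every page; knowing *which* words belong in those characters is the part that takes expertise.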

How about navigation?  If you do not build good navigational elements into deeper areas of the site (especially large sites) that are specifically for search engines, and/or you build them in a way that a search engine can’t follow (e.g. by using JavaScript in the headers or Flash as the sole navigation mechanism throughout the site), then the content won’t get indexed and the searcher won’t find it.  Why are good search-specific navigational elements so important?  It comes back to limited processing power and time.  Each search engine has only so much time and power to crawl the billions of pages on the web – numbers that grow every day, and where existing pages can change not just every day but every minute.  These engines set rules about how much time they will spend crawling a site, and if your site is too hard to crawl or too slow, many pages will not make it into the indices and the searcher, once again, will never find what could be hugely relevant content.
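A toy illustration of why this matters: a text-based crawler can only follow links that actually exist in the HTML it fetches.  This Python sketch uses a deliberately naive regex extractor (real bots parse pages far more carefully) to show a script-generated link simply vanishing:

```python
import re

# Toy illustration: a text-only crawler sees only plain <a href> links.
# Navigation generated by JavaScript (or locked inside Flash) isn't
# present in the fetched HTML, so those pages never get crawled.
html = """
<a href="/products">Products</a>
<script>addNavLink('/deep-page');</script>
"""

crawlable = re.findall(r'<a\s+href="([^"]+)"', html)
print(crawlable)  # ['/products'] -- the script-built link is invisible
```

The `/deep-page` URL is a made-up example; the point is only that whatever navigation lives solely inside the script never reaches the index.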

Do UE designers or web developers understand these rules at a high level?  Many now know not to use JavaScript in the headers, to be careful how they use Flash and, if they do use it in navigation, to provide alternate navigational elements that help the bots crawl the site quickly.  Is this about user experience?  Only indirectly.  It is absolutely positively about search engine optimization, however, and it is absolutely valid in terms of assuring that relevant content gets put in front of a searcher.

Do UE designers or web developers understand the gotchas with these rules?  Unlikely.  Most work in one organization with one site (or a limited number of sites).  They haven’t seen the actual results of good and bad navigation across 20 or 50 or 100 sites and learned from hard experience what is a best practice.  They need an SEO expert, someone from the SEO  industry, to help guide them.  

Now let’s talk about algorithms.  Algorithms, as previously mentioned, are an attempt (and a crude one based on our current understanding of search) at mimicking how searchers (or with personalization a single searcher) think so that searches return relevant results to that searcher.  If you write just for people, and structure your pages just for readers, you are doing your customers a disservice because what a human can understand as relevant and what a search engine can grasp of meaning and relevance are not the same.  You might write great content for people on the site, but if a search engine can’t understand its relevance, a searcher who cares about that content will never find it. 

Does that mean you sacrifice the user experience to poor writing?  Absolutely, positively, without qualification not.  But within the structure of good writing and a good user experience, you can design a page that signals to the search engines, with their limited time and ability to understand content, which keywords are relevant to that page. 

Artificial constraint, you say? How is that different from the constraints I have when trying to get my message across with a good user experience in a data sheet?  How is it different when I have 15 minutes to get a story across in a presentation to my executive staff in a way that is user friendly and clear in its messaging?  Every format, every channel for marketing has constraints.  The marketer’s (not the UE designer’s and not the web developer’s) job is to communicate effectively within those constraints. 

Does a UE designer or the web developer understand how content is weighted to create a ranking score for a specific keyword within a specific search engine?  Do they know how position on the page relates to how the engines consider relevance? Do they understand how page length affects the weighting?  Take this example: I have two pages, and the second contains two exact copies of the content on the first.  Which is more relevant?  From a search engine’s perspective they are equally relevant, but if a search engine just counted all the keyword occurrences, the second page would rank higher.  A fix is needed.

One way that many search engines compensate for page length differences is through something called pivoted document length normalization (write me if you want a further explanation).  How do I know this?  Because I am a search engine professional who spends time every day learning his trade, reading on information retrieval and studying the patents filed by the major search engines to understand how the technology of search can or may be evolving.  Because – since I can’t know exactly what algorithms are currently being used – I run tests on real sites to see the impact of various content elements on ranking.  Because I do competitive analysis on other industry sites to see what legitimate, white hat techniques they have used and content they have created (e.g. videos on a YouTube channel that then point to their main site) to signal the relevance of their content to the search engines. 
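For the curious, here is a rough Python sketch of the idea (the slope and pivot numbers are illustrative only, not what any real engine uses): raw term counting rewards the duplicated page, plain length normalization equalizes the two, and pivoted normalization replaces the divisor with a blend of the document’s length and the collection’s average ("pivot") length.

```python
def score_raw(tf, length):
    # Naive scoring: just count keyword occurrences.
    return tf

def score_normalized(tf, length):
    # Plain length normalization: occurrences per word.
    return tf / length

def score_pivoted(tf, length, pivot=450, slope=0.25):
    # Pivoted normalization: divide by a blend of the collection's
    # average length (the "pivot") and this document's length, so
    # long documents are penalized less harshly than plain
    # normalization would penalize them.
    return tf / ((1 - slope) * pivot + slope * length)

a = (3, 300)   # original page: keyword appears 3 times in 300 words
b = (6, 600)   # the same content pasted in twice
print(score_raw(*a), score_raw(*b))               # duplicate "wins"
print(score_normalized(*a), score_normalized(*b)) # now equal
print(score_pivoted(*a), score_pivoted(*b))
```

The details (actual slope values, how the raw term frequency itself is dampened) vary by engine and change over time; the point is only that the divisor is pivoted rather than being the raw document length.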

And to Derek’s point, what happens when the algorithms change?  Who is there watching the landscape for any change, like an Indian scout in a hunting party looking for the herd of buffalo?  Who can help interpret the change and provide guidance on how to adapt content to maintain the best signals of relevance for a keyword to the search engines?  Derek makes this sound like an impossible task and a lot of hocus-pocus.  It isn’t, on both counts.  Professional SEO consultants do this for their clients all the time by providing good maintenance services.  They help their clients’ content remain relevant, and hopefully ranking high in the SERPs, in the face of constant change.

So to ask again, do UE designers or product managers understand these issues around content?  At some high level they may (a lot don’t).  Do web developers? Maybe, but most don’t because they don’t deal in content – it is just filler that the code has to deal with (it could be lorem ipsum for their purposes).  Do any of these folks in their day-to-day struggles to do their jobs under tight time constraints have the time to spend, as I do, learning and understanding these subtleties or running tests? Absolutely, positively not.  They need an SEO professional to counsel them so that they make the right design, content and development choices.

I’ll stop here.  I pray I’ve made my point calmly and with a reasoned argument.  Please let me know.  I’m not Danny Sullivan, Vanessa Fox, Rand Fishkin, or Stephan Spencer, to name a few of our industry’s leading lights.  I’m just a humble SEO professional who adores his job and wants to help his clients rank well with their relevant business information.  My clients seem to like me and respect what I do, and that gives me an incredible amount of satisfaction and joy. 

I’m sorry, Derek.  I respect your viewpoint, and I know that you truly believe what you are saying.  But as an honest, hard-working SEO professional, I couldn’t disagree with you more.


What is Influence and Why is It?

I was at the Social Media Club – Silicon Valley last night, where there was an excellent session on “What is Influence” with Dominique Lahaix (eCairn), Scott Hirsch (Get Satisfaction), Jennifer Leggio (Fortinet, ZDNet), and Ryan Calo (Stanford CIS), among others.  Great topic, great crowd. 

The first question asked of the panel was “What is influence?” and I am going to weigh in here, because I don’t think that anyone got to the core of what influence is online, how you grow it, and how you maintain it.  This is going to be like a Celtic design – I am going to weave in many topics that, to this point, have been discussed only disparately but which, to my mind, make up the whole quilt of influence online.

Influence can be considered the power to persuade others to some end.  Now you would say, “well, that’s a definition.”  But in the online case, it is a necessary, but not a sufficient, condition for the definition.  In many cases, people influence online with no intention to persuade.  In some cases, as with a search algorithm, influence is created almost by default by the items returned from the search and the sort order in which they appear.  I mean, if this blog appears in position 31 in Google, how much influence can I have?  Or as another example, if the mullahs in Iran wanted to prevent any viewpoints other than their own from being top of mind, they would create a search engine that only returned the results they wanted people to see. 

In order to understand what influence is online, we have to understand WHY it is. 

So the first question: why does anyone bother with social media?  Why spend your time on it?  Why actively participate in it?  The simple reason is that social media is based on gift giving – in this case the gift of information.  In his book, Influence: The Psychology of Persuasion, Robert Cialdini discusses the six “click-whirr” responses which, when triggered, can get you an automatic and predictable response from most people.  The most powerful of these is reciprocity.  Imagine for a moment that I ask you for a recommendation on LinkedIn about my performance at a place where we both worked.  What’s the likelihood you will respond to the request?  If statistics are any guide, about 33%.  If, however, I first give you a recommendation on LinkedIn, the likelihood you will respond to my request is 66%. 

Why the difference?  Simple.  We are genetically wired for cooperation, not competition.  It’s how we survived as small, hairless, vulnerable proto-humans on the savannah against wild beasts and other threats.  If I do something nice for you, you feel almost obligated to do something nice for me.  If you don’t believe this is a genetic trait, then also read Frans de Waal’s Our Inner Ape, where you will see that this behavior is prevalent in chimpanzees (our nearest living cousins) as well. 

So why am I blogging?  Why do I respond to comments on Facebook?  Why do I tweet?  I mean beyond the obvious fact I may enjoy it and I can keep up with my friends.  Why does social media exist at all – why are humans wired in such a way that social media actually works?  At its most fundamental level the answer is gift giving.  People who write provide the gift of information – think about how many tweets include a link to some web page.  So, by necessity, I feel some pull to return the gift and give information back. 

That’s the fundamental mechanism that makes social media work – enjoyment, the need to keep up with people, and an infinite pipe of information are all built on that behavioral foundation.  Without the reciprocity rule around information, social media would be nice, but humans wouldn’t respond to the stimulus.

That is the first extension of the concept of influence.  There is a second underlying mechanism at work – and it shows up clearly in the behavior of people who grew up with the Internet from birth versus those who did not.  This behavior relates to the way the generations learn and collect information.  Those who are (approximately) 40+ have a different mindset.  Their brains are wired (literally) for linear learning.  They read books or articles and went relatively deep into the content.  You may recall the quaint notion of “speed reading” where, in order to take in enough information, people learned to read quickly to garner “the gist” of longer articles or books.  Basically, this was the older generation’s way of dealing with information overload.  An additional technique, which approached the information overload issue by going deeper into fewer sources, was to scan the table of contents of a magazine or book and only read the articles/chapters that seemed relevant. 

The problem is the Internet doesn’t just cause information overload.  It is effectively an infinite source of data.  There is no way any human being could ever in a hundred years find and digest everything they would need to know on the subjects important to them.  Relevant results from search engines help when you want to “go deep” on a single subject, but when it comes to looking widely across all our interests, it is completely impossible to gather a reasonable subset of information by yourself.

So the 20-something generation has learned to use the eyes and ears of their peers to act as search engines for relevant content across the range of their interests.  The stream of information is in multiple parallel threads from numerous sources. We call it multitasking, but in reality it is better called multigrazing.  By staying in touch through social media, the digital generation can consume more information across a broader range of topics than any individual could do alone.  And the relevance is higher because it comes from trusted sources: their friends or people they follow who share their concerns/interests.

So tying this back to the gift of information: the need for better, more efficient means for collecting relevant information combines with the gift-giving nature of social media to create a powerful behavioral motivator for digital learners to participate in social media – and that is the why of influence as it relates to human interactions online.

What about the definition of influence as it relates to machine-based entities like Google?  As mentioned before, search engines are a tool for an individual to cut through the clutter of an infinite multithreaded data stream and find the most relevant “deep links” – published information – on a specific topic.  They can also get relevant information from their social web of contacts (the social web, for short), but that tends to be more random, “shallower”, and less immediate.  This suggests that the search engine, by its very nature, has influence, since its algorithm determines what is useful.  The latest case of this is the recent change in Google’s algorithm to favor big brands (see Google’s Vince Update Produces Big Brand Rankings; Google Calls It A Trust “Change”), which I have been ranting about to anyone who will listen.  Basically, Google has determined that “big brands” are a more authoritative source of information about themselves than third parties.  But is that really true, say, in the case where a company’s product doesn’t work and bloggers are covering the fallout?  How many times have I been hired by Fortune 500s to push unfavorable comments about them off the first page of the SERPs, even though they were factually accurate?  Now I don’t have to work as hard.  Google is doing the work for me.  And I’m sure that the fact that Google is trying to generate more AdWords revenue from big brands has nothing to do with it. 

THAT is the influence of an algorithm.  It is not about the power to persuade, per se.  It is about the power to choose what is relevant to a conversation, based on some programmer’s (or group of programmers’) views of what relevance means.  And no matter how much you can look to the research and say that you are following good information retrieval design that is intended to be value neutral, it is impossible in reality to achieve that.  It’s like asking a human to manually generate a random number – it isn’t possible.  The bias may be consciously or unconsciously embedded in the algorithm, but it is there.

In other words, the search engine is like a trusted person – you can think of it as an avatar – that you also use to help you deal with the infinite information of the web.  Only this trusted source goes deep into a subject with immediacy, rather than helping with your multigrazing.  It is the Internet equivalent of a table of contents, whereas multigrazing is the Internet equivalent of speed reading.  And like a trusted friend, you give the gift of information back to the search engine in the form of your click behavior, which is one of the factors our current generation of search engines uses to determine the relevance of specific documents.  What is not the same in this relationship is that you cannot know what bias is built into the algorithm or how it changes over time, which it does many times without any notification by the major search engines.

So now we can go back and define influence in its online context.  Influence is the ability to share relevant information with others who share a common interest or concern, in the hopes that they will, in turn, give the gift of information back to you. Human or avatar, it is the same definition.  Both humans and search engines have their biases they bring to the conversation.  The only difference is you probably know something about the inherent biases of your friends or trusted human sources, whereas with a search engine you can only infer it, and even then it changes so often that effectively the bias can’t be known.

That’s it.  So what do you think of that logic?  Please tell.


Communication in the Tweet of a Facebook (as captured on 12seconds)

Communication.  It’s something simple to understand.  There is a message, a sender of the message, and a receiver of the message.  If the receiver receives the message the sender sends, that is a communication.  Communication is nothing more than the act of sending and receiving the message.  Nothing more, nothing less?

Well, let’s take some examples.  I am sending you a message:

2380 jsdaop oj sdoppojsd ppjpodsj. 

You just received the message (we hope).  Is that communication?  I certainly doubt you feel it is.  To you and me it’s a garbage string, has no meaning.  We’ve gone through the act of communicating (in a sense), but we have not communicated.

Let’s take another:

Ad praesens ova cras pullis sunt melior

More than likely you do not speak Latin.  So to you, I have not communicated very well.  You can hypothesize from the grammar and word combinations that it is a “true” communication of some kind, but in a language you don’t understand.  If you spoke Latin, you’d know that the phrase I just wrote means “Eggs today are better than chickens tomorrow.”  And if I thought you knew Latin, then that would be the entirety of the message I sent.  But since you don’t speak Latin, you may feel I’m showing off, and the message you received was “I’m superior to you because I can speak an ancient language and you cannot.”  (I certainly hope you don’t feel that – I don’t really speak Latin; I used the Web.)  But I didn’t send anything even vaguely like that message – so where did the message you received come from? 

In this case, the message you “received” was an emergent property of the context of the communication.  The context was the medium of the communication, the situation in which it was sent, and the psychology of the receiver (what I like to call the “communication veil”).  I didn’t send anything vaguely resembling your message, but as my message moved through the communication veil, a substitution occurred in which a new message was created because you didn’t understand the original message, which I assumed you did, and you were therefore able to insert a new message into the “hole” that existed.  You knew I was trying to communicate something, but didn’t know what.  You also assumed I knew you didn’t know Latin.  Through your veil, I was therefore purposely being abstruse.  So you “filled in the gap” as it were with your best guess of what my implied message was.  Unfortunately for me, your best guess was not what I intended.

It often amazes me, given all the communication veils, the unpredictable situations we experience daily, and media that are both noisy and odd at times, that sender and receiver come even close to having a true meeting of the minds. 

So we now have a new medium – a social medium.  It can be written, photographic, audio only, or video and audio.   When it is written, it is between 140 characters (Twitter and SMS) and 450 characters (Facebook), with many sites/services limited to the 250 character level.   Photos usually extend to 4 or 5 in a short space (with the ability to click to a larger page) and audio/video have minimal limits (unless the site, like 12seconds, limits length). 

 Written communication in 140 characters?  What kind of communication is this?  Let’s look at some examples from Twitter:


FacebookSocial Create a profile for your Twitter account! http://bit.ly/4FJRcw

 The first is an ad – and may I add (no pun intended) that we are seeing Twitter become overwhelmed with advertisements.  From my perspective, it is getting to the point where I cannot maintain any kind of “conversation” (back to that term shortly) flow with all the noise.  If something doesn’t happen to limit this, Twitter is quickly going to find itself obsolete.

The second is a personal note.  The third is a share about something of interest to the sender.  The fourth and fifth are an information request I made about whether others were noticing trouble with Twitter, and one person’s response.

What can we say about these messages and communications in this medium? 


The first three messages do not seem to request a response.  They appear to be monologues.  Are they?  It depends on the context of the receiver.  But what is unique about social media is just that fact: it makes no distinction from the sender’s or receivers’ (note the plural) perspective.  Unless I send a direct message, I don’t necessarily expect or require a response.  I don’t even know who is listening, and I don’t care.  It’s as if I were in a cafe with hundreds of people all talking – some to themselves, and some to others – and I’m shouting a message that I hope, but can’t know, that anyone will really hear. 

The receiver’s perspective is similar but opposite.  I’m listening for specific messages among the noise.  I can tell when someone is talking to me (or even about me) because there are visual or other clues (e.g. the @ sign) that allow the volume of those messages to rise above the din.  But there may be other interesting things these monologuers are saying, so I can’t ignore the noise totally.  What I can do is filter for the voices of only those I think I might care to hear.  This improves the signal-to-noise ratio substantially, but doesn’t remove the noise altogether.  It may cause me to miss other messages of interest, as well.

As a receiver, I can choose to send a message back, but it is not required or expected.  That is very different from previous conversational media: when a message was sent, a response was expected.  So, if a message is sent and you can’t know if anyone hears it, are we really engaging in communication?  To a certain extent, this is the online version of a message in a bottle – only the ocean is filled with thousands of bottles.  Not only can I not know if someone will receive my message (even my followers), but with thousands of other messages out there, they may not be able to find it, even if they are looking for it.  Certainly as a sender I am trying to communicate, but on the surface it would appear an awfully inefficient and ineffective mechanism.  We’ll come back to this.

So what are the purposes of these communications?  Why send a message when I can’t be sure anyone will receive it and can’t truly identify who the receiver will be?  We’ll also come back to that in a moment.

On the other hand, the fourth message requests a response – so it is a more traditional form of communication.  Its purpose is to get a broader perspective on an issue important to me.  I’m trying to confirm a fact or a situation and determine whether it is specific to me (a personal problem that I must take action to solve) or whether it impacts the whole community (in which case, others may be acting to solve the problem). It is also probably a timely issue, otherwise I would not necessarily send it via social media.  I say probably because that isn’t necessarily true.  Communication is about habits.  If I am in the habit of communicating mainly through social media, I may choose that as my medium, even though it may not be the most efficient.

The fifth message is a directed response to me.  It is very likely I will receive it, but not guaranteed (if I am away from my desk or phone).  But it is also sent to everyone listening, so the intent has to be to engage a broader group in the conversation.  So in that regard, it has the same basic properties as the first four messages.


What about the content of these communications?  Certainly, they are not rambly blogs on deep subjects that no one will ever read… (I have no self-delusions on that subject).   They cannot and do not contain Ideas – even Matt Cutts’s entry doesn’t contain an Idea – it contains a pointer to something that, to him, contains an important Idea.  But isn’t Danny Sullivan’s comment about the Karate Kid an Idea?  Isn’t it expressing something unique?  The answer is that it expresses a unique opinion, but it does not contain a new Idea – it is instead commenting on the validity of Ideas expressed in a longer and denser medium. 

You cannot express substantive thoughts – big Ideas – in 140 characters, and probably not in 450 characters.  And I am the first to say – and please hear me because I don’t want anyone thinking this is some kind of elitist claptrap – that most communication is NOT about big Ideas; otherwise we wouldn’t identify specialized subsets of humanity as visionaries, philosophers, or gurus.  The point is that social media is not a vehicle, nor is it expected to be one, for conceiving, developing, or communicating Ideas with a capital I.

So what about stories?  Storytelling is deeply ingrained and fundamental to human communication.  In fact, I would argue that almost everything we try to communicate is done in the form of a story.  I used to love saying that where our ancestors once sat around the campfire at night listening to the shaman tell a story about the Gods, today we sit around the projector in a dark room listening to shamans tell us about the Gods of profit and technical wizardry.

So does social media contain stories?

We’ll leave that to tomorrow’s entry.   


Tweet verus est rara avis – A true Tweet is a rare bird

(OK, NOW I’m showing off)


PostHeaderIcon Google's Orion and Vincent

Well, even as I really want to write about Twitter and the English language, along comes Google with a new update.  Given the nature of social media, timeliness comes before etymological Godliness (when will you ever see those two words combined in a blog… I think I deserve an award for that one).  Therefore, like any young techie in Spring, I turn my thoghtes (thank you, Mr. Chaucer) and my feet towards a pilgrimage to the Googleplex.

But of course, I don’t want to ignore the previous Vincent update – as that was the connection to post #1.

Orion first.  Actually, Google did not announce “Orion” by name – Orion is a search technology it purchased in 2006, along with its college-student developer, Ori Allon.  But my guess is that, thanks to Greg Sterling’s new article containing that title, the term “Orion Release” will stick.  Here’s how Danny Sullivan described the technology back in April 2006:

It sounds like Allon mainly developed an algorithm useful in pulling out better summaries of web pages. In other words, if you did a search, you’d be likely to get back extracted sections of pages most relevant to your query.

Ori himself wrote the following in his press release:

Orion finds pages where the content is about a topic strongly related to the key word. It then returns a section of the page, and lists other topics related to the key word so the user can pick the most relevant.
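The passage-selection behavior both quotes describe can be sketched roughly as follows.  This is purely illustrative – the actual Orion algorithm was never published – and the scoring here is simple query-term overlap, certainly far cruder than the real thing:

```python
def best_section(page_text: str, query: str) -> str:
    """Return the section of a page most relevant to the query.

    Toy scoring: count how many query terms each paragraph shares
    with the query, and return the highest-scoring paragraph.
    """
    query_terms = set(query.lower().split())
    # Treat blank-line-separated paragraphs as the candidate sections.
    sections = [s.strip() for s in page_text.split("\n\n") if s.strip()]
    return max(sections, key=lambda s: len(query_terms & set(s.lower().split())))


page = (
    "Our site covers many topics.\n\n"
    "Travel guides for northern France and the Loire valley.\n\n"
    "Contact us for advertising."
)
print(best_section(page, "france travel guides"))
# Travel guides for northern France and the Loire valley.
```

The point of the sketch is only the shape of the idea: rather than ranking whole pages, pull out the section most relevant to the query and show that to the user.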

Google actually announced two changes:

Longer Snippets.  When users input queries of more than three words, Google results will now contain more lines of text in order to provide more information and context.  As a reminder, a snippet is a search result that starts with a dark blue title and is followed by a few lines of text.  Google’s research must have shown that regular-length snippets were not giving searchers enough information to form a clear preference for a result based on their longer search term – Google’s stated intent is to provide enhanced information that improves the searcher’s ability to determine the relevance of items listed in the SERPs.
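The announced rule is simple enough to state in a few lines of code.  To be clear, this is my reading of the announcement, not Google’s implementation – the three-word threshold comes from the announcement, but the line counts are made-up placeholders:

```python
def snippet_lines(query: str, short_lines: int = 2, long_lines: int = 4) -> int:
    """Return how many lines of text a result snippet should show.

    Per the announcement: queries of more than three words get a
    longer snippet; shorter queries keep the regular length.
    The specific line counts here are illustrative assumptions.
    """
    word_count = len(query.split())
    return long_lines if word_count > 3 else short_lines


print(snippet_lines("france travel"))                             # 2
print(snippet_lines("france travel guides for northern france"))  # 4
```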

Having said this, I don’t see any difference.  My slav…. I mean my 12-yo son (who has been doing keyword analysis since he was 10, so no slouch at this) ran ten tests on Google to see if we could find a difference (I won’t detail all the one- and two- vs. 3+ word combinations we tried – if you want the list, leave a comment or send a tweet to arthurofsun and I will forward it to you).  But shown below are the results for “France Travel” vs. “France Travel Guides for Northern France”:

Comparison of Two-Word and 3+ Word Search in Google Orion Release


As you can see, there is absolutely no difference in snippet length between the two searches – and this was universally true across all the searches we ran.  So I’m not sure what changed – I wonder if Ori Allon, who wrote the post, could help us out on this one.

Also, I am somewhat confused.  If you type in more keywords, the search engine has more information by which to determine the relevance of a result.  So why would I need more information there?  Where I need more information is in a search of three or fewer keywords, which will return a broad set of results that I will need to filter based on the information contained in a longer snippet.

Enhanced Search Associations.  The bigger enhancement – and the one that seems most likely to derive from the original Orion technology – is enhanced associations between keywords.  Basically, if you type in a keyword – Ori uses the example “principles of physics” – then the new algorithms understand that there are other ideas related to it that I may be interested in, like “Big Bang” or “Special Relativity.”  The way Google has implemented this is to put a set of related keywords at the bottom of the first SERP, which you may click on.  When you click, it returns a new set of search results based on the keyword you clicked.  Why at the bottom of the first SERP?  My hypothesis is that if searchers have reached the bottom of the page, they haven’t found what they are looking for.  So this is the right place in the user experience to prompt them with related keywords that they may find more relevant to the content they are seeking.
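One plausible way to derive such associations – purely a guess at the mechanism, since Google has not published it – is to count which terms co-occur with the query across a corpus of documents and surface the most frequent ones:

```python
from collections import Counter

# Toy corpus: each "document" is the set of topics it mentions.
# Both the corpus and the topic labels are invented for illustration.
docs = [
    {"principles of physics", "big bang", "special relativity"},
    {"principles of physics", "big bang", "quantum mechanics"},
    {"principles of physics", "special relativity"},
]


def related_terms(query: str, docs, top_n: int = 3):
    """Return terms that most often co-occur with the query."""
    counts = Counter()
    for topics in docs:
        if query in topics:
            counts.update(t for t in topics if t != query)
    # Sort by co-occurrence count (descending), then alphabetically
    # so that ties come out in a stable order.
    return sorted(counts, key=lambda t: (-counts[t], t))[:top_n]


print(related_terms("principles of physics", docs))
# ['big bang', 'special relativity', 'quantum mechanics']
```

However the real system works, the user-facing effect is the same: a handful of related queries, ranked by some strength-of-association measure, offered as one-click refinements.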

From my perspective, this feels like the “People who liked this item also bought…” widget on most comparison shopping sites (which I know something about, having been the head of marketing for SHOP.COM.)  I’m not saying there is anything wrong with this – I’m just trying to make an analogy to the type of user experience Google is trying to create.

Shown below is an example of enhanced search associations from a search on the broad term “credit derivatives in the USA”:

End of First SERP Showing Google's New Enhanced Association Results

As I expected, the term “credit default swaps” – the major form of credit derivative – shows up as an associated keyword.  What I do not see in the list – and this surprised me – is any reference to the International Swaps and Derivatives Association (ISDA), which is the organization that has developed the standards and rules by which most derivatives are created.  It does, however, show up for the search on the keyword “credit default swap.”  I’d be curious to understand just how the algorithm has been tuned to make trade-offs between broad concepts (i.e., credit derivatives, which is a category) and very focused concepts (i.e., credit default swap, which is a specific product).  Maybe I can get Ori to opine on that as well, but most likely that comes under the category of secret sauce.

Anyway, fascinating and it certainly shows that Google continues to evolve the state of IR. 

Well, I’ll just have to leave the Vincent release until tomorrow.  Something else happened this morning I need to do a quick entry about.  Sigh…..
