.:Sixorg<br /><br />And...We're Back!<br />Posted 2007-11-11 by R.J. Pittman<br /><br />This is what happens when you take your eye off the <a href="http://www.netsol.com/">Netsol</a> ball. Sixorg mysteriously disappeared from the web, and rather abruptly, when there was no gas left in the tank over at the world's most expensive domain registrar. I'm thinking of switching my portfolio of domains over to a cheaper service soon. Anyway, it's good to be back online and in the flow...I've got a few posts in the pipeline now, covering some very clever search technologies that are worth a closer look. Stay tuned for those later this week and next.<br /><br />In the meantime, I've solved your <a href="http://www.uncrate.com/men/culture/food/dean-deluca-turkey-roulade/">Thanksgiving holiday hassle!</a> This is all you need to impress your family and the parental units... Check it out during the commercial break, while I get my site back in full working order. More to come this holiday season! ;)<br /><br />Being Verity in a Google World<br />Posted 2007-08-20 by R.J. Pittman<br /><br /><span class="Apple-style-span" style="color: rgb(102, 102, 102);">"Real Source Content / Result Federation is alive and well"<br /><span style="font-style: italic;">- J.W. Lehman, founder of Verity</span></span><br /><br />I would be remiss in not taking the opportunity to respond to an interesting comment posted to my blog back in June on Federated Search. The response came from a founder of Verity, a leading enterprise search vendor acquired by Autonomy. 
The comments revitalize the debate surrounding the evolution of information retrieval and the evolution of information storage. To best clarify my position, and provide a rebuttal to the points raised by J.W., I'll provide my comments in-line and in bold following the poster's comments. The debate is on, right after the jump!<br /><span class="fullpost"><br /><span class="Apple-style-span" style="color: rgb(153, 153, 153);"> From J.W. Lehman,<br />founder of Verity<br /><br />Real Source Content / Result Federation is alive and well<br /><br />“Old Federated searchers never die, they just become…..”<br />anon<br /><br />1. The poster hasn’t a clue about the purpose of federated search in information retrieval / research. Should “federated search” take the blame for slow/poor collection access? Of course not. Federated search is NOT, as the poster claims, an “interactive” single collection search mechanism, ala google, verity or any like…it’s a “watcher-monitor” of what is going on in the info-world in specific subject areas. If the poster told his enterprise customers they were getting google-for-the-deep-web, the poster just didn’t understand their requirements….typical for IR technology vendors, and VCs. Who cares if the answer takes 5 minutes or 5 hours? The purpose of federated search is sending-alerting new relevant material as it’s generated. Federated search is a very powerful, and quick, research assistant WHEN IT IS APPLIED PROPERLY.</span><br /><br />Well, I think the opening paragraph pretty much says it all: "Who cares if the answer takes 5 minutes, or 5 hours?" Who indeed...hmmm. How about everyone. Everyone who's grown up with today's superior web search engines at their disposal. I would love to take a poll to see how many people would be willing to wait 5 minutes much less 5 hours for ANY form of search. 
But let us carry on.<br /><br />We must first correct the poster's bold premise because federated search clearly does not belong in the 'watcher/monitor' category. Watcher/monitors have a very distinct membership that is quite different from federated search. RSS/Atom feed readers, dashboards, and RSS aggregators such as iGoogle, NetVibes, NewsGator, Bloglines, and OriginalSignal, are watcher/monitors. They are not federated search players whatsoever. Search is pull not push. It is active not passive. Feed aggregation is passive, and it pushes. Apples and oranges here.<br /><br /><br /><span class="Apple-style-span" style="color: rgb(153, 153, 153);"> Federated search supports COMMUNITIES OF INTEREST by replacing the incredibly complex need to individually access and merge content from all appropriate sources in the search for answers (regardless of their “fun-ness” to access), with a process that does it on command.<br /></span><br />J.W. obviously hasn't read my other posting on Yahoo Pipes and Google CSE. I offer this up as prerequisite reading material before claiming that big search engines can't address 'communities of interest' in a far easier and more powerful way.<br /><br /><span class="Apple-style-span" style="color: rgb(153, 153, 153);"> If the user can’t wait 5 minutes or 5 whatevers for results that he/she couldn’t obtain in 5 weeks-months of manual effort, then the sources themselves must be unnecessary.<br /></span><br />This remark invokes the proverbial 'wake up and smell the coffee' response. Every search engine in existence has invested millions in R & D and usability studies to unanimously confirm and conclude that speed matters. And it doesn't just matter, it is vital to achieving widespread adoption and utility-- it is vital to survival. It is often the difference between #1 in the industry and #100.<br /><br />See <a href="http://blogs.zdnet.com/BTL/?p=3925">this link</a> for empirical evidence to this point. 
You'll find that user adoption and user satisfaction are of paramount importance to the search experience. A 500ms delay in response time results in millions of abandoned searches and unsatisfied users. Can you imagine what would happen if these users had to wait 100 times as long? That would be 50 seconds. How about, as J.W. suggests, 1000 times as long? That would be 500 seconds, more than 8 minutes. I think we can all predict the outcome.<br /><br /><br /><span class="Apple-style-span" style="color: rgb(153, 153, 153);">The poster, and most of the rest of us, have fallen under the google-spell that time to first result and time-to-answer are the same. Not! How long does it take to find the fact/assumption/relationship in google/convera/verity/zylab/inxight result # 870? We’ll Never Find It, because we gave up after result 25.</span><br /><br />This is no spell. This is reality. The world has evolved, people. The majority of web surfers are Gen-Y, Z, and Millennials, plus a few of us Gen-X'ers. Most were born into this world with a cell phone in hand, and broadband and Wi-Fi everywhere. The expectation of always-on, instant gratification, and real-time computing convenience is not a nice-to-have in today's world; it is now merely an assumed, necessary requirement. And they are the best and brightest generations of our time.<br /><br /><br /><span class="Apple-style-span" style="color: rgb(153, 153, 153);"> 2. “keyword” search? What century is the poster from? If you can’t explore content via explicit taxonomies with the searchrules to back them up, of course you’re going to get poor, mixed up results. [and not only is clustering is dead, dead, dead…, it was never alive!]</span><br /><br />We do agree on one point above-- clustering is not ready for prime time. Beyond that, perhaps our differences are simply generational. I am part of the Internet generation, and not a day earlier. Let's be real, folks: keyword search works, and it works really, really well. 
It is indisputably the fastest, most popular, and most <span style="font-style: italic;">effective</span> universal mechanism for finding information today.<br /><br />Today's keyword search engines are anything but just keywords. But my discussion is not (and has not ever been) about keyword searching. It is about federated search, and its shortcomings, and why we must index everything. But for the sake of discussion here's my quick take on the state of keyword search technology: Today's 'keyword search interpretation' technologies are more intelligent, proactive, interpretative, interpolative, and extrapolative than ever before. They are capable of much more than meets the eye. But that is the point: to keep it simple for the user, to appear as if the system is 'idiot proof' and that all it takes is a few simple keywords and magic happens. This is increasingly becoming the case today. There is more to do, that is certain. However, keyword search is still by far the most effective input mechanism for matching information with your intent, even if you aren't fully aware of your intent nor fully knowledgeable on the subject you pursue. See an upcoming post titled: "Browsing the Web for Knowledge Using Keyword Search."<br /><br />The industry deadpool is full of vendors that once hawked taxonomies, directories, and other structured content browsers. Taxonomies are great for very specialized collections of content, but they totally implode when mashed together by a federated search engine and 10 other content sources with totally different ontologies, categories, and metadata. It just doesn't work when blended together from completely different sources.<br /><br /><br /><span class="Apple-style-span" style="color: rgb(153, 153, 153);"> Index everything!!!!!!!!! Why bother? 
Keyword search will give you the same mess on an indexed collection…actually worse, because it’s only the rare and to-date, unpopular engine that recognized the presence of evidence at the meaningful text unit (i.e. paragraph) level….so instead of federated search telling you your “KEY-WORD” is actually in the title/snippet/abstract, you now get to discover the 1000x list of content where it’s anywhere in the full-text. What an advancement!</span><br /><br />Why bother, hmmm...why indeed... Well, let's see... the last time someone got the idea to do this the right way, out popped a couple of life-changing web companies with worldwide adoption and sustained valuations in the tens and hundreds of billions of dollars. <div> </div><div><br />But here's a better reason: It just plain works.<br /><br /></div><div> </div><div> <div> </div><div>The real problem here is that my counterpart is mixing metaphors for comparison's sake by effectively equating federated search with concept search, and earlier with watcher/monitors, both of which are false equations. I'm not comparing methods of retrieval. I am focused on the virtues of storing all content in a single index. And just because we've indexed everything into a single source does not mean that we are limited to mere keyword searching for information retrieval.<br /><br />Every federated search engine, including Verity, when plugged into multiple sources for keyword searching does at least this much: pass the keyword queries to each content source wired to the federated search, and get results back from each, the keyword way. We know there are many other ways to retrieve content from a source, but this topic is and has always been about federated searching, not federated browsing, nor conceptual matching. All of which can still be done better with a single index of content anyway.<br /><br /><br /><span style="color: rgb(153, 153, 153);"> 3. 
Result Federation…..The ability to de-dupe, de-mystify and normalize results from multiple relevancy determination techniques has been available for years…where have you been? All that’s necessary is to make a practical relevance determination of each result based upon the search request; and order it.</span><br /><br />Regarding the existence of de-duping, etc., I distinctly don't recall saying anything to the contrary. I merely maintain that the implementations to date do not work very well. Not one federated search engine can possibly make a reliable relevance determination based on the search query for one simple reason: it is not up to the federated engine to decide! The results that come in from each disparate content source are determined by the ranking and relevancy engine of each source's proprietary algorithm. Thus, even if the federated engine could magically infer the inter-source ranking with some degree of usefulness (though doubtful), the net results would only be as good as the worst ranking algo from the worst content source. Let's look at a simple illustration to clarify, shall we:<br /><br />Step 1: Example query: nanotechnology fabrication<br /><br />Step 2: Sources 1-5 are selected to 'federate' - assume sources 3-5 have terrible ranking engines<br /><br />Step 3: The above keywords (yes, keywords, J.W.) are passed to each source's query engine<br /><br />Step 4: The "top ten" results are returned from each source's relevancy engine<br /><br />Step 5: The 50 results are somehow re-ranked based on the nature of the query? I'd like to see that. Especially since the results returned are merely title, snippet, URL, and NOT full-text. 
As is the case with every standard enterprise and web search engine index.<br /><br />Step 6: Regardless, the poorly ranked documents from sources 3-5 make it impossible to unify the ranking in anything but a largely arbitrary way, giving the results list only arbitrary credibility.<br /><br />Step 7: Because the federation technology has no way to evaluate how well a given source is ranking its own documents, it is impossible to establish a consistently high quality set of ordered results, using this antiquated yet widely suggested way of federating.<br /><br /><br /><span style="color: rgb(153, 153, 153);"> 4. In any subject, google-yahoo-ms-altavista-etc, lets you find out what everyone</span><br /><span style="color: rgb(153, 153, 153);"> else already knows…..the ability to find out what nobody else knows/surmises is</span><br /><span style="color: rgb(153, 153, 153);"> virtually denied.</span><br /><br />This belief makes one heck of a gross assumption as to the way in which any of the aforementioned engines employ page ranking. Discovery is purely a function of the nature of the access methods to the information source, all other things being equal. With a single index of content I can create discovery, knowledge, connectedness, and relatedness of concepts, sentences, subjects, and more without the need for federating a single thing. It was called Grokker 2.3 Desktop for Google, back in 2004. 
Today it's called <a href="http://www.google.com/coop/cse/">Google CSE</a> for a single source, and for multiple sources it's called Yahoo Pipes.<br /><br /><br /><span style="color: rgb(153, 153, 153);">That is what federated search is for … multi-disciplined</span><br /><span style="color: rgb(153, 153, 153);"> communities of interest seeking answers to advance knowledge, as opposed to</span><br /><span style="color: rgb(153, 153, 153);"> wikipedias-google results.<br /><br />June 11, 2007 3:35 PM</span><br /><br />Federated search as it exists today is not a social medium, and it was never intended to be. Collaborative filtering and collective intelligence, on the other hand, are the future today. Has someone slept through the web2.0 phenom? digg, delicious, feedburner, flickr, Wize, Yelp, Google Reader, iGoogle. Web 2.0 companies have already categorically taken this aging notion of 'communities of interest' via metasearch tools and turned it upside down-- and actually made it work for the first time. And while all of these new web services aggregate content from a huge multiple of sources, they are not federated search engines in any sense of the word, as I have described in all of my postings.<br /><br />What's more, equating or limiting the definition of federated search to apply only to research/enterprise content versus searching public WWW content is a significant misnomer.<br /><br />For if the best of today's web search engines were to index ALL of the available high quality, structured enterprise/research content behind the firewall (which now a few of them are doing, btw), I could then profess the end of old-school federated search, which has plagued enterprises, universities, and the world at large for over a decade now. 
Giving way to entirely new ways of federating, classifying, categorizing content-- but from a universal index of content with standardized metadata and shared ranking algorithms.<br /><br />So my position remains unchanged, if not reinforced. The doctor has checked the patient for a pulse, and she's still dead as a doornail. Good night and goodbye, my dear federator...<br /></div></div></span><br /><br />Powerset's Q & A vs. Keyword Search<br />Posted 2007-07-06 by R.J. Pittman<br /><br />Powerset would likely be the first to promote Natural Language Processing, NLP, as the future of search. Their recent <a href="http://blog.powerset.com/">blog post</a> provokes a few interesting debates about the premise of their approach to improving Web search as we know it today. In theory, natural language processing is a very attractive method of <a href="http://en.wikipedia.org/wiki/Human-computer_interaction" title="Human-computer interaction">human-computer interaction</a>. In practice, it still has its limitations.<br /><br />English is particularly challenging in this regard because it has little <a href="http://en.wikipedia.org/wiki/Inflectional_morphology" title="Inflectional morphology">inflectional morphology</a> to distinguish between parts of speech. Wikipedia has a simple little example to illustrate this point:<br /><br />English and several other languages don't specify which word an adjective applies to. For example, take the string "pretty little girls' school". <ul><li>Does the school look little?</li><li>Do the girls look little?</li><li>Do the girls look pretty?</li><li>Does the school look pretty?</li></ul>This language code is very tricky to decipher with the high degree of accuracy and consistency necessary to provide an acceptable user experience. 
The full story after the jump...<br /><span class="fullpost"><br />Powerset will attempt to solve this problem with NLP and the creation of what must be an insanely massive library of ontologies in an attempt to contextualize all of the Web. A bold undertaking indeed. But let's set aside their pending solution and look at the potential impact on the user experience that an NLP-based system would introduce. NLP works best with well-formed questions, phrases, and 'contextual' descriptions. You'd be hard pressed to find NLP making improvements to results returned for some types of typical queries such as: "weather 94107" or "paris hilton" or "the police concert tour dates"<br /><br />So the question becomes this: What percentage of all Web searches would truly benefit from NLP-style queries? Is it enough to make it universal or stand on its own? Or is it better served as an enhancement or feature add-on to existing web search offerings? Methinks it is the latter. Feature, product, business. Remember the FPB test. All technologies and ideas fall into one of the three.<br /><br />NLP prefers the user to formulate semi-structured sentences to produce the best or most noticeably improved results when compared to traditional keyword searches. As stated above, this can be very handy for <span style="font-style: italic;">certain types</span> of searches, without question. But what happens if your sentence is poorly written? What if your English, French, or Spanish language skills are not up to par? What if you are unfamiliar with the host's ontologies and vocabularies for a new research topic you want to explore? Can NLP produce <span style="font-style: italic;">better</span> results in the absence of accurate or sufficient natural language input? And what of the content being retrieved? What if it too is miscategorized, or poorly structured text?<br /><br />A common solution is: categorization, classification, and taxonomic organization of content. 
Another is to predetermine a vocabulary for a given topic of information. Ontologies, as they are better or lesser known, for any genre of information, be it politics, sports, or nanotechnology, are thereby subject to the vast interpretation of the authors that create them. These authors assign meaning in ways that could be interpreted much differently from how other people, cultures, and languages understand them to be. This could create incongruence between the question and the answer, er...between the query and the results.<br /><br />Another interesting data point to bear in mind: Web searchers today are actually quite efficient and effective with keyword searching, enhanced further by increasing fluency with boolean and other advanced search operators. As such, keyword searching is often (but not always) hyper-efficient at getting the user precisely what they are looking for. Let us also remember that "keyword search" per se, doesn't necessarily equate to "keyword matching" as the sole or even primary means by which related content is returned from a traditional Web search index. Today's top search engine algorithms are far more complex than simple keyword matching, counting, and/or extraction. In fact, some components of page ranking, relevance, and ordering of results pages are language/text independent. Rather, they rely on the organic substructure of the Web, and its interconnections between information that help to paint the picture of related or important subject matter. This helps tremendously in dealing with the Wild, Wild Web that is fraught with unstructured text, errors in spelling, inconsistent or incomplete grammar and the like found in millions of web pages around the world.<br /><br />A lot of people in the industry like to assert that "not much has changed with web search over the past several years", which couldn't be further from the truth. The major search engines are enhancing their core search algos multiple times per week, in fact. 
The problem is that they (non-search experts, journalists, analysts) base their assessments on what they read or don't read about search in the press. Alternatively, they (new search upstarts and old dying breeds in search and enterprise search) are simply in denial, and keep telling themselves that search hasn't changed to help justify a withering existence.<br /><br />But do not fret (too much anyway), all is not lost. I do believe there are a few definitive paths to success in the web search industry for new companies with the right idea-- but only those that come prepared with their eyes wide open, and a very realistic view of where search truly is today, and where the world of end-users is influencing it from here. Without an accurate view, let's be honest, they're pretty much dead. <br /><br />As for Powerset, I have to believe they've embraced this exercise, but only time will tell. Let's see how they debut later this year.<br /></span><br /><br />iSearch, uSearch<br />Posted 2007-06-07 by R.J. Pittman<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp1.blogger.com/_U37bWzxTSko/RmkR27PnnkI/AAAAAAAAANs/T4-7yzplyls/s1600-h/sixorg_shanghai.jpg"><img style="margin: 0pt 10px 10px 0pt; float: right; cursor: pointer;" src="http://bp1.blogger.com/_U37bWzxTSko/RmkR27PnnkI/AAAAAAAAANs/T4-7yzplyls/s200/sixorg_shanghai.jpg" alt="" id="BLOGGER_PHOTO_ID_5073606090523385410" border="0" /></a>Just returned from a month in Asia and Australia (inset pix of Shanghai nightlife). Fascinating centers of innovation from Tokyo to Singapore to Sydney. All have some interesting twists to next-gen Web applications and search. But that's for another time, another post. Today it's time we revisit the death of federated search (aka metasearch, single search, etc.) 
as we know it, and share a glimpse of what the future holds for finally solving the very elusive problem of getting at all of our information as easily as we should be able to. Ok, just a moment...ok, yep just checked, and it's still dead, dead as a doornail.<br /><span class="fullpost"><br />My friends and colleagues at Stanford University library and info-sciences department have been researching this problem head-on with over 700 databases of searchable research and academic content at their disposal. They are not alone. Countless universities, companies, and web services at large have found themselves at the end of the same dead-end road.<br /><br />Single search doesn't work, nor does traditional metasearch, or any other twists on federated search. Clustering metasearched results from multiple sources into artificial categories or groups only exacerbates the problem, and thoroughly confuses the end user. (Sidebar: clustering has a very long way to go before it is anywhere near ready for prime time public consumption. Until then, it has no business being in search.) These approaches have, in fact, proven pointless and only further delay any attempts to arrive at an acceptable user experience for effectively accessing a multitude of content sources simultaneously. I would go as far as saying that these so-called solutions are robbing these important customers of their youth. Costing not just hundreds of thousands in license fees, but years of setbacks and distractions dealing with totally ineffective solutions. Everybody seems to have an <a href="http://images.google.com/images?um=1&tab=wi&amp;hl=en&q=federated%20search">angle</a> that to me is nothing short of amusing. You'll notice through that link several spins on the same broken solution. I reviewed everything listed in those results. RIP.<br /><br />There is a reason why basic search remains so widely popular, effective, and accepted by the vast majority of info seekers. Because it works. 
Because it is simple and intuitive. People get it. What people don't get are kludgy attempts to mash a bunch of square pegs into a round hole. If you look at the quality of search results from any of the tens or hundreds of enterprise search vendors, metasearch peddlers, and then, say, Google, what you'll find might surprise you. Or maybe it won't. Yes, obviously Google.com works, and Google Search Appliance is no different. GSA stuck to its roots from Google.com for a reason: simple and intuitive user experience and high quality results-- from ONE source. Today GSA can crawl and index virtually any type of info object or database in existence. Why bother promoting new content in separate databases? This only adds to the problem. And with Google OneBox, we go even further, wiring competing content management systems to a better Google-controlled search experience.<br /><br />So just what am I getting at? No, I'm not pimping Google's wares, but I am using them as one of only a few early examples of how to correctly begin to approach this problem. The answer is simple. One source. One index. One search interface. The fact that 700 databases sit in front of the info seeker is the real problem. There is no cohesive data model to support any meaningful metasearch whatsoever. "Normalizing" the boolean structure of the query language for each source's retrieval method was thought to 'standardize' the results that come back from all these random content sources. Not so. For it is not the query that matters; rather, it is how the content is indexed. Just because the genre or subject nature of two content databases appears to be 'related' does not imply that the returned results will be the best combination of the two sources. Why? 
Because they have completely independent relational structures, metadata schemas, and ontologies.<br /><br />Federated search, as we knew it before it died, did nothing more than mask this problem with a bland search interface wrapped around a broken and discontinuous distributed data model. Despite the cold reality, many of you still employ this type of solution at an increasing cost to your company and to your users' productivity.<br /><br />But let's get back to the answer. Google introduced Universal Search, after quietly testing the concept under an alias website: <a href="http://www.searchmash.com/">searchmash.com</a>. Yep, they really did. Universal Search is not there yet, but it is a move in the right direction. Yes, even Google faced a minor federation/metasearch problem as they continued to grow laterally into new content categories, e.g. News, Photos, Videos, Blogs, Products, Scholar, etc... As a result, it became increasingly unclear whether Google.com was the right place to start a search with so many alternate entry points that may be more appropriate for certain searches, e.g.: blogsearch.google.com, or news.google.com, and many more.<br /><br />Universal Search is an early attempt to give the user a little taste of everything: pictures, videos, blogs, news, and web search results in one result page. Check out this basic example here for <a href="http://www.google.com/search?hl=en&amp;q=+steve+jobs&btnG=Search">Steve Jobs</a>. You get what I'm saying. Now, this doesn't exactly scale if you have 20, 30, or 700 types of content, or content sources to display on a page. They simply wouldn't fit. Additionally, Universal Search is more about displaying content of different types or formats versus merely different sources of content. For example, web pages, news articles, pictures, and videos are all very different types of content. I have designed two unique ways to address this problem, following some of the principles of Universal Search. 
Enter Integrated Search.<br /><br />The integration of content sources is where we begin. The devil is most certainly in the details for this design and implementation, but here is the gist:<br /><br />Recipe for Integrated Search<br /><br />Ingredients<br /><br /><span style="font-style: italic;">n </span>parts of unique content sources<br />1 part really nice crawler/indexer (<a href="http://nutch.org/">Nutch</a>, GSA, or Lucene)<br />1 part high quality query interface with boolean translators, NLP, and auto completion and suggestion. (See <a href="http://citeseer.ist.psu.edu/cs">CiteSeer</a> or ACM for several)<br /><br />Frappé all ingredients until smooth. Let stand and cool for 10 minutes.<br />Season to taste with one or both of the following:<br /><br />1 search index inverter (yes, the secret sauce)<br />A dash of user intent interpolation at the point of query<br /><br /><br />This solves 3 problems at once. A single index, so that no sources need be considered at query time, ever. Smart pre-query processing to help guide the search query to match the users' intent. (We'll discuss intent-driven searches, or lack thereof, in an upcoming post.) And a powerful index/ranker to ensure that every content object in the index, from every original source is uniformly considered when ordering and displaying the results that best match the query.<br /><br />This is NOT the case with traditional federators, which do nothing more than combine search results from hundreds of different indexing methodologies, with absolutely no way to 'honestly' or intelligently rank and order results that come from different indexers and ranking algos.<br /><br />So even without revealing the secret sauce, you can see how this approach is fast, simple, and aligned with traditional search user experiences. The hard part? 
Crawling all the content sources means writing system adapters to connect to the weirdest of old-school flat-file DBs, obscure object databases, and a whole lot worse. But if you pick a good crawler or general search product, much of that hacking has been done for you, as with Google's Search Appliance and their <a href="http://www.google.com/enterprise/gsa/features.html">220+ adapters</a> that work pretty well out of the Box, pun intended.<br /><br />So about that secret sauce? Well, with a good inference about the user's intent we can bias the search results to better cater to the user's objective. And as for index inverting, it's really about inverting the results that come from the index, for a given query. Ever curious what results actually appear at the <span style="font-style: italic;">end</span> of a big web search with 5,400,000 results? How about dead middle of those 5.4 mil? Curious, aren't we? Yes, it's all about discovery, and those deeper results can be more useful than you might think.<br /><br />As screen real estate continues to increase on the desktop/laptop, we'll no doubt continue to see search results get 'fatter' as in wider across the page. Yes, two and three column search results are on the way. And wait till you see where the ads turn up. For search it's just the beginning. For federated search, well maybe we'll call it a new beginning. But for them, this means starting over. Completely.<br /><br />So far I've yet to see any legitimate newcomers enter the arena to take up this challenge/opportunity head-on. In the meantime, partial solutions are manifesting within Web search while Google, Yahoo, and Ask continue to advance some good ideas in this arena. 
Yes, even Ask has been doing 'Unified' Search on their home page for a while now, and it's actually a reasonably clean UI...try out this query: <a href="http://www.ask.com/web?qsrc=167&q=iphone&amp;search=search">iPhone</a> be sure to stretch your browser as wide as it will go...not bad.<br /><br />Integrated Search, <span style="font-style: italic;">i</span>Search. Coming to a theater near you? We'll soon find out...<br /></span>R.J. Pittmanhttp://www.blogger.com/profile/14408821407918515948noreply@blogger.comtag:blogger.com,1999:blog-3945587351669674556.post-84492995126491681932007-04-27T10:13:00.000-07:002007-04-27T11:10:54.249-07:00The Graphical Information Interface - Made Simple<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp2.blogger.com/_U37bWzxTSko/RjI2-q1gLtI/AAAAAAAAANc/eTo2VhQdAFY/s1600-h/indexed.jpg"><img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://bp2.blogger.com/_U37bWzxTSko/RjI2-q1gLtI/AAAAAAAAANc/eTo2VhQdAFY/s200/indexed.jpg" alt="" id="BLOGGER_PHOTO_ID_5058165781768974034" border="0" /></a>We pioneered them, and coined the 'GII' at Groxis back in 2001. The significance of new interfaces to information transcends that of any one company and any one particular problem to be solved. The sheer mass of information that we now process on any given day is orders of magnitude greater than ever before. As such, our means by which to effectively absorb, leverage, and filter information flows (or floods) must also evolve. Enter the GII. We created Grokker with the idea that information needed to be liberated from the confines of HTML and the web browser as we knew it years ago. It was our "1.0" attempt to create a universal information currency with the capacity to convey far more usefulness than a list of 10 search results on a web page. 
Grokker sparked a small but potent movement along these lines, paving the way for new approaches to opening the information bottleneck at the point of consumption. Many great advances have emerged, showing promise for the future of information experiences to come.<br /><br />Sixorg is all about information, search, and the like. As such, you can expect to find many cool, new examples of the GII that fit the bill in future posts. Today I'm sharing a 'blog find' that ranks high on the list for one of my good friends, and is now high on my list. It's called <a href="http://indexed.blogspot.com/">Indexed</a> by Jessica Hagy. In a word, brilliant. A witty and creative smattering of hand drawn Venn diagrams, scatter graphs, and more, representing mathematical theorem proofs of everyday life- infoviz style. The clever and entertaining diagrams and graphs convey information in spades- further proving another theorem that a picture is worth a thousand words. Indexed is a simple but great example of GII's in action. There is no reason to explain it further, because the examples so clearly speak for themselves. 'Nuff said.<br /><br />Take a look, bookmark Indexed. It's fresh air for your creative mind. Thanks to my friend and colleague Rosie for the link. Rosie doesn't have a blog, but should. Girl's got skills. Maybe this will kick her in gear to get her game on! More to come. Stay tuned...<br>R.J. 
Pittmanhttp://www.blogger.com/profile/14408821407918515948noreply@blogger.comtag:blogger.com,1999:blog-3945587351669674556.post-62980092138739291242007-04-03T22:53:00.000-07:002007-04-05T22:25:24.538-07:00Personalized Google Mashups - On The Fly<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp0.blogger.com/_U37bWzxTSko/RhNDt2CMDwI/AAAAAAAAANE/RbQ-rTc5AtU/s1600-h/gmap.jpg"><img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://bp0.blogger.com/_U37bWzxTSko/RhNDt2CMDwI/AAAAAAAAANE/RbQ-rTc5AtU/s200/gmap.jpg" alt="" id="BLOGGER_PHOTO_ID_5049454062090325762" border="0" /></a>If you haven't used JSON, you're missing out. If you haven't heard of it, you're just out of it, period. JSON is a great data interchange format that Google utilizes to streamline their first mashup wizard for Google Maps. <span class="fullpost">It's a simple alternative to coding (certain) server-side proxies for HTTP requests to get to data in the form of JSON feeds. JSON liberated this extremely cool <a href="http://googlemapsapi.blogspot.com/2007/03/creating-dynamic-client-side-maps.html">mashup wizard</a> at Google a few days ago. Zero coding required to build very useful Google maps mashups of your own from your own Google Spreadsheet table. Reminds me of XQuery's thin client-side data extraction properties. Not surprising. Hmmm...XQuery for JSON...we could really be on to something. At any rate, for this example, you have to get your data into Google's Spreadsheet first, but that's far simpler than coding a mashup from scratch. This is the power of great front-side middleware, making custom app building truly user friendly. An excellent step forward that will no doubt unleash a new bevy of corporate, personal, and startup mashups. My first mashup to follow...<br /></span>R.J. 
Pittmanhttp://www.blogger.com/profile/14408821407918515948noreply@blogger.comtag:blogger.com,1999:blog-3945587351669674556.post-54916718317281829242007-04-03T10:18:00.000-07:002007-04-03T23:37:23.340-07:00Federated Search is dead, dead, dead...I've been asked to write about this for some time, and that time has finally come. <a href="http://radar.oreilly.com/archives/2007/03/greg_linden_on.html">O'Reilly touched on the subject recently</a> discussing Google's plans in this arena. But white papers on the subject are not required to explain why traditional approaches to this dilemma are toast. In this post I'll explain why federation is broken and how corporations, universities, and start-ups continue to throw $ at the wrong end of the problem...click below to dive in!<br /><span class="fullpost"><br /><span style="font-weight: bold; font-style: italic;">Federation defined</span><br />Federated search is the art of attempting to execute a single keyword search across <span style="font-style: italic; font-weight: bold;">n</span> number of databases, content sources, indexes, news feeds, etc. This is also known as metasearch, deep-web search, and content aggregation. No central federated index is maintained and no crawling or spidering required. The idea, in theory anyway, is certainly a convenient one. At <a href="http://www.grokker.com/">Groxis</a>, 90% of our customers were most interested in federating their enterprise content sources. In large companies and universities alike the sea of available content silos for any given organization is vast. 
It is not uncommon to find hundreds, even thousands of content sources used across a single organization.<br /><br />Federation is all about passing the user's search query separately to each of those search engines, and collecting <span style="font-style: italic; font-weight: bold;">x</span> number of search results from each source, and then figuring out how to display them to the user in some meaningful, useful, <span style="font-style: italic; font-weight: bold;">actionable</span> way. This display challenge occurs because typically many of the content sources are not crawlable or spiderable due to licensing issues, ownership of the content, or because the content resides in a data store that is not crawlable. Examples: SQL databases, proprietary content management systems, and commercial content such as Lexis, Factiva, Reuters, etc. Federated search is a quick and dirty way to scan across a vast array of content sources.<br /><br /><span style="font-style: italic; font-weight: bold;">Federated challenges</span><br />However, there is a fundamental usability and search logic problem with today's generic search federation. Let's presume you are federating just 20 content sources into a single search query interface. Using a traditional search results display format, 10 results per page, we arrive at problem #1: results ordering and display.<br /><ol><li>What happens if all 20 sources return an average of 60 results for a given query? How are the results combined and displayed intelligently? The first ten results can represent, at best, 50% of the <span style="font-style: italic;">breadth </span>of the corpus. From a usability standpoint, federation demands a results display that best accommodates breadth and depth simultaneously.</li><li>Each result set from each source uses its own unique 'relevance' ranking algorithm. 
Once you have the ordered result set from each source, how do you compare and order the combined results across different data sources?</li></ol><span style="font-weight: bold; font-style: italic;">Arbitrary solutions (aka common hacks)</span><br />A. Should we apply a weighting algorithm to each of the sources to favor more 'important' sources? Sure we could. But this is arbitrary, not contextual, and thus totally inefficient.<br /><br />B. Should we apply speed? First results to come back get displayed first? Hardly contextual, hardly consistent, and hardly sufficient. A poor man's federation to be sure. More on performance issues later.<br /><br />C. How about ordering all the results into topic clusters? Sounds great, this allows us to organize all the results from all of our 20 sources into a cluster map, organized by topics, not content sources. On the surface this could indeed address some of federation's shortcomings. However the problem is that topic clustering technology is woefully inadequate for serious research or just serious federation. I've reviewed, licensed, and tested every serious clustering engine in development, and even hacked together my own clustering algorithms over the past several years. They all have a common problem: They require optimization and customization to each and every content source, and never work consistently enough to win mass user adoption. They require unique stop word lists, phrase delineations, dictionaries, cluster tuning, label tuning, and a host of other tweaks. I could go deep here, but let's not get off topic. In fact, wait for my next posting that illustrates why document clustering is also dead, dead, dead.<br /><br />I mentioned speed above. The other big usability problem is the speed at which each source returns results. Oftentimes we cannot produce a combined results set because the federation engine is waiting on sources to return with their results. 
Some sources can be woefully slow, causing total response times to take up to 3-5 minutes! Yes, I've seen this in production at large enterprise sites. This is how to cream mass user adoption in about exactly... 3-5 minutes.<br /><br />D. Another common 'solution' is to let the user pre-select the content sources from which to federate the keyword search. Sounds reasonable on the surface, until you have 20 or 700 data sources to choose from. Even grouping them together leaves too much to the imagination from a usability standpoint. Users aren't trained to 'think' about these intricacies, they just search and go. Advanced Search panes are rarely utilized correctly, if at all. Further, most users will know much less about each content source than the federation platform does. As such, having source selection choices is a massive burden on the user if there are more than 7-12 sources to choose from. In the end, this does not solve the problem, in most cases it adds to it.<br /><br /><span style="font-style: italic; font-weight: bold;">The real solution - does one really exist?<br /></span>Wouldn't it be nice if there were a simple, elegant solution to this most vexing problem? Librarians, universities, researchers, and knowledge enterprises would rejoice with a resounding thunder! And the company with the solution would similarly rejoice in the prying open of even the tightest purse strings of customers vying to get their hands on the proven solution once and for all.<br /><br />Well, there is good news and bad news. The good news: there is an obvious solution. The bad news is that it is really, really, really hard to do. The solution: index everything. (Note: this is not the same as metasearch, which only aggregates results from separate search engines, as metasearch has no indexing capability...for now ;) One index, one result set for all content online. Yes, I said it. 
If literally every content source were opened up to be crawled and indexed without prejudice, a single, uniform index could go to work providing users the most useful results from a single search. Sort of like removing DRM from digital music, in a way, I suppose. Let the content be free! The difference being, premium content publishers would not have to open up the body of the content to the end user. Just look at how <a href="http://search.yahoo.com/subscriptions">Yahoo</a> and others handle <a href="http://search.yahoo.com/subscriptions">searching premium content</a>. You can access the metadata (title, author, abstract, summary, etc.) and then you pay to gain access to the full text. [ Paying for content is yet another topic altogether. Yes, it too is dead, dead, dead...] Yahoo's subscription content federation is an example of the "index everything" solution on a much smaller scale, though this implementation is only partially effective, and only for small groups of content sources.<br /><br />In theory, a web index such as Google.com is a <span style="font-style: italic;">federated index</span> of sorts, culling together millions of small and large 'content sources' known as websites into a centralized search index. Fundamentally this is no different from metasearching, but architecturally and contextually it yields a vastly different user experience and effectiveness.<br /><br />The bad news is also obvious. It is seemingly impossible to get all content sources opened up, and indexed anytime soon. Not to mention the privacy, copyright, formatting, and global policy issues that surround the notion. Just look at all the flak Google gets for scanning books in a library. Given this, might there be another way? Another approach that achieves maximum usability, and extracts maximum value from any cross section of content sources for the user, the researcher, the knowledge worker? I believe there is such a solution in development today. 
For hints as to the direction of such an approach, let me point you to a few successful 'mini-federators' in the web2.0 world that are really effective.<br /><ul><li>Take a look at: <a href="http://www.originalsignal.com/">Original Signal</a> (look beyond their new blog style home page to the 'channels' of aggregated content) - a simple example to be sure, and nothing breakthrough per se with the user experience. Rather effective just the same.</li><li>Take a look at the approach taken by Yahoo, and improved by Google with the 'personalized' home pages that allow you to customize your content aggregation into an RSS + Ajax dashboard of sorts. There are scores of web2.0 RSS aggregators and some really clever dashboards out there that are planting the seeds for something much bigger.</li><li>But those are what I call the 'lay-ups' or the obvious choices. Less obvious choices, but closer to what the future holds for federation, include: <a href="http://google.com/coop/cse/">Google CSE</a> and <a href="http://pipes.yahoo.com/pipes/">Yahoo Pipes</a> -- think social computing meets vertical search while killing metasearch...<br /></li></ul>The current design of the 'dashboard' as we know it does not scale to support <span style="font-style: italic; font-weight: bold;">n</span> number of content sources, and certainly not 700 or 1000, but CSE and Pipes are a very different story. They are essentially do-it-yourself federated indexes as I described earlier. Very high potential. Particularly as screen real estate runs at an all-time premium today. As such, if we are to arrive at a true front-end solution versus a back-end (index everything) solution, it has to scale. It must also remain simple, efficient, and require an almost zero learning curve. 
New visual metaphors have only exacerbated the adoption and usability problems that plague most federation solutions in the market today.<br /><br />If search federation sounds like a rather elusive problem to solve, I can promise you, elusive is an understatement. The answer lies in how we interact with, process, and digest information <span style="font-weight: bold;">instinctively</span>, not 'intuitively' as most info designers would have you believe. Intuition-driven approaches only lead to new products and solutions that chase their own tail, never really solving the problem at hand. We have seen, and will see yet more companies come and go with their valiant attempts to crack the code for federated search. But until the real problems with federation are truly understood, be prepared for more tail wagging. Ironically enough however, it appears to me a solution might soon be launched...right under our noses...hmmm. As always, stay tuned!<br /></span>R.J. Pittmanhttp://www.blogger.com/profile/14408821407918515948noreply@blogger.comtag:blogger.com,1999:blog-3945587351669674556.post-15619054715536757742007-03-21T10:54:00.000-07:002007-03-29T10:23:02.840-07:00gPhone rumors squashed...for nowThis <a href="http://www.engadget.com/2007/03/21/google-phone-rumors-shot-down-for-the-moment/">gPhone news </a>picked up by Engadget...<br /><br />This confirms my previous post. Software is the logical start for Google, and while a phone isn't out of the question for Google, the hardware business particularly mobile phones is an entirely different animal for the company. Yes, they do design their own data center servers, and their own GSA hardware, so they are not without experience in the space.<br /><br /><span class="fullpost"><br /><br /></span>R.J. 
Pittmanhttp://www.blogger.com/profile/14408821407918515948noreply@blogger.comtag:blogger.com,1999:blog-3945587351669674556.post-82190770817924233412007-03-12T23:35:00.000-07:002007-03-14T21:39:01.055-07:00iPhone and gPhone - together as one?I am pretty sure that this is not the phone that Google is going to ship, however this 'rumored to be' insider snapshot of the 'gPhone' prototype is more about the software integration with the OS at this stage, and less about the hardware enclosing it.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp0.blogger.com/_U37bWzxTSko/RfZICI8sfoI/AAAAAAAAAMI/83zZvhbMRiI/s1600-h/iphone.jpg"><img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://bp0.blogger.com/_U37bWzxTSko/RfZICI8sfoI/AAAAAAAAAMI/83zZvhbMRiI/s200/iphone.jpg" alt="" id="BLOGGER_PHOTO_ID_5041296034486845058" border="0" /></a><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp3.blogger.com/_U37bWzxTSko/RfZIZ48sfrI/AAAAAAAAAMg/dH5NlluyPSA/s1600-h/gphone.jpg"><span style="font-weight: bold;"></span><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://bp3.blogger.com/_U37bWzxTSko/RfZIZ48sfrI/AAAAAAAAAMg/dH5NlluyPSA/s200/gphone.jpg" alt="" id="BLOGGER_PHOTO_ID_5041296442508738226" border="0" /></a><br />Either way, the working name probably won't last long given the gPhone from <a href="http://www.gphone.com/">GlobalPhone Corporation</a>, and the gPhone from <a href="http://gphone.sourceforge.net/">Gnome-o-Phone</a>, the open source Skype-like software. However, Google would by far have the most interesting use for the gPhone moniker. In fact, when you consider the power of the g-Apps that come out of Google.com, it is quite compelling to see the potential, even with just the first few 1.0 mobile phone apps from Google: <a href="http://www.google.com/mobile/index.html">search, gmail, maps, and news</a>. 
The Java Midlets for gmail and maps are particularly impressive for mere 1.0 applications. Satellite imagery straight to your Samsung Blackjack on the 3G network is the bomb.<br /><br />But the killer 'app' per se is not going to be any one g-App, rather it will be the seamless <span style="font-style: italic;">integration</span> of these applications to the phone that will make them compelling. Google almost doesn't need to market a phone at all, just the platform. However, if they do, it will only be to make the integration airtight. Windows Mobile 5 is just not there yet. Apple's iPhone? Not here <span style="font-style: italic;">yet</span>, but coming. I think iPhone will be great, and the <a href="http://cs.nyu.edu/%7Ejhan/ftirtouch/">multi-touch interface</a> will be killer. But will iLife be too much overhead, and thus overkill for mobile productivity? Will we really want iPhoto on our phones? Definitely a nice to have either way, but hardcore productivity remains to be seen. Yes, I'm sure to buy at least one iPhone in June, but I'd really like to see what Google can do here as well especially for true mobile productivity. Their notoriously lightweight, super-fast Web apps just work. And at 1.0, they work better than most of the 4th generation WM5 apps on the Blackjack.<br /><br />Best of both worlds? How about a 'native' Google mobile suite for iPhone, to include all the apps from gmail to <a href="http://www.jot.com/">Jot</a>. Yes, I know, Steve already mentioned that there would be <span style="font-style: italic;">some</span> Google integration with iPhone at launch time. But how much integration is the question, especially in light of all this gPhone speculation. iPhone seems like a more straightforward entry point for Google, but in this business nothing is straightforward and anything is possible...should be an interesting summer.<br>R.J. 
Pittmanhttp://www.blogger.com/profile/14408821407918515948noreply@blogger.comtag:blogger.com,1999:blog-3945587351669674556.post-41388451093805557132007-03-05T10:32:00.000-08:002007-03-05T13:17:17.458-08:00Yahoo's Response: The "Real" Declaration<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp3.blogger.com/_U37bWzxTSko/Rex0f6tmmTI/AAAAAAAAALw/WO343rsuhsU/s1600-h/yahoologo210.gif"><img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://bp3.blogger.com/_U37bWzxTSko/Rex0f6tmmTI/AAAAAAAAALw/WO343rsuhsU/s320/yahoologo210.gif" alt="" id="BLOGGER_PHOTO_ID_5038530174806038834" border="0" /></a>Following my earlier post regarding Yahoo's recent restructuring efforts, Yahoo responds. The story was first illuminated by TechCrunch's posting of an internal email from their CFO. As it turns out, Yahoo wasn't too pleased with their posting-- surprise, surprise. Nonetheless, Yahoo management was quick, and kind enough to respond to my <a href="http://www.sixorg.com/2007/02/yahoos-declaration-of-dependence.html">analysis</a> of the posting here at Sixorg. In fact, I was surprised to hear from quite a few of my friends at Yahoo, all eager to get the story updated. So here's what we now know...<br /><span class="fullpost"><br />In my posting I stated that Tim Cadogan was heading up core search. Unfortunately that is not the case. Tim is instead heading up Search and Listings Marketplaces, which is essentially search advertising, not core search. Core Search, as it turns out, resides in a separate organization called the Audience Group under the leadership of Jeff Weiner as part of his Network division. "The Network division is now comprised of five areas including Search, Community & Communications, Front Doors, News &amp; Information and Entertainment. 
We believe this new structure will allow us to better align our strategy with the organization and deliver on its mission to "connect people to their passions, their communities and the world's knowledge," states a Yahoo! spokesperson.<br /><br />Jeff's been in various M&A, and search roles since Terry Semel brought him over from his Hollywood venture investment firm, Windsor Digital, where Jeff was doing M&amp;A work for Terry's deals. Jeff is a good guy, and well liked by some hard-core senior search gurus inside Yahoo that I know personally. Yet some folks there tell me they find it curious how Jeff was nominated to run a multi-billion dollar core search group, with no search background prior to Yahoo. Remember, in the Valley, it's not always what you know, it's who you know.<br /><br />Eckart Walther and Andrew Braccia head up Search within the Network division, which is a good thing. Eckart is sharp, and the real deal in core search. I enjoyed engaging with Eckart on core search innovation throughout the partnership between our companies.<br /><br />Yahoo also confirms that all teams dedicated to Panama, Yahoo's new search advertising platform, are housed under Sue Decker.<br /><br />The only real problem I see with this emerging structure is NOT the suggested <a href="http://online.wsj.com/public/article/SB116379821933826657-0mbjXoHnQwDMFH_PVeb_jqe3Chk_20061125.html">peanut butter</a> being spread around the company. In fact, a company of Yahoo's scale and global reach requires at least this much infrastructure and yes, bureaucracy, to continue to scale and drive growth efficiently. Yahoo faces a similar problem that many new entrants into the search and search advertising business face today. It's the chicken and egg dilemma. 
You'll notice that Yahoo has organized itself into a pair of Supply and Demand groups, to become better market makers for the advertising business.<br /><br />But what good is a rich network of advertisers without premium inventory across the web publishers' real estate on which to run it? And conversely, what good is a rich publisher base, if the ads and ad serving technology can't stand up very well against the competition? Tim's group inside Yahoo is perhaps the most vital in this equation. However, Yahoo's creation of three peer groups in APG: Supply, Demand, and Products may very well create more challenges down the road. The "magic in the middle" as Yahoo describes it grossly underestimates the role that Panama and other core technology must play in Yahoo's latest competitive bid. It is far more germane to Yahoo's growth than simply the glue between APG's supply and demand. Just ask Wall Street:<br /><br />All of Yahoo's recent stock activity is based not on new divisions, roles, or titles, but solely on the promise of Panama, a content matching technology. Mark Morrissey has been promoted to SVP of APG Product Management, and as such plays a key role in clarifying this company-wide. This is one thing that Google has not lost sight of, even amidst their mesmerizing growth.<br /><br />At a bare minimum, a dotted line to core search is a must (but we all know that would just be cheating). I just can't rationalize keeping core search and core ad serving technologies in two very different parts of the organization, because of their revenue generating power. The technologies are very interdependent, and strategically linked to driving web traffic, click-throughs, and loyalty (trusted search). Remember, keyword searching isn't the only way we search anymore.<br /><br />Let us not forget, the true 'Audience' as Yahoo puts it, is the end user, the web surfer, the web searcher. 
And this Audience is a key ingredient to the supply and demand channels for both publishers and advertisers.<br /></span>R.J. Pittmanhttp://www.blogger.com/profile/14408821407918515948noreply@blogger.comtag:blogger.com,1999:blog-3945587351669674556.post-13696968561779168212007-02-20T10:58:00.000-08:002007-04-03T12:10:08.129-07:00Design Simple, Part 1: Enterprise 2.0<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp0.blogger.com/_U37bWzxTSko/Rd4Jexn1VNI/AAAAAAAAALk/afzxkl8yiOg/s1600-h/web2.jpg"><img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 125px; height: 126px;" src="http://bp0.blogger.com/_U37bWzxTSko/Rd4Jexn1VNI/AAAAAAAAALk/afzxkl8yiOg/s320/web2.jpg" alt="" id="BLOGGER_PHOTO_ID_5034471857767929042" border="0" /></a>A question I am constantly being asked by up and coming tech entrepreneurs: "How can we better leverage the web, and new web2.0 models to spark growth or create a new market segment or revenue channel altogether?" But now I am being asked the same questions by serial entrepreneurs and seasoned Fortune 1000 executives alike. The answer is not the same for all scenarios, but indeed there are some battle-tested rules that everyone should know. In a multi-part blog series, I will be addressing various aspects of this new era and the dilemmas that most companies, old and new, will come to face at some point in the not too distant future.<br /><br /><span class="fullpost">In my mind, this question cannot be answered unless you've built online businesses in both web worlds, Web 1.0 and Web 2.0. I've sat on countless panel discussions and debated with a wide variety of journalists, analysts, bloggers, and 'pundits' who all have their own 'unique' spin on the subject. Sideline referee versus quarterback perspective, I suppose. 
Mostly what you hear from this crowd are catch phrases like: the long tail, viral marketing, scalability, network effects, social mediums, participation age, the list goes on and on. Judging by the ever expanding <a href="http://en.wikipedia.org/wiki/Web_2">definition for Web 2.0</a> at Wikipedia, it's no wonder that many joining the fray quickly become shrouded in the vagueness of their own Web 2.0 execution.<br /><br />As you might expect, Tim O'Reilly presents a reasonably <a href="http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html?page=1">sound analysis</a> of the web2.0 conceptual framework, and the new 'levers' that can influence the adoption of a web business, product, or service. A good primer to get you oriented before attempting the recommendations that follow. Bear in mind that there is a big difference between the conceptualization of web2 strategies and real world execution that produces tangible, measurable results. Don't be fooled into thinking there are shortcuts and quick hits by simply regurgitating what you've read online. The list here is by no means a complete how-to, rather, it is dialed in to address the most common problems with executing the vision.<br /><br />1. Get focused. Converge on a single idea. Don't zoom out. Resist the temptation to lump 10 separate features, 5 unique value propositions, or even just 2 different products into something you defend as your sole mission.<br /><br />2. Repeat step #1<br /><br />3. Repeat step #2, this time have a colleague or customer test your focus for clarity.<br /><br />Let the value proposition sell itself. Too often people equate Web 2.0 with Hype 2.0. A marketing blitz will only dilute the DNA of the product's key value to reach the user in a meaningful, sustainable way. You need to get the product right out in front of the audience. 
If you cannot articulate or demonstrate the core value proposition with one picture, or one sentence, all subsequent strategies to leverage the Web effect will be ineffective. I see this all the time with the companies I am advising today. Great products, great teams, great ideas, but all of the value creation gets buried behind logins, passwords, downloads, and other forms of "friction in the adoption curve", as I have coined it. This friction does nothing but keep the best assets and value from being discovered and adopted in the marketplace. It's sort of like self-inflicted wounds that never heal quite right.<br /><br />This is most prevalent with enterprise companies trying to become enterprise 2.0 companies overnight. I've watched <a href="http://www.cisco.com/">Cisco</a> try it, <a href="http://www.sun.com/">Sun</a> try it, and hordes of other companies large and small. Each with varying (read: minimal) degrees of success thus far. And without some serious reprogramming, none will truly garner the uptake of the web2.0 pros like <a href="http://www.jot.com/">Jot</a>, (which at the moment is off line for new customers because of the recent <a href="http://www.jot.com/google/faq.html">acquisition</a>).<br /><br />What constitutes a good example? First, it is the company whose product or service <span style="font-style: italic;">is</span> the website itself, and vice versa. Think about that for a moment. Second, the Web must be an integral part of the value proposition, and an integral growth driver. More on this topic in part 2 of the series to follow.<br /><br />What constitutes a weak example? Brochureware is not 2.0. Demoware is not 2.0. Screenshots, Flash movies, and 30-day trials are so not 2.0.<br /><br />The challenges most companies face with embracing the Web and '2.0' as a new market for growth can be complex and subtle. Conventional sales and marketing techniques only create surface-level awareness. 
They don't spark adoption and they don't promote viral uptake. Only an immediately recognizable value proposition and a legitimate social incentive will put 2.0 in motion. Instant gratification is vital. Without it, it's only a matter of time before you're back to the drawing board.<br /><br />Enterprise companies today have tremendous potential for growth in a 2.0 world because of the vast asset bases they are sitting on. Unlocking new potential and new markets for these assets can conceivably give them a tremendous unfair advantage. It is indeed both an art and a science to get it right, though, and producing tangible, scalable results doesn't come easy. What's more, the process can easily backfire without a qualified team in place to execute. Brands, credibility, and market share can easily be swept away with ill-fated attempts to join the "in" crowd on a whim. Same can be said for start-ups. Though the assets are not necessarily vast, they are potent, high-value, fresh concepts that risk becoming the greatest technical innovation that never took off. 2.0 is a powerful concept that, when used responsibly, can make reaching critical mass a closer reality. More to come. Stay tuned.<br /></span>R.J. Pittmanhttp://www.blogger.com/profile/14408821407918515948noreply@blogger.comtag:blogger.com,1999:blog-3945587351669674556.post-83635310764576728222007-02-18T22:15:00.000-08:002007-02-19T00:06:22.857-08:00Joost Releases 0.8.0 Update<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.joost.com/"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer;" src="http://bp1.blogger.com/_U37bWzxTSko/RdlOERn1VLI/AAAAAAAAALE/46K0QObnboQ/s320/joost_001_en_150x150_white.jpg" alt="" id="BLOGGER_PHOTO_ID_5033139893920158898" border="0" /></a><br />Joost updated its 'wares this weekend with version 0.8.0, which includes the first version now available for Intel-based Mac OS X. 
One of the few products I've seen released lately that was not Universal Binary right from Beta. I suspect it has to do with the new CoreAVC Video decoder, which significantly improves the quality of Joost video. That said, Joost appears to be loosely promising a Universal Binary for RC1.<br /><span class="fullpost"><br />As for the product itself, the Mac OS X Intel build installs and runs pretty smoothly for a 0.8 revision. The UI is sleek, but takes a little getting used to, and lacks logical flow (read: arbitrary). The My Channels browser element is a rather bland one-column scrolling list, which is inefficient and unimaginative for browsing loads of channels. I am certain they can do better with search, browse, and navigate. The stock channels are fun and the episodes launch immediately, giving you an experience quite similar to flipping through channels on a real TV. Quite natural in that regard. Perhaps their strongest achievement indeed. But is this enough to establish a sustainable advantage that is strong enough to garner real market share in this already crowded sector? I'm most interested to see how this will stack up against <a href="http://www.apple.com/appletv/">Apple TV</a>, and a slew of existing competitors like <a href="http://www.bittorrent.com/">BitTorrent</a>, and Google/YouTube.<br /><br />Joost can deliver higher resolution video than most consumer offerings today, and fast. No need to deal with downloading torrent files, then switching to a content viewer, and then back to torrent download for each video. Very cumbersome compared to Joost. However, this is still just a well executed feature-- not a product, nor a business...yet. Functionality must yield social incentive, which then follows content as king in this business. After Joost'ing just about all the channels in the Beta, I'm still left wondering what will be the "hook" with Joost that will keep me coming back for more. 
If the service backbone can scale to the likes of BitTorrent's mesh, they've got to then differentiate themselves with a more clever social experience (read: incentives) to drive adoption and contribution to the Joost channel line-up, while beating everyone else to the best quasi-commercial content available. Tall order indeed.<br /><br />And let us not forget what happened when Microsoft, at arm's length, launched <a href="http://www.kbcafe.com/myspace/?guid=20061218173713">Wallop</a>.<br /></span>R.J. Pittmanhttp://www.blogger.com/profile/14408821407918515948noreply@blogger.comtag:blogger.com,1999:blog-3945587351669674556.post-62293709048825112042007-02-15T09:32:00.000-08:002007-04-03T12:11:00.142-07:00Yahoo's Declaration of Dependence<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp0.blogger.com/_U37bWzxTSko/RdSZnEH-BqI/AAAAAAAAAKo/gXj1mY79ihE/s1600-h/yahoologo210.gif"><img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://bp0.blogger.com/_U37bWzxTSko/RdSZnEH-BqI/AAAAAAAAAKo/gXj1mY79ihE/s320/yahoologo210.gif" alt="" id="BLOGGER_PHOTO_ID_5031815580080801442" border="0" /></a>As Mike puts it today on <a href="http://www.techcrunch.com/2007/02/14/text-of-email-to-all-yahoos">TechCrunch</a>, a <span style="font-weight: bold;">lot</span> of peanut butter bein' spread around Yahoo again with Sue Decker's <a href="http://www.techcrunch.com/2007/02/14/text-of-email-to-all-yahoos">email</a> sent out company-wide this morning. I had dinner with Sue a couple years ago, and ultimately completed a few <a href="http://www.groxis.com/archives/nytimes_050905.htm">strategic deals</a> between Yahoo and Groxis, the company I co-founded and ran for the past five or so years. What I can tell you is this. Sue is a smart, driven executive. No doubt she'll get the operational job done better than anyone at Yahoo today. However, I question a few of the organizational moves she's outlined in this 2007 manifesto. 
Perhaps the most glaring and most critical to Yahoo is where their core search group gets parked inside the organization. According to the email, core search is now part of "Marketing Products" which I find rather curious.<br /><br />The three most important rules for any startup company in the Valley are: focus, focus, focus.<br /><br />The good news is that Tim Cadogan has been tapped and promoted to head up the core search organization. Tim is a very smart, capable, and cool guy, with whom I structured our 'landmark' deal with Yahoo. Landmark in the sense that we were one of the first search companies to get Yahoo and Google to play nice in the same small sandbox that was my startup. Anyway, Tim is unquestionably the right guy for the job, and needs his group to be elevated to have a greater effect on how the core search is being leveraged not as a marketing product, but as a supply-side horizontal revenue generator across the company. The new structure could put the company at risk by layering it vertically into the organization. If you look at Google, core search is far bigger, and far more horizontally aligned across the company. And for good reason. It (along with AdWords) is their bread and butter. It used to be Yahoo's bread and butter too.<br /><br />I can't help but think that Yahoo's acquisition spree has precipitated this latest move. With so many seemingly random acquisitions to keep pace with competition, the company has had to react vs. act. This leads to a contrived strategy based on the new assets they have to deal with and leverage effectively to please Wall Street (read: justify the acquisitions). Whereas you'll notice Google to be slightly more proactive, with a seemingly tight-lipped, strategically planned growth strategy. We'll be watching closely to see how this new plan plays out for Yahoo. Don't lose sight of the importance of core search, Sue, Yahoo depends on it. And remember, it's never too late to focus, focus, focus.R.J. 
Pittmanhttp://www.blogger.com/profile/14408821407918515948noreply@blogger.comtag:blogger.com,1999:blog-3945587351669674556.post-87534040895493453762007-02-14T10:34:00.000-08:002007-04-03T12:08:41.600-07:00Get a (Second) Life!Walking to my office this morning, I caught today's SF Chronicle headline at a newsstand outside Whole Foods: "Love 2.0." I thought to myself, hmmm...fitting for Valentine's Day in Silicon Valley I suppose. Perhaps even more fitting is the latest target market for online matchmaking: it's the "I sold my company for $80m, and Vegas is getting too expensive to find a 'girlfriend,'" crowd. Or it's the: "MySpace? Who wants to wind up on 'To Catch a Predator'?" crowd.<br /><br />As Min Jung Kim so eloquently put it in <a href="http://sfgate.com/cgi-bin/article.cgi?f=/c/a/2007/02/14/MNGEVO4DOV1.DTL">today's festive piece</a>: <span id="articlebody">"They are using the actual tools of Web 2.0 to find more effective ways to get laid." Hey, whatever works. Happy Valentine's Day everyone. Here's my suggestion for Valentine’s Day: It is time for these peeps to get a <a href="http://www.secondlife.com/">Second Life™</a>.<br /><br /></span><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.secondlife.com/"><img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://bp1.blogger.com/_U37bWzxTSko/RdN00kH-BpI/AAAAAAAAAKc/drFkjUkOr20/s320/secondlifelogo.gif" alt="" id="BLOGGER_PHOTO_ID_5031493655102097042" border="0" /></a><span id="articlebody">Why not? </span><span id="articlebody">Th</span><span id="articlebody">is</span><span id="articlebody"> virtual world is almost better than reality at times. As I </span><span id="articlebody">mentioned in my opening blog, Second Life is a big deal. 
Bigger than most people realize or have taken the time to contemp</span><span id="articlebody">late properly.<br /><br />Quick Disclaimer: Remember what this blog is all about: separating the truth from the naïve, and from those that write before they think.<br /><br />I was saving my Second Life thoughts for a later blog, until I read this Second Life <a href="http://www.itbusinessedge.com/blogs/tve/index.php/2007/02/14/second-thoughts-on-second-life/">story</a> posted today over at IT Business Edge. So here we go...<br /><span class="fullpost">The author attempts to question the business value inside Second Life’s virtual world. The piece lacks teeth, substance, and interest. Check out the author’s bio on this blog:<br /><br />“Ann was a leading media authority on automated teller machines before coming to IT Business Edge to cover tech alignment and business value.”<br /><br />Did she say ATM machines? This is exactly the type of journalistic wisdom that needs a healthy challenge, especially if we are talking about something as disruptive as Second Life. Why waste the net bandwidth and disk space necessary to carry that story at all? I’m at a loss on that one. The author even falls on the sword at the end of her own story, stopping just short of throwing in the towel on the whole piece:<br /><br />“When in doubt, we go back to the advice of a Booz Allen Hamilton associate that appeared in a recent article on Second Life: Companies need to ask, “What can we do better in Second Life than the other ways we’re already doing them?”<br /><br />Booz Allen Hamilton? The next time I am in doubt about the future of innovation and Silicon Valley startups, the only “Booz” I’ll be deferring to is Don Julio down at <a href="http://www.tresagaves.com/">Tres Agaves</a>. Is that not the most naïve question on this subject? 
If they were clever, they would have pushed that question to the very beginning of the article, to invite an insightful debate of just what <span style="font-style: italic;">does</span> work better, worse, or just differently in the virtual world.<br /><br />I personally wouldn’t comment on Second Life, or any topic for that matter, unless I understood (not just heard about) the subject for real. At the very least one should have gotten off Orientation Island before saying boo. Then again, for many of these folks it's more about placing catchy brands in the title of a boring blog to spark readership.<br /><br />Anyway, since I’ve had the good fortune of getting the latest and greatest straight from Second Life’s fearless leader and others, let’s get real. Second Life isn’t going to be huge. It already is. The reason Second Life can scale so wonderfully is because it in itself is not a web service, not a b2c business, not anything in the typical business sense. Rather, it is what I call a <span style="font-style: italic;">platform service</span>. Not an enterprise software platform that you download, install, and customize, a service. Like grid computing with a purpose. A platform with the flexibility to enable everyone else (people, companies, and marketplaces) to develop on their own terms inside the world. Some have said that Second Life is like eBay, an intermediary. In truth it’s actually one level of abstraction beyond this. An intermediary of intermediaries. Virtual “eBays” and just about any other type of business model are possible within the world.<br /><br />Second Life has attracted the interest of some serious players in the industry looking to do much more than kick the tires on this virtual trend. I’ve talked to execs at Cisco, Sun, and Amazon, and they all get it. The value lies with the residents: what companies like these and communities like ours will make of it. 
Don’t ask Philip or Ramzi at Linden what the value is going to be to residents, ask the residents. That is the whole point of Second Life. Imagine setting up an economic infrastructure of your own design, as if the United States or the rest of the world had yet to be incorporated. New and innovative business and economic models could never be instituted nor tested with such immediacy and precise feedback analysis in the real world as we know it. What can be done in Second Life that can’t be done in the real world? We decide.<br /><br />The growth in Second Life and at Linden Lab is staggering. And it is healthy growth, not the dotcom-bubble kind of growth. Second Life is well positioned at the intersection of the new social computing society we live in and the smart businesses that are shaping our economic and industrial future.<br /><br />Expect Linden Lab and Second Life to surprise you, perhaps when you least expect it. Based on how this company is performing both financially and virally, it may soon run up against the <a href="http://blogs.osafoundation.org/mitch/000593.html">problem</a> Google faced earlier in its growth trajectory that forced it to have an IPO sooner than planned. However, I might also believe that they could achieve some pretty impressive results with fewer than 1000 employees. Way fewer. At least at this stage, the platform strategy enables greater scaling prospects than Google, simply because they only need to focus on building out the virtual world framework vs. business and consumer applications, tools, etc. In fact, by the time they reach 200 employees (probably as soon as this fall) they will be a venerable force. They’ve definitely got growing pains. Keeping up with the growth is today’s number one challenge at Linden Lab-- the classic “good problem to have” here in Silicon Valley. 
If they can continue to attract world-class talent, scale their network infrastructure intelligently and quickly, and stay focused on the platform, the world will no doubt be hearing much more from Linden Lab and Second Life. Stay tuned. I’ve got more to share in future posts.<br /></span></span>R.J. Pittmanhttp://www.blogger.com/profile/14408821407918515948noreply@blogger.comtag:blogger.com,1999:blog-3945587351669674556.post-50573187396396267902007-02-12T10:12:00.000-08:002007-04-03T12:07:55.332-07:00Tag, I'm It!Well, I can't think of a better catalyst to get my blog project off the ground than receiving a <a href="http://www.guidewireconnection.com/archive/2007/02/07/draft-not-ready-for-posting-tag-im-it-extending-the-game-of-blog-tag">Blog-Tag from Chris Shipley</a>, of <a href="http://www.demo.com/">DEMO</a> fame. I'm still waiting on the domain name of choice to come through, but I thought to myself, why wait? For those not in the know, the idea behind Pulver's <a href="http://pulverblog.pulver.com/archives/006087.html">Blog-Tag</a> is pretty simple. When you get tagged, you have to share five things most people probably didn't know about you, on your blog of course (yes, must have blog to play), then tag five others. From the looks of it, I'm in good company with Chris, and her tagged five. Check 'em out yourself. In honor of the brand, I'm modifying the rules to share six little known facts about me to introduce Sixorg. Read on...<br /><span class="fullpost"><br />1. As for Sixorg, well, it's been a long time coming. After years of badgering from my friends like <a href="http://ross.typepad.com/">Ross Mayfield</a> at <a href="http://www.socialtext.com/">Socialtext</a>, <a href="http://www.hubbub.typepad.com/blog/">Giovanni Rodriguez</a> - PR dynamo-guru-extraordinaire, and <a href="http://www.sifry.com/alerts/">Dave Sifry</a> over at Technorati way back in 2004, I've finally pulled it together to get my vocals online. 
Mostly so these guys and the rest of my crew don't have to listen to me rant and rave about the tech industry in person. Now they can deal with me in doses, or avoid me altogether.<br /><br />Tag, tag, and tag, you three are it!<br /><br />2. Watch this space for several interesting things to come. I'm particularly interested in challenging conventional wisdom and conventional commentary on a range of hardcore topics including web search, user experience, product design, startups, and the truths, lies, and videotapes of the venture capital game. When I say hardcore, I mean sans the fluff, hype, and <a href="http://www.kraftfoods.com/koolaid/">Kool-Aid</a> that inebriates even the highest profile industry websites and blogs in action today.<br /><br />3. After building seven startups in the Valley, it's time I join the fray to separate the industry truths from the naive, the misinformed, and from those that haven't been there nor done that. Man, if those walls could only talk. There's a juicy best-seller in there just waiting to break free...someday.<br /><br />I'm also inspired by those that have paved the way into this game, including <a href="http://battellemedia.com/">John Battelle</a>, good friend to <a href="http://battellemedia.com/archives/001507.php">Groxis</a> back in the day, and <a href="http://www.venturebeat.com/">Matt Marshall</a> from the Merc and now VentureBeat.<br /><br />John, Matt, tag, tag you're both it!<br /><br />4. I am a closet architect (read: not licensed nor formally trained). I've designed 3 buildings that have actually gone into production as I like to say. One of which I live in today.<br /><br />5. Huge fan of The Police. Reunited for a 30th anniversary <a href="http://www.thepolicetour.com/">tour</a> in 2007. My Sting stories to follow, later in the year.<br /><br />6. I had a really inspiring conversation today with the founder of Linden Lab, <a href="http://blog.secondlife.com/">Philip Rosedale</a>, aka Philip Linden in Second Life. 
There is definitely way more to this story, and to Philip, than most people realize. More on this later in Sixorg, once I turn up my Second Life account.<br /><br />A real-time 'tag you're it' Philip!<br /><br />So there you have it. Six little known facts, six tags, and Sixorg. Chris Shipley, I owe you next round at the pub, it's all your fault.<br /><br />Welcome to the show, let's get on to the good stuff. Grok 'n roll.<br /><br />Cheers,<br /><br />-- R.J.<br /></span>R.J. Pittmanhttp://www.blogger.com/profile/14408821407918515948noreply@blogger.com