Content 2.0: The Future Of Web Search
Yahoo's Vice President of Product Strategy Bradley Horowitz explored the issues around search and shared his vision
of social search and its potential with the audience at Content 2.0 on 6th June 2006...
KEYNOTE: THE CHANGING FACE OF WEB SEARCH
Report by Deirdre Molloy
Download this session from the Content 2.0
Podcasts!
“I’m just a remixer” was Bradley’s opening gambit, “I’m taking a
lot of ideas that are floating around Yahoo and floating around my
group”. He referenced Jamie Kantrowitz’s earlier comment that
Yahoo! are a Web 1.0 company re-tooling for Web 2.0. There’s a
lot of truth to that, Bradley noted, and it’s both a blessing
and a curse.
They’re blessed with the largest internet audience out there,
over half a billion people come to Yahoo! every month. But when
you think about something like Yahoo Mail and the throughput and
number of transactions they get on that and then you think about
changing that infrastructure to catch up with Web 2.0 – that’s a
daunting task and they’re faced with that at every turn.
We’re committed to opening up, Bradley stressed, it’s not just a
small band of pirates running around trying to convince the
rest of Yahoo! to open – it’s the seasoned management of the
company. The founders, the senior level executives are mandating
that Yahoo! needs to open up. We can see the writing on the
wall, Bradley reasoned. We understand we need to do this, not
out of some altruistic generosity, but to become and remain a
viable business, he emphasised. Some of the people Bradley
wanted to credit, whose work and thoughts he was remixing,
included: Tom Coates, Paul Hammond, Simon Willison, Danah Boyd,
Jeremy Zawodny (Yahoo’s Robert Scoble), Caterina Fake, Chad
Dickerson, and others.
"We’re committed to opening up... we can see the
writing on the wall."
- Bradley Horowitz
The title of his talk was now changed from 'The Changing
Face Of Web Search' to ‘Better Search Through People’,
Bradley explained, adding that there’s no trademark on this
title ;-) and he proceeded to outline their strategy for social
search.
He showed a pyramid diagram of community dynamics, such as found
in Yahoo Groups – 1% creators, 10% synthesizers and 100% consumers
lurking or somehow benefiting from the top 11%’s activities. You
can create viable strategies around this model. Yahoo are trying
to change this community dynamic so it becomes 100% creators and
synthesizers.
In order to achieve this they first need to lower the barriers
to participation. Yahoo’s Launchcast radio product does this very
well, Bradley reckoned, it gets better and better at predicting the
music I will like, and I do this for a selfish motive. But I can
also, with a simple tick-box, publish my own radio station as an
artefact to share with others. So the act of consumption is also
implicitly an act of production. I didn’t have to do anything
else to produce and author that content – it was something that
was created in the wake of my consumption. That’s the kind of
magic they’re looking for, he said.
He then went on to look at “surfacing great content” and played
a sequence of some of the top 100 photos rated for
“interestingness” on Flickr – they are taken by amateurs but have
a powerful and professional impact. This is the best, he
reckoned, of what culture has to offer. And it’s also
user-discovered content.
When they discovered Flickr it was pointed out to them that
Yahoo! already has the world’s biggest and most successful photo
site - Yahoo Photos - so why were they sniffing around this
small Canadian company of just 10 employees? He highlighted 4
things that make Flickr special.
(1) The content is wonderful and all user-generated
(Yahoo! have been in user-generated content for nearly a decade
so it’s well understood). In a digression Bradley explained that
previously his speciality had been computer vision at MIT which,
while exciting, is very difficult. It’s hard to throw a bunch of
pixels at an algorithm and have it bring back a meaningful
assessment of what’s going on in the image, and it’s going to
stay hard for a really long time. But, good news, people do this
really well, and there are 6 billion of them – and that’s the
magic of Flickr. He cited the ESP Game built by Luis von Ahn from
Carnegie Mellon University, which, like Tetris, you don’t want to
quit. It’s for people who like word games, and the entire game is
a ruse to exchange labour for fun. They’ve collected 10 million
tags on images on the internet and these tags are double-blind
quality assured: two people who can’t communicate agreed on the
tag. It’s really clever and very subtly embedded in the incentive
system of the game. When he saw that, a bell went off in his
head.
(2) User organised content
Content is tagged, described, organised and discovered not by
editors but by the users themselves. There are no rules, and by
lowering the barriers to entry and allowing tagging, 85% of the
images now have human-added data. There’s no spell check, you
can make up words and input whatever makes sense to you.
(3) The users themselves picked up Flickr and distributed it around the internet
It integrates with popular blogging software. The terms of
service are such that if you click on a photo on a blog, it
takes you back to the Flickr site. It wasn’t business
development [or marketing – Ed] but the community that grew and
distributed Flickr. People began to use Flickr as the underlying
imaging infrastructure of their blogs. So Flickr began to appear
on literally tens of thousands of sites without any business
development guy doing anything, the people did it for them. The
community distributed Flickr.
(4) User developed community – and user developed functionality
The entire ecosystem was created by fewer than 10 employees –
aided by millions in the Flickr community. Flickr built Flickr
as a platform not just an end-user destination, so they
considered developers from the very inception of the product,
and wrote bindings for all the popular scripting languages –
PHP, Python, C++, Java – and let the developer community go at
it. And literally thousands of developers have built value
against the Flickr API, some very useful and utilitarian like
the Macintosh Uploader. Again you couldn’t build that on Yahoo!
Photos. It’s just amazing the kinds of possibilities that are
unlocked when you have this huge archive of timely, relevant,
interesting content, all richly annotated – you can build cool
things. He pointed us to www.flickr.com/services where a bunch of
them are featured.
Returning to the idea of lowering barriers to participation,
Bradley explained that they’ve biased things to let people do
what they’re really good at, and use the algorithms for their
own purposes. They launched “interestingness” – which is based
on a heuristic of how often a photo is viewed, commented,
blogged, favourited – so harvesting people’s behaviour to
provide additional value. And it was done by the community not
by editorial staff, and in a way that is implicit not explicit
(an explicit rating would have led to all kinds of gaming of the
system and veered the product off in the wrong direction). It reflects
natural activity around the photos.
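
To make the idea concrete, here is a minimal sketch of a score built from implicit activity signals. The talk names the signals (views, comments, blog posts, favourites) but not how Flickr combines them, so the weights, the `Photo` class and the function names below are all assumptions for illustration, not Flickr's actual formula.

```python
from dataclasses import dataclass

# Hypothetical weights: the talk lists the signals but not how they are combined.
WEIGHTS = {"views": 0.01, "comments": 1.0, "blogged": 2.0, "favourites": 3.0}


@dataclass
class Photo:
    title: str
    views: int = 0
    comments: int = 0
    blogged: int = 0
    favourites: int = 0


def interestingness(photo: Photo) -> float:
    """Weighted sum of implicit activity signals (illustrative only)."""
    return sum(weight * getattr(photo, signal) for signal, weight in WEIGHTS.items())


def rank(photos):
    """Return photos ordered from most to least interesting."""
    return sorted(photos, key=interestingness, reverse=True)


photos = [
    Photo("sunset", views=1000, comments=2, favourites=1),
    Photo("street portrait", views=300, comments=10, blogged=3, favourites=8),
]
ranked_titles = [p.title for p in rank(photos)]
```

With these made-up weights a photo that is viewed less but commented on, blogged and favourited more comes out on top, which is the point of harvesting behaviour rather than asking people to rate photos outright.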
Bradley explained that when they introduced tagging the
librarians among them said it would never work. But the “query
disambiguation” system applies - whereby Flickr looks at the
co-occurrence of terms and breaks them down into their
constituent clusters. There is cluster-based analysis on any tag
and the clusters re-orientate in real time. That’s part of what
they mean by better search through people. The data decides how
these clusters unfold.
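
A rough sketch of how co-occurrence can split an ambiguous tag into clusters follows. It is not Flickr's implementation: it simply links two tags whenever they appear on the same photo as the query tag and then takes connected components, a much cruder technique than a production system would use. The function name and the sample photos are invented for the example.

```python
from collections import defaultdict
from itertools import combinations


def cluster_cotags(photos, query_tag):
    """Group the tags that co-occur with query_tag into connected clusters."""
    # Link two co-tags whenever they appear on the same photo as query_tag.
    neighbours = defaultdict(set)
    for tags in photos:
        if query_tag not in tags:
            continue
        cotags = set(tags) - {query_tag}
        for tag in cotags:
            neighbours[tag]  # ensure a co-tag with no partners still gets a cluster
        for a, b in combinations(sorted(cotags), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Connected components via an iterative depth-first search.
    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            tag = stack.pop()
            if tag in seen:
                continue
            seen.add(tag)
            component.add(tag)
            stack.extend(neighbours[tag] - seen)
        clusters.append(sorted(component))
    return clusters


# Hypothetical photos, each represented by its set of tags.
photos = [
    {"jaguar", "cat", "zoo"},
    {"jaguar", "cat", "wildlife"},
    {"jaguar", "car", "classic"},
    {"jaguar", "car", "racing"},
    {"beach", "sunset"},
]
clusters = cluster_cotags(photos, "jaguar")
```

Here the tag "jaguar" falls into an animal cluster and a car cluster purely from what people typed, with no editor deciding the categories in advance.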
"The difference between information and knowledge is
really that human factor and that’s what we’re all
about."
- Bradley Horowitz
The corporate vision of Yahoo! is to enrich people’s lives by
enabling them to find, use, share and expand (FUSE) all human
knowledge [as I heard Yahoo European Director of New Business
Rob Jonas explain in more detail at the 25th Jan ALPSP event in
London on book digitization – Ed] and they always ask at Yahoo!:
have you fused your product?
It’s a very different mission and vision from their competitors,
he noted, the words human and knowledge are in their vision –
the difference between information and knowledge is really that
human factor and that’s what they’re all about.
A potted history from Bradley followed: Yahoo was created by two
guys who dropped out of college and went to form a company that
organised the web for the benefit of the rest of us. This
scaled until 1995, when the net exploded, technologies like Lycos
and Alta Vista etc arrived on the scene, bots started crawling
the web and people (i.e. webmasters) started gaming the search
system. Yahoo’s approach was editorial and authoritative, but it
didn’t scale. The next major innovation came when Google
introduced PageRank – a link-based analysis of a site’s
relevance in search queries. People do try to spoof this system
using giant link farms and other workarounds, Bradley admitted,
but it’s not easy.
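
For readers who have not seen link-based ranking spelled out, the sketch below runs the textbook power-iteration form of PageRank on a tiny made-up link graph. The damping factor of 0.85 is the commonly quoted default, and the graph and function name are assumptions for illustration; this is nothing like a production search engine.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration over a dict of page -> outbound links."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web: a, b and d all link to c, and c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
top_page = max(ranks, key=ranks.get)
```

Each link acts as a vote cast by whoever controls the linking page, which is why link farms are the obvious way to try to spoof the system.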
Search still isn’t solved, and there are big issues with
PageRank. If you think about query composition, about 25% is
informational, 40% is navigational (people are still using the
search box as a steering wheel to get around), and transactional
queries are less than 35% of all search queries, but
transactional is the most valuable, Bradley observed. It’s the
transactional ones that aren’t well served by today’s technology
as they’re subjective – do you know a reputable plumber in London,
what political blogs are good? – and ironically they’re the most
valuable class of queries.
"Social search is about democratizing the process and
saying why can’t regular people contribute to that voting
process?"
- Bradley Horowitz
PageRank is conferring upon webmasters the privilege of
deciding what’s right for all of us and whether you or I do a
query on IBM, we all get the votes that the webmasters have cast
by proxy for us and we get the same results. Social search is
about democratizing the process and saying why can’t regular
people contribute to that voting process, and further, where you
can ask people whom you trust what they recommend?
Social search is about democratizing the voting process and
taking it away from webmasters. Their new service Yahoo Answers
allows you to get at information that doesn’t yet exist online.
The premise behind Answers is that you can ask [...]. It relies on
community moderation and is a community marketplace.
The product actually came out of Korea, which has a small but
highly wired population. A couple of years ago the problem there
was that there were very few articles in Korean on the web. So
Naver created a platform called KnowledgeIn and now they are the
dominant player and Yahoo! and Google are nowhere. They did it
because they were able to establish this knowledge marketplace and
capture knowledge and mindshare. Yahoo! copied that idea and
launched in Taiwan and the USA where it’s been very successful.
They introduced Yahoo Answers gradually – through the Flickr
audience initially – and began to snowball it, and now they have
search integration so it percolates through search queries.
"Delicious unearths buzz and zeitgeist and Yahoo! can
use that as a tool to improve navigation and discoverability
across all our products"
- Bradley Horowitz
Then he turned to social bookmarking site Delicious, which
sort of represents the top of the pyramid, he observed. They
have a small decentralized audience. It unearths buzz and
zeitgeist and Yahoo! can use that as a tool to improve
navigation and discoverability across all their products.
Another Yahoo! product is My Web
which allows you to search, for instance, for London hotels and
see which ones my friends stayed in, in addition to the millions
of organic results, so it percolates the subjective into the
search process.
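
One simple way to picture this blending of friends' choices with organic results is a re-ranking step, sketched below. How My Web actually does it was not described in the talk, so the scoring rule, the `boost` parameter, the function name and the sample data are all assumptions for illustration.

```python
def social_rerank(organic_results, saved_by, friends, boost=0.5):
    """Lift results that the searcher's friends have saved (illustrative only).

    organic_results: list of (url, score) pairs from the ordinary ranking.
    saved_by: dict mapping url -> set of users who saved that page.
    friends: set of users the searcher trusts.
    """
    reranked = []
    for url, score in organic_results:
        endorsers = saved_by.get(url, set()) & friends
        # Each endorsing friend adds a fixed, hypothetical boost to the score.
        reranked.append((url, score + boost * len(endorsers), sorted(endorsers)))
    return sorted(reranked, key=lambda result: result[1], reverse=True)


# Hypothetical search for London hotels.
organic = [
    ("hotel-a.example", 0.9),
    ("hotel-b.example", 0.8),
    ("hotel-c.example", 0.7),
]
saved_by = {"hotel-c.example": {"alice", "bob"}, "hotel-b.example": {"carol"}}
friends = {"alice", "bob"}
results = social_rerank(organic, saved_by, friends)
ordered_urls = [url for url, _, _ in results]
```

The hotel saved by two friends moves to the top and carries their names with it, while a page saved only by a stranger keeps its organic position.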
With time running out he took two questions from the audience,
the first of which was: why is it taking Yahoo so long to
monetise user-generated content? With regard to Flickr, ads are
displayed on the pages of all non-Pro account holders and
Bradley explained that in terms of the printing process they
don’t have a UK or any international printers external to the
USA yet, which is largely a function of the
non-internationalised third-party suppliers they use.
BT Group Director of Web Services Sam Sethi asked, given how well
they’d integrated microformats, when was Yahoo! going to buy
Technorati?
Bradley said they had no plans to but he was very supportive of
their work in the area of microformats, and that geo-tagging is
also something that would be good within Flickr and something
else to look forward to.
Content 2.0 - 2006 conference Website:
http://www.content2point0.com/2006/
About Bradley Horowitz:
Bradley Horowitz, Vice President of Product Strategy, is
responsible for leading Yahoo!’s efforts in building innovative
search technologies. Bradley’s expertise helps drive initiatives
that enable the company to provide comprehensive and compelling
offerings to customers. Previously he managed a portfolio of
products for Yahoo!, including media search, desktop search and
the Yahoo! Toolbar. Prior to joining Yahoo!, Bradley served as
both the chief technical officer and the vice president of
engineering for the Virage division of Autonomy, where he was
responsible for the technical delivery of five major product
lines. Prior to Autonomy, he founded Virage, the company widely
recognized as the market creator and leader for advanced media
indexing and analysis. He helped grow the company from “a garage
startup” through its NASDAQ IPO. Bradley was a PhD candidate at
the MIT Media Lab. While at the Media Lab, he worked on a number
of topics related to computer vision, graphics and image
processing, which resulted in a patented new technique for the
recovery of structure, motion and camera parameters from video
sequences. Bradley holds an MS in Media Science from MIT and a BS
in Computer Science from the University of Michigan. He blogs at
http://www.elatable.com/blog/
OTHER CONTENT 2.0 SESSION REPORTS
Content 2.0: Mesh Up - Connecting Content To
People
Content 2.0: Goodbye New Media Hello Social
Media
Content 2.0: Marketing 2.0 Forum
Content 2.0: Can Brands Be Trusted?
Content 2.0: Folksonomies - What Are They Good
For?
Content 2.0: Search & Enjoy Forum
Content 2.0: The Invisible Culture
Beers & Innovation (music special) @ Content
2.0