Magellan
Voyeur Data Visualization
Magellan, a search engine of yesteryear, offered a service called
Voyeur, which displayed the last 10 search queries. Brian Amento
of AT&T Labs archived this data in 10-minute intervals from
1997 to 2001. There are gaps in the data set from outages and changes
to the Voyeur service. However, these events are assumed to be random,
and thus have little impact on the distribution of search terms.
Furthermore, because the data spanned a four-year period, I combined
hourly data into yearly averages, which further helped to compensate
for gaps and noise.
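As a rough illustration of that averaging step, here is a minimal Python sketch. It is not the original processing code: the snapshot format (a timestamp plus a list of query strings), the function name `yearly_hourly_averages`, the whitespace tokenization, and the choice to average over the number of days that actually have data for a given hour are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import datetime


def yearly_hourly_averages(snapshots):
    """Average term counts per (year, hour of day).

    snapshots: iterable of (datetime, list of query strings), one entry
    per archived Voyeur snapshot (hypothetical input format).
    Returns {(year, hour): {term: average hits per observed day}}.
    """
    counts = defaultdict(Counter)  # (year, hour) -> total hits per term
    days = defaultdict(set)        # (year, hour) -> dates that have data

    for ts, queries in snapshots:
        key = (ts.year, ts.hour)
        days[key].add(ts.date())
        for query in queries:
            # Assumption: terms are lowercased, whitespace-separated words.
            for term in query.lower().split():
                counts[key][term] += 1

    # Dividing by observed days (not calendar days) means an outage
    # shrinks the sample instead of dragging the average toward zero.
    return {
        key: {term: n / len(days[key]) for term, n in term_counts.items()}
        for key, term_counts in counts.items()
    }


if __name__ == "__main__":
    sample = [
        (datetime(1997, 3, 1, 17, 0), ["chat rooms", "chat"]),
        (datetime(1997, 3, 2, 17, 10), ["chat"]),
    ]
    print(yearly_hourly_averages(sample)[(1997, 17)])
```

Averaging over observed days is one way to keep gaps from distorting the result, which is the property the yearly averages rely on.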
This data set is interesting for a few reasons. Foremost, it is
more than a decade old. People were searching for different things
back then, and it shows. Secondly, the data spans a multi-year period,
which helps bring out overarching trends. Lastly, and perhaps most
importantly, Magellan was used to search for a variety of content
by a diverse user group (including people at work, unlike the AOL
data set).
Notes:
The innermost ring is the average for 1997. Rings then work outward
one year at a time until 2000. 2001 was not included because only
a fraction of the year was collected. Font size scales linearly
with the number of times the term appeared in that hour
(e.g. 100 hits = Courier size 100). Time is EST.
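The ring and font-size rules in these notes can be sketched in a few lines of Python. The function names and the `points_per_hit` parameter are hypothetical. Only the 1997 to 2000 ring order and the "100 hits = size 100" example come from the notes above.

```python
FIRST_YEAR = 1997  # innermost ring
LAST_YEAR = 2000   # outermost ring (2001 is excluded as incomplete)


def ring_index(year):
    """Return 0 for the innermost ring (1997) up to 3 for 2000."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"no ring for year {year}")
    return year - FIRST_YEAR


def font_size(hits, points_per_hit=1.0):
    """Linear mapping from hit count to font size.

    With the default scale of 1.0, 100 hits map to size 100,
    matching the example in the notes.
    """
    return hits * points_per_hit


if __name__ == "__main__":
    print(ring_index(1998), font_size(100))
```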
Interesting Trends:
I could explain every trend for you, but half the fun is exploring
the data! For those who are lazy, here are some major (and obvious)
trends to get you started:
Overall
- There appears to be a dramatic shift away from chat and towards
information retrieval between 1997 and 2000.
- People are diurnal - search activity dies down at night and
picks up again as people get up for work.
1997
- It is clear that chat is most prevalent when people are home
(evening). You can see chat frequency starting to grow around
11am, dominating by 5pm, and tapering off around 1am. It is supplanted
by sex around 5am.
- It seems people are curious about adult topics throughout the
day. You can see sex jump in frequency around 11pm, reaching a
climax around 2am (no pun intended) and dying down to nominal
levels by 5am. However, since everyone is in bed, it clings to
the top spot until pictures jumps to life, snatching the top spot
as people roll out of bed.
- Secondary terms are interesting as well. Entertainment-oriented
terms are popular in the afternoon and evening. University and
software make their main appearance during the work day (8am-5pm).
Warez makes it into the top five from 5am-7am thanks to late-night
pirates and people who can’t get to sleep.
1998
- Chat and pictures vie for the top spot starting around 5pm,
continuing until 2am. However, mp3s (and download) make a strong
appearance, especially at night.
1999 & 2000
- These two years are similar, and so I've grouped them for brevity.
The data shows chat, mp3s and porn begin to lose out to information,
which dominates around the clock. MP3s remain popular in 1999.
By 2000, e-commerce has matured; people are increasingly searching
for things to buy.
Special note: this page was slashdotted
on March 2nd, 2007.