

I was curious about how people used the internet. Specifically,
I wanted to see how internet behavior changed over the course of
a day. Search engines are the gateway to the internet for most people,
and so search queries provide insight into what people are doing
and thinking. I had several assumptions before I started:
- Overall, internet usage is highest during the day, tapers off
at night, and reaches a lull in the early morning hours.
- People search for information during the workday (8-6ish).
- People socialize or look for information of personal interest
when they get home from work (6ish to midnight).
- People look for entertainment (often of the sexual variety)
late at night and into the wee hours (midnight to 6am).
I was curious to see if data from search engines would support
my anecdotal observations. I built a simple clock-like visualization
that displays the top search terms over a 24-hour period. Displaying
search terms in a cyclical layout (like a clock) allows continuous
examination of trends that would otherwise be broken up. The data
I had access to was both large and noisy. In response, I combined
hourly data into weekly or yearly averages. All search strings were
broken up into single words (periods, commas, and similar punctuation were
treated as whitespace). This helped pool frequent terms and better
illuminate search motivation (e.g. “information about taxes”
and “information about chinchillas” counted as two hits
for "information"). The top five search terms were shown
for each hour, sized to reflect their relative frequency (larger
= more popular). A list of stop words was developed to eliminate
uninteresting terms (e.g. that, for, an, not, free). Beyond this aggregation
and filtering, I have not modified the data in any way: you see it as it is.
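For readers who want to try something similar, the processing described above can be sketched roughly as follows. This is not the code behind the visualization, only a minimal illustration under stated assumptions: the function names, the (hour, query) input format, and the lowercasing step are all hypothetical, the stop-word set contains only the examples named above, and pooling every query by hour of day stands in for the weekly or yearly averaging.

```python
import re
from collections import Counter, defaultdict

# Only the examples named in the text; the full stop-word list is not given.
STOP_WORDS = {"that", "for", "an", "not", "free"}


def tokenize(query):
    """Split a query into single words, treating punctuation like whitespace."""
    # Lowercasing is an assumption; the text does not say how case was handled.
    return re.findall(r"\w+", query.lower())


def top_terms_by_hour(records, n=5):
    """Return {hour: [(word, share), ...]} for the n most frequent words.

    records is an iterable of (hour, query) pairs with hour in 0..23.
    Queries from many days are pooled by hour of day, a simple stand-in
    for averaging. share is each word's fraction of the hits among that
    hour's top n, which could drive the font size in a display.
    """
    counts = defaultdict(Counter)
    for hour, query in records:
        for word in tokenize(query):
            if word not in STOP_WORDS:
                counts[hour][word] += 1

    result = {}
    for hour, counter in counts.items():
        top = counter.most_common(n)
        total = sum(count for _, count in top)
        result[hour] = [(word, count / total) for word, count in top]
    return result


if __name__ == "__main__":
    sample = [
        (9, "information about taxes"),
        (9, "information about chinchillas"),
        (23, "free games for kids"),
    ]
    for hour, terms in sorted(top_terms_by_hour(sample).items()):
        print(hour, terms)
```

In this sketch the two sample queries at hour 9 each contribute one hit to "information", matching the pooling example above, while "free" and "for" are dropped as stop words.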
Some might be wondering if international users in different time
zones affected the search distribution. They probably did. However,
my guess is that most users were based in North America (especially
for Magellan in the late 90s and AOL in general). The data seems
to support this as well, with search activity slowing down at night
(western hemisphere time).
I ran the visualization with two unique data sets:
Special note: this page was slashdotted
on March 2nd, 2007.