Privacy Leapfrog: Keeping Internet Search Data

December 23, 2008

Last week, Yahoo launched the latest salvo in a public battle over privacy when it announced it will purge its internet search logs within 90 days.

The announcement came after pressure from European regulators prompted Google in September to cut the time it takes to make internet search data anonymous from 18 months to 9 months. Google said 9 months was a good balance between “sometimes conflicting factors like privacy, security and innovation.”

The tit for tat over who could best protect consumer privacy began earlier this year. In March, Google announced it would “anonymize” the logs after 18 months. In July, Yahoo said it would put in place a 13-month purge policy. In September, Google cut the time to 9 months.

What’s the big deal? Why keep the data at all?

In a 2007 blog post for users, Google said it keeps the data to:

  • Improve the search algorithm to provide better searches;
  • Defend its systems from malicious access and exploitation;
  • Fight fraud against advertisers;
  • Respond to valid legal orders from law enforcement as they investigate and prosecute crimes; and
  • Comply with the legal obligations of data retention.

In order to find new search techniques and evaluate whether users find them useful, Google says it has to store and analyze internet search logs. It offered this detailed explanation to the European Union.

“What do people click on? How does their behavior change when we change aspects of our algorithm,” Google says in a blog post to users. “Using data in the logs we can compare how well we’re doing now at finding useful information for you to how we did a year ago. If we don’t keep a history, we have no good way to evaluate our progress and make improvements.”

Yahoo also cited protecting users and its business partners from fraud.

Both search engines say that keeping the data also helps them send the most appropriate advertising your way.

While each of the major search engines has a policy in place to partially or completely delete the IDs from its search logs after somewhere between 90 days and 9 months, what’s deleted, what remains and how it’s used differ.

Currently, Microsoft deletes the entire IP address while Google and Yahoo delete only the last eight bits. Because IP addresses are assigned by internet service providers, even a truncated address can identify the country, region and in some cases the city of a computer, as well as its internet service provider.
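To make “the last eight bits” concrete, here is a minimal Python sketch of that kind of truncation. It is an illustration only, not the search engines’ actual code: the function names are hypothetical, the address is a reserved documentation example, and it assumes a 32-bit IPv4 address.

```python
import ipaddress


def truncate_last_eight_bits(address: str) -> str:
    """Zero out the final 8 bits of an IPv4 address (hypothetical helper)."""
    ip = ipaddress.IPv4Address(address)
    # Keep the first 24 bits and clear the last 8.
    masked = int(ip) & 0xFFFFFF00
    return str(ipaddress.IPv4Address(masked))


def candidate_count(truncated: str) -> int:
    """Count how many full addresses share the same truncated form."""
    network = ipaddress.IPv4Network(f"{truncated}/24")
    return network.num_addresses


# 203.0.113.57 is from a range reserved for documentation examples.
truncated = truncate_last_eight_bits("203.0.113.57")
print(truncated)                   # 203.0.113.0
print(candidate_count(truncated))  # 256
```

In this sketch the truncated address still narrows the original down to a block of 256 addresses, and the first 24 bits that remain are the part that points to a provider and a rough location.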

Privacy advocates say deleting the last eight bits of information is inadequate.

At present, there is no agreement over what data should be erased and how these policies affect consumers. Though they delete all or part of an IP address, some search engines keep other data that can be combined to potentially identify users.