GooglePrivacy
Search Engine Privacy Tool

paper | poster



Abstract

This project addresses the problem of privacy on the internet, and in particular the privacy risk created by Google’s dominance of the search engine field. Whenever a user submits a query to Google’s search engine, Google stores all the data it receives: the cookie ID, IP address, time and date of the search, the search terms, and the browser configuration. Because Google also provides numerous other services, users hand it sensitive data such as their phone contacts, street addresses, news preferences, dynamic web content, images, videos, blogs, library preferences, and job-related activities. Since Google retains all this data indefinitely, an individual’s privacy is at serious risk if the data falls into the wrong hands.

The system described in this document substantially reduces the privacy risk an individual faces when using the Google search engine. The GooglePrivacy system lets users draw on the full power of the Google search engine without giving up personal data. It does so primarily by routing user queries through a proxy server specially configured to remove all sensitive user data.

Every query entered into the GooglePrivacy web page is sent to the GooglePrivacy server instead of directly to the Google server. The GooglePrivacy server removes all sensitive user information and forwards the query to the Google server using its own IP address and settings. The search results page is then filtered to remove all active content placed by Google and returned to the user. Additionally, all links on the results page are rewritten so that all further queries are routed through the GooglePrivacy server instead of going directly to the Google server.
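The three steps above (removing identifying request data, filtering active content, and rewriting links) can be illustrated with a minimal Python sketch. It is not the GooglePrivacy implementation: the header list, the proxy address `https://proxy.example/fetch?url=`, and the function names are assumptions made for illustration, and the regular expressions only handle simple, well-formed HTML.

```python
import re
from urllib.parse import quote, urljoin

# Assumption: headers treated as identifying; the real system's list is not given.
SENSITIVE_HEADERS = {"cookie", "referer", "user-agent",
                     "x-forwarded-for", "accept-language"}

# Hypothetical proxy endpoint and upstream base URL, for illustration only.
PROXY_PREFIX = "https://proxy.example/fetch?url="
UPSTREAM_BASE = "https://www.google.com/"


def strip_sensitive_headers(headers):
    """Return a copy of the request headers without identifying fields."""
    return {name: value for name, value in headers.items()
            if name.lower() not in SENSITIVE_HEADERS}


def remove_active_content(html):
    """Drop <script> elements and inline event handlers such as onclick."""
    html = re.sub(r"<script\b.*?</script\s*>", "", html,
                  flags=re.IGNORECASE | re.DOTALL)
    html = re.sub(r"""\son\w+\s*=\s*("[^"]*"|'[^']*')""", "", html,
                  flags=re.IGNORECASE)
    return html


def rewrite_links(html):
    """Point every double-quoted href back at the proxy."""
    def to_proxy(match):
        absolute = urljoin(UPSTREAM_BASE, match.group(2))
        return match.group(1) + PROXY_PREFIX + quote(absolute, safe="") + match.group(3)
    return re.sub(r'(href=")([^"]*)(")', to_proxy, html, flags=re.IGNORECASE)


def filter_results_page(html):
    """Apply both response-side steps in order."""
    return rewrite_links(remove_active_content(html))
```

For example, `strip_sensitive_headers` applied to a request carrying `Cookie`, `User-Agent`, and `Accept` headers keeps only `Accept`, and `filter_results_page` turns a relative result link such as `/search?q=next` into a link that leads back to the proxy. A production filter would use a real HTML parser rather than regular expressions.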

As a result, Google records the GooglePrivacy server’s actions rather than the user’s actions. Since the GooglePrivacy server is shared by multiple users making a wide range of queries, GooglePrivacy prevents Google from mining data on a single individual. The random query feature makes the task of mining individual data even more complex. GooglePrivacy thus achieves its goal of reducing the risk that Google poses to user privacy.
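The abstract does not describe how the random query feature works. One plausible reading, sketched below purely as an assumption, is that the proxy mixes each real query with a few randomly chosen decoy queries before contacting the search engine, so that the real query is harder to single out. The decoy pool `DECOY_TERMS` and the function `with_decoys` are hypothetical names.

```python
import random

# Hypothetical pool of decoy search terms; a real pool would be far larger.
DECOY_TERMS = ["weather", "news", "recipes", "maps", "movies"]


def with_decoys(user_query, n_decoys, rng):
    """Return the user's query mixed with n_decoys random queries, shuffled."""
    batch = [user_query] + [rng.choice(DECOY_TERMS) for _ in range(n_decoys)]
    rng.shuffle(batch)  # hide the position of the real query in the batch
    return batch
```

Calling `with_decoys("privacy tools", 3, random.Random())` yields four queries in random order, exactly one of which is the user's; only the result for that query would be returned to the user.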




Author: Ahsan A. Siddiqui
Faculty Advisor: Jonathan M. Smith