Traffic analysis is the first prerequisite for mass surveillance of the Web

George Orwell’s depiction of dystopia in his classic 1984, a society devoid of privacy, may have seemed like an exaggeration in 1949. But, with technology intruding deep into our lives today, we may actually be heading into a less obvious version of a similar state.

Jacob Appelbaum, a prolific hacker and close associate of WikiLeaks founder Julian Assange, painted a grim picture of the future of the Internet when he was in Bangalore this week. He and the Cypherpunks, the group of international hackers he is part of, advocate the use of cryptography to counter surveillance and censorship on the Internet.

Traffic analysis

“Censorship is the byproduct of surveillance,” wrote Mr. Appelbaum in the book Cypherpunks: Freedom and the Future of the Internet, which he has co-authored with Mr. Assange. For mass surveillance of the Internet, the first prerequisite is ‘traffic analysis’: the systematic logging of user activity in order to build profiles of users.

Features such as targeted advertisements are apparently more benign manifestations of traffic analysis.

Gmail and Facebook, for instance, analyse user traffic and activity and deliver targeted ads, which is a boon to commerce on the Internet. But the method adopted raises concerns about privacy, and in many cases can be considered an intrusion. For instance, when Google displays ads about pizza parlours after you have read a mail from a friend mentioning the word “pizza”, it makes one wonder what else Google might know about its users.

Google and Facebook are able to identify and profile users because the users are logged into their services, have voluntarily identified themselves and have signed away their rights. This makes it easy for these Internet giants to log user activity, attribute it to the users, run social graph programs and build a comprehensive profile of each user. While Google and Facebook require users to be logged into their services to profile them, it is also possible to monitor users simply by analysing the traffic emanating from their Internet Protocol (IP) addresses. This is commonly known as traffic analysis and is the first prerequisite for surveillance.

IP addresses are numbers which can be attributed to people using the Internet; think of IP addresses in the Internet as vehicle registration numbers in the real world. When the IP addresses are monitored for activity, they reveal information about users. Concealing IP addresses is thus the first level of anonymity for users.
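
As a rough illustration of how little it takes to turn logged traffic into per-user profiles, here is a minimal Python sketch. The log records, the site names and the function name build_profiles are all hypothetical, and the IP addresses come from ranges reserved for documentation; real traffic analysis works on far richer data.

```python
from collections import defaultdict

# Hypothetical traffic log: (source IP address, site visited).
# The addresses are from ranges reserved for documentation.
LOG = [
    ("203.0.113.7", "news.example"),
    ("198.51.100.4", "mail.example"),
    ("203.0.113.7", "pizza.example"),
    ("203.0.113.7", "news.example"),
]


def build_profiles(log):
    """Group logged visits by IP address and count visits per site."""
    profiles = defaultdict(lambda: defaultdict(int))
    for ip, site in log:
        profiles[ip][site] += 1
    # Convert the nested defaultdicts to plain dicts for easy inspection.
    return {ip: dict(sites) for ip, sites in profiles.items()}


profiles = build_profiles(LOG)
for ip, sites in profiles.items():
    print(ip, sites)
```

No login is needed for this kind of profiling: the IP address alone ties the visits together.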

Data retention

Browsing the Web without the necessary precautions, for instance over unencrypted connections, is akin to sending a postcard by mail instead of using an envelope. The data on unencrypted links can be read while it is travelling from source to destination, with very little technical effort.

The solution is to encrypt data before transmission using strong encryption algorithms, and then decrypt it on reception. Encrypted data, if intercepted at one of the many hop points, is far harder to decipher.
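
The encrypt-then-decrypt idea can be sketched in a few lines of Python. This is a toy illustration only: it uses a one-time pad (a random key as long as the message, combined with XOR), not the algorithms real browsers use, and the function names are made up for the example.

```python
import secrets


def encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """Toy one-time pad: XOR the message with a random key of equal length."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same key restores the original message."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))


message = b"Meet me at the pizza parlour"
key, ciphertext = encrypt(message)
recovered = decrypt(key, ciphertext)
print(ciphertext.hex())  # what an eavesdropper on the link would see
print(recovered)
```

Anyone intercepting the ciphertext without the key sees only random-looking bytes, while the intended recipient, who holds the key, recovers the message.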

Concealing one’s IP address and securing the data sent on these transparent links can help users stay anonymous and also circumvent censorship in most cases. Websites such as can reveal the geographic location of the user. Based on their IP addresses, users’ traffic is subject to different types of censorship.

The Onion Router (TOR) network is one of the most widely used solutions. The TOR browser bundle takes users on to a highly anonymous network that protects their identities. TOR is a free software project, and the network itself is run by volunteers in a peer-to-peer model. The nodes that ‘route’ traffic into, through and out of the TOR network are called TOR relays.

So, how does TOR help?

Let us say that a user in Iran connects to the Internet directly, and the Iranian Internet filters block access to many websites because the IP address is known to originate from Iran. But when the user accesses the Web using TOR, the user’s traffic is routed through the peer-to-peer network of relays instead of travelling directly over the Internet service provider’s network. The traffic might leave the network through a TOR exit node in Germany. Since these filters do not operate in Germany, the user can access the blocked content from there. Also, if the traffic is sniffed between one relay point and the next, it will not reveal user information or the data, because it is encrypted.
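
The layering that gives onion routing its name can be sketched as follows. This is a simplified Python illustration, not TOR’s actual protocol: the cipher is a toy built from SHA-256 and is not secure, and the relay names, keys, destination and function names are all hypothetical. The sender wraps the message in one layer per relay; each relay peels off only its own layer and learns only the next hop.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher (NOT secure): XOR data with a SHA-256-derived keystream.
    Applying it twice with the same key restores the original data."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def wrap(message: bytes, destination: str, route) -> bytes:
    """Build the onion from the inside out: the innermost layer is for the exit relay."""
    next_hop, packet = destination, message
    for name, key in reversed(route):
        packet = keystream_xor(key, next_hop.encode() + b"|" + packet)
        next_hop = name
    return packet


def peel(key: bytes, packet: bytes):
    """A relay removes its own layer and learns only where to forward the rest."""
    plain = keystream_xor(key, packet)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop.decode(), inner


# Hypothetical three-relay route; in this sketch the sender shares one key with each relay.
ROUTE = [("entry", b"key-entry"), ("middle", b"key-middle"), ("exit", b"key-exit")]

packet = wrap(b"GET /blocked-page", "site.example", ROUTE)
hops = []
for name, key in ROUTE:
    next_hop, packet = peel(key, packet)
    hops.append(next_hop)
    print(f"{name} relay forwards to {next_hop}")
```

In this sketch only the entry relay sees who sent the packet and only the exit relay sees the destination and the request, which mirrors the description above.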

TOR serves as an antidote to censorship and also protects the identity of the user, as the traffic takes multiple hops and the chance of revealing user information exists only at the entry and exit nodes.

This reduces the risk of exposing the identity of the user and is a way to subvert censorship.