From 08604f718b790cf715e414c2fa4f28bdb44e65eb Mon Sep 17 00:00:00 2001
From: Nick Thomas <nick@lupine.me.uk>
Date: Fri, 2 Aug 2013 20:17:12 +0100
Subject: [PATCH] Add some information to the README.

---
 README.md | 147 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 147 insertions(+)

diff --git a/README.md b/README.md
index 55de604..9ca2141 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,149 @@
 hide-eid
 ========
+
+A suite of tools for hiding Endpoint IDs (IPv4/IPv6 addresses) from intermediate
+participants in the Internet.
+
+Overview
+--------
+
+As the Locator/ID Separation Protocol (LISP) people have noted, IPv4 and IPv6
+both use IP addresses (Endpoint IDs, in the lingo) to make routing decisions.
+Focusing on this from the point of view of routing table efficiency / features,
+they note that this is sub-optimal, and that intermediate routers do not need
+the EID to make routing decisions, if an alternative (Routing Locator, or RLOC)
+is present in the packet.
+
+What seems to have gone unnoticed is that there are privacy implications here.
+As recent PRISM, XKeyscore and other disclosures have shown, these intermediates
+are complicit in, or at least vulnerable to, attacks by national security
+agencies, who take advantage of the visibility of these EIDs to construct
+comprehensive logs of who is talking to who; even if the content of the
+communication is encrypted, the simple fact that an individual has communicated
+something somewhere may be enough for these agencies to justify taking further
+action against them.
+
+Since the EID is not needed by these intermediaries, it makes sense to stop
+giving it to them as quickly as is possible. These tools aim to implement a
+simple scheme that achieves this goal while prioritising speed, ease of
+implementation, and scaling to ~40Gbps of traffic (for small to medium ISPs).
+
+Removing knowledge of the EID from them means that any affected traffic enjoys
+the status of being in an anonymity set that is as large as the number of people
+who share the same RLOC. In this scheme, I assume that this is an access ISP, on
+one end of the path; and a hosting ISP, on the other end. This provides typical
+anonymity sets of between a few hundred and a few million individuals.
+
+To remove these EIDs, we need to start by creating an EID-to-RLOC map. A first
+pass for this is an /etc/hosts equivalent; a second pass could be a DNS zone
+(like ip6.arpa); and a third pass might be using BGP transitive attributes or a
+proper LISP system. This allows access and hosting ISPs to discover which RLOC
+they should use for any given destination EID.
+
+The registry (the EID-to-RLOC map) also contains a public key that is used to
+encrypt the sensitive parts of any passed packet. If the protocol being run over
+IP is already encrypted (HTTPS or SSL SMTP, for instance), this may just be the
+IP+TCP/UDP header; otherwise, it may be the whole packet.
+
+When the access ISP receives a packet from its subscriber with a destination IP
+that is present in this registry, it encrypts the relevant portion of the packet
+with the public key, then encapsulates the packet in an IP header of its own.
+This IP header has the RLOC for the wrapping ISP as the source IP, and the RLOC
+obtained from the registry as the destination IP. The wrapped packet is then
+forwarded for routing to the destination.
+
+Since the RLOC is just an IP address, and one controlled by the destination ISP
+at that, the route the wrapped packet takes through the internet will be about
+the same as if the packet had never been wrapped. This is a large advantage of
+the scheme over onion routing, such as Tor; no significant latency is added.
+
+When the destination ISP receives the wrapped packet, it can use its private
+key to decrypt the encapsulated packet, and send it on to its final hop.
+
+
+LIMITATIONS
+-----------
+You have to trust two ISPs.
+
+Certainly for access ISPs, even with the best will in the world, the
+infrastructure between them and their layer 1/2 service providers may be bugged. 
+This is not protected against by this scheme; if you suspect this is happening
+to your ISP without their knowledge, you can run IPsec over the link and allow
+them to terminate it just before the box that wraps the packets. If you suspect
+it is happening with their knowledge, the best you can do is change ISP. If we
+run out of good ISPs, this scheme adds nothing.
+
+If the other side of the link is complicit, this scheme does nothing. It isn't
+going to stop Facebook from handing all their records of your accesses to them
+over to the NSA. Stop using Facebook. 
+
+Public-key encryption is relatively slow compared to block ciphers; making this
+scale is going to be a challenge. Hopefully not impossible - but if it's too
+expensive, uptake will be low or zero. If it's too unreliable, uptake will be
+low or zero.
+
+The scheme may break ICMP and other responses from intermediate ISPs. With a
+naive implementation, path MTU discovery breaks, for instance, as does ICMP
+tracerouting (this can be fixed, especially in IPv6 - see _ICMP_).
+
+
+
+SELLING POINTS
+--------------
+Uptake can be low (but not zero) and significant benefits are still seen. Even
+if just two ISPs take up the scheme, one access and one hosting, everyone
+who uses the access ISP is now anonymous for any of their traffic that goes to
+the hosting ISP. Privacy-conscious individuals can take note of that and move
+to those ISPs, or tunnel their traffic to them, to regain their anonymity.
+
+Faster than Tor. Especially in the latency stakes.
+
+Requires no CPE changes. Needing them killed IPv6 uptake for a decade - end
+users are not easy to upgrade. LISP schemes typically require the holder of
+the EID to be in charge of looking up and using RLOCs; this scheme does not.
+
+Probably stateless. Putting the encrypted EIDs into the packet we send means
+that the source and destination ISPs don't need to perform connection tracking.
+This isn't NAT in the traditional sense. 
+
+Since both source and destination enjoy a large anonymity set, this scheme is
+resilient to correlation attacks. An earlier revision only encrypted the
+source EID, which was vulnerable to trivial attacks of that nature.
+
+
+ICMP
+----
+As noted, a naive implementation breaks ICMP responses by intermediaries. This
+is a result of the design; as they no longer know the EID, they can't send an
+ICMP response of any sort to it. As IPv6, in particular, relies on ICMP for 
+protocol features such as path MTU discovery, this is something of a problem.
+
+One solution to this is to have a wide range of RLOC IP addresses, and use them
+in a round-robin manner to maintain a local map of RLOC -> real source IP, which
+would be retained for a short period (some seconds, say). If an ICMP reply is
+received, directed to one of these RLOCs, the EID can be looked up from the map
+and the packet can be rewritten with it and forwarded appropriately. Of course,
+this is state for the ISP to keep track of, and monopolises a segment of the IP
+address space. That last is not a problem in IPv6, but will prevent its use in 
+almost all IPv4 deployments. Fortunately, IPv4 is legacy, and doesn't depend on
+ICMP as strictly as IPv6 does.
+
+
+WHY?
+----
+It's my position that anonymity is only necessary in the presence of oppression.
+In the absence of oppression, anonymity primarily facilitates crime and
+wrongdoing. When oppression is present, it continues to do that, but also
+provides a means of escaping oppression, and defeating oppressors.
+
+Are we in an oppressive society? Do oppressive societies exist? I believe the
+answer to both of those questions is yes. I wish it were otherwise.
+
+
+AUTHOR
+------
+Name   : Nick Thomas
+Handle : lupine
+Web    : lupine.me.uk
+SMTP   : nick@lupine.me.uk
+XMPP   : nick@lupine.me.uk