Add some information to the README.

Nick Thomas
2013-08-02 20:17:12 +01:00
parent edb558a88d
commit 08604f718b

README.md

@@ -1,2 +1,149 @@
hide-eid
========
A suite of tools for hiding Endpoint IDs (IPv4/IPv6 addresses) from intermediate
participants in the Internet.
Overview
--------
As the Locator/ID Separation Protocol (LISP) people have noted, IPv4 and IPv6
both use IP addresses (Endpoint IDs, or EIDs, in the lingo) to make routing
decisions. Focusing on this from the point of view of routing table efficiency
and features, they note that this is sub-optimal, and that intermediate routers
do not need the EID to make routing decisions if an alternative (a routing
locator, or RLOC) is present in the packet.
What seems to have gone unnoticed is that there are privacy implications here.
As recent PRISM, XKeyscore and other disclosures have shown, these intermediates
are complicit in, or at least vulnerable to, attacks by national security
agencies, who take advantage of the visibility of these EIDs to construct
comprehensive logs of who is talking to whom; even if the content of the
communication is encrypted, the simple fact that an individual has communicated
something somewhere may be enough for these agencies to justify taking further
action against them.
Since the EID is not needed by these intermediaries, it makes sense to stop
giving it to them as quickly as is possible. These tools aim to implement a
simple scheme that achieves this goal, prioritising speed, ease of
implementation, and the ability to scale to traffic levels of ~40Gbps (for
small to medium ISPs).
Removing knowledge of the EID from them means that any affected traffic enjoys
the status of being in an anonymity set that is as large as the number of people
who share the same RLOC. In this scheme, I assume that the RLOC belongs to an
access ISP on one end of the path, and to a hosting ISP on the other end. This
provides typical anonymity sets of between a few hundred and a few million
individuals.
To remove these EIDs, we need to start by creating an EID-to-RLOC map (the
registry). A first pass for this is an /etc/hosts equivalent; a second pass
could be a DNS zone (like ip6.arpa); and a third pass might be using BGP
transitive attributes or a proper LISP system. This allows access and hosting
ISPs to discover which RLOC they should use for any given destination EID.
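
As a rough sketch of the first pass, the registry could be a hosts-style text
file that is consulted with a longest-prefix match. The file format, the
example prefixes (all from documentation ranges) and the names `parse_registry`
and `lookup_rloc` are assumptions made for illustration; none of them is
defined by these tools. Public keys are left out here to keep the lookup on its
own.

```python
import ipaddress

# Hypothetical registry format (an assumption, not a defined file format):
#   <EID prefix> <RLOC address>   # optional comment
EXAMPLE_REGISTRY = """
# EID prefix        RLOC
2001:db8:1::/48     2001:db8:ffff::1
2001:db8:1:2::/64   2001:db8:ffff::2
192.0.2.0/24        198.51.100.1
"""


def parse_registry(text):
    """Parse hosts-style lines into a list of (network, rloc) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        prefix, rloc = line.split()
        entries.append((ipaddress.ip_network(prefix), ipaddress.ip_address(rloc)))
    return entries


def lookup_rloc(entries, eid):
    """Return the RLOC for the longest matching EID prefix, or None."""
    addr = ipaddress.ip_address(eid)
    best = None
    for network, rloc in entries:
        if addr.version != network.version:
            continue  # never compare IPv4 EIDs against IPv6 prefixes
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, rloc)
    return best[1] if best else None
```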
The registry also contains a public key that is to be used to encrypt the
sensitive parts of any packet passed through the scheme. If the protocol being
run over IP is already encrypted (HTTPS or SMTP over SSL/TLS, for instance),
this may just be the IP and TCP/UDP headers; otherwise, it may be the whole
packet.
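
One way to read the paragraph above is as a function that decides how many
leading bytes of a packet need encrypting. The sketch below handles IPv4 only
and assumes a well-formed packet; the name `sensitive_length` and the
`payload_is_encrypted` flag are hypothetical. How an ISP would actually know
that the payload is already encrypted is left open here.

```python
TCP, UDP = 6, 17  # IANA protocol numbers


def sensitive_length(packet, payload_is_encrypted):
    """Return how many leading bytes of an IPv4 packet should be encrypted."""
    if not payload_is_encrypted or len(packet) < 20:
        return len(packet)  # nothing else protects the payload: take it all
    if packet[0] >> 4 != 4:
        return len(packet)  # this sketch only understands IPv4 headers
    ip_header_len = (packet[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    protocol = packet[9]
    if protocol == TCP:
        # TCP data offset: upper four bits of byte 12, in 32-bit words
        transport_len = (packet[ip_header_len + 12] >> 4) * 4
    elif protocol == UDP:
        transport_len = 8  # UDP headers are always eight bytes
    else:
        return len(packet)  # unknown transport: fall back to the whole packet
    return min(len(packet), ip_header_len + transport_len)
```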
When the access ISP receives a packet from its subscriber with a destination IP
that is present in this registry, it encrypts the relevant portion of the packet
with the public key, then encapsulates the packet in an IP header of its own.
This IP header has the RLOC for the wrapping ISP as the source IP, and the RLOC
obtained from the registry as the destination IP. The wrapped packet is then
forwarded for routing to the destination.
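
The wrapping step might look something like the following, shown for an IPv6
outer header. This is a sketch under several assumptions: `wrap` is a
hypothetical name, `encrypt` is a caller-supplied stand-in for public-key
encryption with the key from the registry, the hop limit of 64 is arbitrary,
and protocol number 253 (reserved for experimentation) is used only because
the scheme has no assigned number.

```python
import ipaddress
import struct

EXPERIMENTAL_PROTO = 253  # reserved for experimentation; an assumption here


def wrap(inner_packet, source_rloc, dest_rloc, encrypt):
    """Encrypt inner_packet and prepend an outer IPv6 header.

    `encrypt` is any callable mapping bytes to bytes; it stands in for
    encryption with the destination ISP's public key.
    """
    payload = encrypt(inner_packet)
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a single IPv6 packet")
    header = struct.pack(
        "!IHBB16s16s",
        6 << 28,             # version 6, traffic class 0, flow label 0
        len(payload),        # payload length
        EXPERIMENTAL_PROTO,  # next header
        64,                  # hop limit
        ipaddress.IPv6Address(source_rloc).packed,  # wrapping ISP's RLOC
        ipaddress.IPv6Address(dest_rloc).packed,    # RLOC from the registry
    )
    return header + payload
```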
Since the RLOC is just an IP address, and one controlled by the destination ISP
at that, the route the wrapped packet takes through the Internet will be about
the same as if the packet had never been wrapped. This is a large advantage of
the scheme over onion routing, such as Tor; no significant latency is added.
When the destination ISP receives the wrapped packet, it can use its private
key to decrypt the encapsulated packet, and send that decrypted packet on to
its final hop.
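
The receiving side could be sketched as below, again assuming an IPv6 outer
header. `unwrap` is a hypothetical name and `decrypt` is a caller-supplied
stand-in for decryption with the destination ISP's private key. Note that the
function needs nothing beyond the packet itself and the key, so no
per-connection state is kept.

```python
import ipaddress
import struct

OUTER_HEADER = struct.Struct("!IHBB16s16s")  # fixed 40-byte IPv6 header


def unwrap(outer_packet, decrypt):
    """Strip the outer IPv6 header and decrypt the encapsulated packet.

    Returns (source_rloc, inner_packet). `decrypt` is any callable mapping
    bytes to bytes; it stands in for private-key decryption.
    """
    if len(outer_packet) < OUTER_HEADER.size:
        raise ValueError("truncated outer header")
    first_word, payload_len, _next_header, _hop_limit, src, _dst = (
        OUTER_HEADER.unpack_from(outer_packet)
    )
    if first_word >> 28 != 6:
        raise ValueError("outer header is not IPv6")
    payload = outer_packet[OUTER_HEADER.size:OUTER_HEADER.size + payload_len]
    if len(payload) != payload_len:
        raise ValueError("truncated payload")
    return ipaddress.IPv6Address(src), decrypt(payload)
```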
LIMITATIONS
-----------
You have to trust two ISPs. Certainly for access ISPs, even with the best will
in the world, the infrastructure between them and their layer 1/2 service
providers may be bugged. This is not protected against by this scheme; if you
suspect this is happening to your ISP without their knowledge, you can run IPsec
over the link and allow them to terminate it just before the box that wraps the
packets. If you suspect it is happening with their knowledge, the best you can
do is change ISP. If we run out of good ISPs, this scheme adds nothing.

If the other side of the link is complicit, this scheme does nothing. It isn't
going to stop Facebook from handing all their records of your accesses to them
over to the NSA. Stop using Facebook.

Public-key encryption is relatively slow compared to block ciphers; making this
scale is going to be a challenge. Hopefully not impossible - but if it's too
expensive, uptake will be low or zero. If it's too unreliable, uptake will be
low or zero.

May break ICMP and other responses from intermediate ISPs. Path MTU discovery
breaks, for instance, with a naive implementation of this scheme, as does
ICMP tracerouting (this can be fixed, especially in IPv6 - see _ICMP_).
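
To put a rough number on the scaling concern above, the per-packet time budget
at the ~40Gbps target can be worked out directly. The packet sizes are
assumptions chosen to bracket the range, and Ethernet framing overhead is
ignored, so treat the figures as order-of-magnitude only: about 78 million
packets per second (12.8 ns each) for 64-byte packets, and about 3.3 million
per second (300 ns each) for 1500-byte packets. That is the total time
available per packet across however many cores share the link.

```python
def packets_per_second(link_bps, packet_bytes):
    """Packets per second on a saturated link, ignoring framing overhead."""
    return link_bps / (packet_bytes * 8)


def per_packet_budget_ns(link_bps, packet_bytes):
    """Nanoseconds available to process each packet at line rate."""
    return 1e9 / packets_per_second(link_bps, packet_bytes)


LINK_BPS = 40e9  # the ~40Gbps target mentioned in the overview

for size in (64, 1500):  # assumed small and large packet sizes, in bytes
    pps = packets_per_second(LINK_BPS, size)
    budget = per_packet_budget_ns(LINK_BPS, size)
    print(f"{size:>5} B packets: {pps / 1e6:.2f} Mpps, {budget:.1f} ns each")
```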
SELLING POINTS
--------------
Uptake can be low (but not zero) and significant benefits are still seen. Even
if just two ISPs take up the scheme, one access and one hosting, everyone
who uses the access ISP is now anonymous for any of their traffic that goes to
the hosting ISP. Privacy-conscious individuals can take note of that and move
to those ISPs, or tunnel their traffic to them, to regain their anonymity.

Faster than Tor. Especially in the latency stakes.

Requires no CPE (customer-premises equipment) changes. Needing them killed IPv6
uptake for a decade - end users are not easy to upgrade. LISP schemes typically
require the holder of the EID to be in charge of looking up and using RLOCs;
this scheme does not need that.

Probably stateless. Putting the encrypted EIDs into the packet we send means
that the source and destination ISPs don't need to perform connection tracking.
This isn't NAT in the traditional sense.

Since both source and destination enjoy a large anonymity set, this scheme is
resilient to correlation attacks. An earlier revision only encrypted the
source EID, which was vulnerable to trivial attacks of that nature.
ICMP
----
As noted, a naive implementation breaks ICMP responses by intermediaries. This
is a result of the design; as they no longer know the EID, they can't send an
ICMP response of any sort to it. As IPv6, in particular, relies on ICMP for
protocol features such as path MTU discovery, this is something of a problem.
One solution to this is to have a wide range of RLOC IP addresses, and use them
in a round-robin manner to maintain a local map of RLOC -> real source IP, which
would be retained for a short period (some seconds, say). If an ICMP reply is
received, directed to one of these RLOCs, the EID can be looked up from the map
and the packet can be rewritten with it and forwarded appropriately. Of course,
this is state for the ISP to keep track of, and monopolises a segment of the IP
address space. That last is not a problem in IPv6, but will prevent its use in
almost all IPv4 deployments. Fortunately, IPv4 is legacy, and doesn't strictly
require ICMP in the way that IPv6 does.
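
The round-robin RLOC map described above could be sketched as follows. The
class name `RlocPool`, the five-second retention period and the injectable
`clock` are all assumptions for illustration. Entries are simply overwritten
when the pool wraps around, so the pool has to be large enough that an RLOC is
not reused while an ICMP reply to it may still arrive.

```python
import ipaddress
import time


class RlocPool:
    """Round-robin pool of source RLOCs with a short-lived RLOC -> EID map."""

    def __init__(self, prefix, ttl_seconds=5.0, clock=time.monotonic):
        self._network = ipaddress.ip_network(prefix)
        self._next = 0
        self._ttl = ttl_seconds
        self._clock = clock
        self._map = {}  # RLOC -> (EID, time the mapping was recorded)

    def assign(self, eid):
        """Pick the next RLOC for an outgoing packet and remember its EID."""
        rloc = self._network[self._next]
        self._next = (self._next + 1) % self._network.num_addresses
        self._map[rloc] = (ipaddress.ip_address(eid), self._clock())
        return rloc

    def resolve(self, rloc):
        """Return the EID behind `rloc`, or None if unknown or expired."""
        key = ipaddress.ip_address(rloc)
        entry = self._map.get(key)
        if entry is None:
            return None
        eid, recorded = entry
        if self._clock() - recorded > self._ttl:
            del self._map[key]  # expired: forget the mapping
            return None
        return eid
```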
WHY?
----
It's my position that anonymity is only necessary in the presence of oppression.
In the absence of oppression, anonymity primarily facilitates crime and
wrongdoing. When oppression is present, anonymity continues to do that, but
also provides a means of escaping oppression, and defeating oppressors.
Are we in an oppressive society? Do oppressive societies exist? I believe the
answer to both of those questions is yes. I wish it were otherwise.
AUTHOR
------
Name : Nick Thomas
Handle : lupine
Web : lupine.me.uk
SMTP : nick@lupine.me.uk
XMPP : nick@lupine.me.uk