hide-eid

A suite of tools for hiding Endpoint IDs (IPv4/IPv6 addresses) from intermediate participants in the Internet.

Overview

As the Locator/ID Separation Protocol (LISP) people have noted, IPv4 and IPv6 both use IP addresses (Endpoint IDs, or EIDs, in the lingo) to make routing decisions. Focusing on this from the point of view of routing table efficiency / features, they note that this is sub-optimal. Intermediate routers do not need the EID to make routing decisions if an alternative (a routing locator, or RLOC) is present in the packet. More information about this idea can be found in RFC 6830.

What seems to have gone unnoticed is that there are privacy implications here. Recent PRISM, XKeyscore and other disclosures have shown that intermediaries can be complicit in, or at least vulnerable to, attacks by national security agencies, among others, who take advantage of the visibility of these EIDs to construct logs of who is talking to whom. Even if the content of the message is encrypted, the simple fact that an individual has communicated something to an identifiable destination may be enough for these agencies to justify taking further action against them - targeted surveillance, for instance.

Since the EID is not needed by these intermediaries, it makes sense to stop giving it to them as quickly as is possible. These tools aim to implement a simple scheme that achieves this goal, with speed / ease of implementation, and the ability to scale to traffic levels of around 40Gbps (for small to medium ISPs), in mind.

If the source and destination EIDs are not sent in the clear, and the payload is encrypted, then the only identifying information intermediaries have is the source and destination RLOCs. For an HTTPS session between an individual and a website, this could be a few tens of thousands of internet subscribers on the one side, and a few thousand servers running websites on the other.

To remove these EIDs, we need to start by creating an EID-to-RLOC map. A first pass for this is an /etc/hosts equivalent; a second pass could be a DNS zone (like ip6.arpa); and a third pass might be using BGP transitive attributes or a proper LISP system. This allows access and hosting ISPs to discover which RLOC they should use for any given destination EID.

The registry also contains a public key that is to be used to encrypt the parts of any passed packet that are sensitive. Assuming the protocol being run over IP is encrypted (HTTPS or SSL SMTP, for instance), this may just be the IP+TCP/UDP header, or it may be the whole packet.

When the access ISP receives a packet from its subscriber with a destination IP that is present in this registry, it encrypts the relevant portion of the packet with the public key, then encapsulates the packet in an IP header of its own. This IP header has the RLOC for the wrapping ISP as the source IP, and the RLOC obtained from the registry as the destination IP. The wrapped packet is then forwarded for routing to the destination.
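A rough sketch of the wrapping step in Python, for illustration only (this is not the project's code). It assumes the sensitive portion has already been encrypted and only shows the outer IPv4 header being built around it. The function names, the TTL default and the second RLOC address are hypothetical; protocol number 99 is the value mentioned under Usage.

```python
import socket
import struct

PROTO_PRIVATE_ENCRYPTION = 99  # IANA: "any private encryption scheme"
IPV4_HEADER_LEN = 20           # bytes, no IP options


def ipv4_checksum(header: bytes) -> int:
    """Internet checksum: one's-complement sum of the 16-bit header words."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def wrap(encrypted_inner: bytes, src_rloc: str, dst_rloc: str, ttl: int = 64) -> bytes:
    """Prepend an outer IPv4 header that carries only RLOC addresses."""
    total_len = IPV4_HEADER_LEN + len(encrypted_inner)

    def build(checksum: int) -> bytes:
        return struct.pack(
            "!BBHHHBBH4s4s",
            (4 << 4) | 5,              # version 4, header length 5 words
            0,                         # DSCP / ECN
            total_len,
            0,                         # identification
            0,                         # flags / fragment offset
            ttl,
            PROTO_PRIVATE_ENCRYPTION,
            checksum,
            socket.inet_aton(src_rloc),
            socket.inet_aton(dst_rloc),
        )

    # Compute the checksum over the header with its checksum field zeroed.
    header = build(ipv4_checksum(build(0)))
    return header + encrypted_inner


# Hypothetical RLOCs; the payload stands in for the encrypted inner packet.
packet = wrap(b"\x00" * 8, "1.2.3.4", "5.6.7.8")
```

An intermediate router inspecting `packet` sees only the two RLOCs and protocol 99; the subscriber's and server's EIDs are inside the encrypted payload.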

Since the RLOC is just an IP address, and one controlled by the destination ISP at that, the route the wrapped packet takes through the internet will be about the same as if the packet had never been wrapped. This is a large advantage of the scheme over onion routing, such as Tor; no significant latency is added.

When the destination ISP receives the wrapped packet, it can use its private key to decrypt the encapsulated packet, and send that decrypted packet on to its final hop. Return traffic undergoes the same treatment, of course.

Usage

Pass 1 now exists, in a rudimentary form. Here's how to put together a couple of hide-eid endpoints that can talk to each other.

First, you need two machines - one is the source, the other the destination. Both should have an IPv4 address routed to them that is not claimed on the machines themselves. These will be your RLOCs. They should be globally routable! Public IPs, in other words.

On each machine, you'll also need a range of IPs. These will be your EIDs. They need to be globally unique only within the context of the EID-to-RLOC registry maintained by this project, for now - they can even be RFC1918 space, as long as there are no overlaps within this registry. Remember, EIDs aren't used to make routing decisions across the Internet.

Generate some ECC private keys, and their public components, in PEM format:

$ openssl ecparam -genkey -out rloc1.private.pem -name secp160r2
$ openssl ec -in rloc1.private.pem -pubout -out rloc1.public.pem
$ openssl ecparam -genkey -out rloc2.private.pem -name secp160r2
$ openssl ec -in rloc2.private.pem -pubout -out rloc2.public.pem

Add entries to the rloc-registry.json file to reflect your mappings. You need to add an entry (a JSON object) to the “eid_rloc_map” array, like this:

{ "family":"ipv4", "network":"10.0.0.0", "netmask":8, "rloc":"1.2.3.4"}

IPv6 support isn't in yet. Once it is, IPv4-in-IPv6 and vice-versa mappings will be permitted.

You also need to add an rloc:pubkey mapping to the “keys” object. Make sure it's not the private key! Also, remember to add all the EID mappings and RLOCs you want, not just one.
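As a sketch of how such a registry could be consulted, the Python below loads a sample file and finds the RLOC and public key for a destination EID. It is not the project's lookup code. The `eid_rloc_map` entry fields and the `keys` object follow the description above; the second mapping, the placeholder key strings and the function name `lookup_rloc` are assumptions made for the example, as is storing the key text directly in the file.

```python
import ipaddress
import json

# Sample registry. The key values are placeholders, not real PEM data.
REGISTRY_JSON = """
{
  "eid_rloc_map": [
    {"family": "ipv4", "network": "10.0.0.0", "netmask": 8, "rloc": "1.2.3.4"},
    {"family": "ipv4", "network": "172.16.0.0", "netmask": 12, "rloc": "5.6.7.8"}
  ],
  "keys": {
    "1.2.3.4": "<public key PEM for 1.2.3.4>",
    "5.6.7.8": "<public key PEM for 5.6.7.8>"
  }
}
"""


def lookup_rloc(registry, eid):
    """Return (rloc, public_key) for an EID, or None if it is not registered.

    The registry is required to have no overlapping ranges, so the first
    matching entry is the only one.
    """
    addr = ipaddress.ip_address(eid)
    for entry in registry["eid_rloc_map"]:
        net = ipaddress.ip_network(f'{entry["network"]}/{entry["netmask"]}')
        if addr.version == net.version and addr in net:
            rloc = entry["rloc"]
            return rloc, registry["keys"][rloc]
    return None


registry = json.loads(REGISTRY_JSON)
match = lookup_rloc(registry, "10.20.30.40")
```

A destination that returns `None` is not covered by the scheme and would be forwarded unwrapped.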

Then, on each machine:

$ cd pass-1
$ make all
host1$ ./hide-eid rloc-registry.json eid0 eid0 <rloc1> <rloc1>.private.pem
host2$ ./hide-eid rloc-registry.json eid0 eid0 <rloc2> <rloc2>.private.pem

You'll notice quite a lot of uninteresting output; it's wordy for all the wrong reasons at the moment. Of particular note is a wide range of TODOs.

One of those TODOs is BGP (etc.) support for route injection. Since it's not done yet, you need to add the routes yourself:

host1$ ip route add <eid-range-for-rloc-2> dev eid0
host2$ ip route add <eid-range-for-rloc-1> dev eid0

Also, make sure that an EID from each range is routable on the respective machines. For testing, I just did:

host1$ ip addr add <eid-ip-for-rloc-1> dev eid0
host2$ ip addr add <eid-ip-for-rloc-2> dev eid0

The short version is that traffic to and from those EIDs must go into the TUN device controlled by hide-eid for it to do the magic.

At this point, you should be able to ping host2's EID from host1, and vice versa, and get an ICMP echo reply back. You can also run TCP or UDP servers on one of the IPs, and connect to them from the other IP. If you run Wireshark or tcpdump on an intermediate machine (or just one of the hosts, if you focus on the egress/ingress traffic) you'll see obscure IP packets with just the RLOC addresses as source and destination, and no visible UDP/TCP headers. The IP protocol number is set to 99 - “any private encryption scheme”.

Encryption

The encryption scheme is really the only novel portion of this project; the rest is covered in the LISP RFCs. This code is all about slapping together a basic LISP router (badly), and implementing cryptography for the encapsulated IP packets, for the sake of experimenting. Crypto is hard, and experimentation is key (ha ha).

Current scheme

This seems less stupid.

  • EC public keys in central repository
  • Each participant knows only their private key
  • Generate ECDH secret for each peer using their public + your private key
  • pseudo-random 128-bit IV per-packet, put at the start of encrypted data
  • Use AES-256-GCM symmetric encryption, keyed with sha256( ecdh ), to encrypt / decrypt

The main point is that routers don't need to communicate with each other to negotiate a shared key - they can independently derive the same symmetric key as long as they share some common assumptions, and each has its own private key and the peer's public key.
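The sketch below illustrates that property in Python, using only the standard library. Since the standard library has no elliptic-curve support, it swaps in classic finite-field Diffie-Hellman over a toy prime as a stand-in for ECDH; the group parameters, the function names and the framing helpers are all assumptions made for the example, and the AES-GCM step itself is left out. What carries over is the shape: each side combines its own private key with the peer's public key, hashes the result with SHA-256 to get a 256-bit key, and puts a fresh 128-bit IV at the start of the encrypted data.

```python
import hashlib
import os
import secrets

# Toy group for illustration only: far too small for real use, and the
# scheme described here uses elliptic curves rather than this group.
P = 2**127 - 1
G = 3
IV_LEN = 16  # 128-bit per-packet IV


def keypair():
    """Return (private, public) for the toy Diffie-Hellman group."""
    private = secrets.randbelow(P - 3) + 2  # in the range [2, P - 2]
    return private, pow(G, private, P)


def packet_key(my_private, peer_public):
    """Derive the 256-bit packet key: sha256 of the shared secret."""
    shared = pow(peer_public, my_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


def new_iv():
    """Fresh random IV, generated once per packet before encrypting."""
    return os.urandom(IV_LEN)


def frame(iv, ciphertext):
    """Place the IV at the start of the encrypted data."""
    if len(iv) != IV_LEN:
        raise ValueError("IV must be 16 bytes")
    return iv + ciphertext


def unframe(data):
    """Split received data back into (iv, ciphertext)."""
    if len(data) < IV_LEN:
        raise ValueError("data shorter than the IV")
    return data[:IV_LEN], data[IV_LEN:]


# Each router publishes its public half and keeps the private half.
priv1, pub1 = keypair()
priv2, pub2 = keypair()

# No messages are exchanged; both sides arrive at the same key.
key_on_router1 = packet_key(priv1, pub2)
key_on_router2 = packet_key(priv2, pub1)
```

Here `key_on_router1` and `key_on_router2` are equal, which is all the two routers need in order to encrypt and decrypt each other's packets.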

Asymmetric key size is smaller, and we're moving to a symmetric cipher for the actual packet encryption, so hopefully this will be much faster than scheme 0.

Which curve should we be using? No clue. What size of key should we be using? No clue. Is this kind of shared key appropriate when we're passing considerable traffic? No clue.

Scheme 0

This was stupid.

  • RSA public keys in central repository
  • Just use public key to directly encrypt packet data
  • Use private key to decrypt packets addressed to you.

This is slow, and RSA can only directly encrypt data that's smaller than the key modulus (less any padding), so larger packets don't fit in a single operation.

First result: RTT increased from 37ms to 80ms.

For access<->hosting, that kind of latency increase is bad, but bearable. For hosting<->hosting, it's completely unacceptable.

Not all of it may be crypto-related - worth implementing a no-op branch that just encapsulates, and checking the difference.

Limitations

You have to trust two ISPs.

Certainly for access ISPs, even with the best will in the world, the infrastructure between them and their layer 1/2 service providers may be bugged. This is not protected against by this scheme; if you suspect this is happening to your ISP without their knowledge, you can run IPsec over the link and allow them to terminate it just before (or on) the box that wraps the packets. If you suspect it is happening with their knowledge, the best you can do is change ISP. If we run out of good ISPs, this scheme adds nothing. You can always start a VPN ISP.

If the other side of the link is complicit, this scheme does nothing. It isn't going to stop Facebook from handing all their records of your accesses to them over to the NSA. Stop using Facebook.

There are four cryptographic operations in each trip - encrypt outgoing packet, decrypt outgoing packet, encrypt return packet, decrypt return packet. This is going to be slower than no crypto. Too slow?

May break ICMP and other responses from intermediate ISPs. Path MTU discovery breaks, for instance, with a naive implementation of this scheme, as does ICMP tracerouting (this can be fixed, especially in IPv6 - see ICMP).
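Part of the path MTU problem is simple arithmetic: wrapping makes every packet bigger, so the largest inner packet that fits on a link shrinks. A back-of-the-envelope sketch in Python, assuming an outer IPv4 header without options, the 128-bit per-packet IV from the current scheme, and a full 16-byte GCM tag carried in each packet (the tag length is an assumption; the actual framing in pass-1 may differ):

```python
OUTER_IPV4_HEADER = 20  # bytes, no IP options
IV_LEN = 16             # 128-bit per-packet IV
GCM_TAG_LEN = 16        # assumption: full-length GCM tag in every packet


def inner_mtu(link_mtu):
    """Largest inner packet that still fits in one wrapped packet."""
    return link_mtu - OUTER_IPV4_HEADER - IV_LEN - GCM_TAG_LEN


ethernet_inner_mtu = inner_mtu(1500)  # 1448 under these assumptions
```

If an intermediate router cannot tell the original sender that a wrapped packet was too big, the sender keeps emitting full-size packets, which is why the wrapping ISP needs some way to handle those ICMP responses itself.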

Selling points

Uptake can be low (but not zero) and significant benefits are still seen. Even if just two ISPs take up the scheme, one access and one hosting, everyone who uses the access ISP is now anonymous for any of their traffic that goes to the hosting ISP. Privacy-conscious individuals can take note of that and move to those ISPs, or tunnel their traffic to them, to regain their anonymity.

Faster than Tor. Especially in the latency stakes.

Requires no CPE changes. This killed IPv6 uptake for a decade - end users are not easy to upgrade. LISP schemes typically require the holder of the EID to be in charge of looking up and using RLOCs; this scheme does not need that.

Probably stateless. Putting the encrypted EIDs into the packet we send means that the source and destination ISPs don't need to perform connection tracking. This isn't NAT in the traditional sense.

Since both source and destination enjoy a large anonymity set, this scheme is resilient to correlation attacks. An earlier revision only encrypted the source EID, which was vulnerable to trivial attacks of that nature.

ICMP

As noted, a naive implementation breaks ICMP responses by intermediaries. This is a result of the design; as they no longer know the EID, they can't send an ICMP response of any sort to it. As IPv6, in particular, relies on ICMP for protocol features such as path MTU discovery, this is something of a problem.

One solution to this is to have a wide range of RLOC IP addresses, and use them in a round-robin manner to maintain a local map of RLOC -> real source IP, which would be retained for a short period (some seconds, say). If an ICMP reply is received, directed to one of these RLOCs, the EID can be looked up from the map and the packet can be rewritten with it and forwarded appropriately. Of course, this is state for the ISP to keep track of, and monopolises a segment of the IP address space. That last is not a problem in IPv6, but will prevent its use in almost all IPv4 deployments. Fortunately, IPv4 is legacy, and doesn't strictly require ICMP, in the same way IPv6 does.
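A minimal sketch of that short-lived map in Python. The class name, the default lifetime, the injectable clock and the example addresses are assumptions made for illustration; nothing like this exists in pass-1 yet.

```python
import itertools
import time


class RlocPool:
    """Hands out source RLOCs round-robin and remembers who used each one."""

    def __init__(self, rlocs, ttl_seconds=5.0, clock=time.monotonic):
        self._cycle = itertools.cycle(rlocs)
        self._ttl = ttl_seconds
        self._clock = clock
        self._map = {}  # rloc -> (eid, expiry time)

    def assign(self, eid):
        """Pick the next RLOC for an outgoing packet from this EID.

        When the pool wraps around, the older mapping for that RLOC is
        overwritten, so the pool needs to be large relative to traffic.
        """
        rloc = next(self._cycle)
        self._map[rloc] = (eid, self._clock() + self._ttl)
        return rloc

    def resolve(self, rloc):
        """Return the EID that recently used this RLOC, or None."""
        entry = self._map.get(rloc)
        if entry is None:
            return None
        eid, expiry = entry
        if self._clock() >= expiry:
            del self._map[rloc]
            return None
        return eid


# Example with documentation-range IPv6 RLOCs and a fake clock.
now = [0.0]
pool = RlocPool(["2001:db8::1", "2001:db8::2"], ttl_seconds=5.0, clock=lambda: now[0])
first = pool.assign("10.0.0.5")
second = pool.assign("10.0.0.6")
```

An ICMP reply addressed to `first` within the lifetime resolves to `10.0.0.5` and can be rewritten and forwarded; once the entry has expired it resolves to nothing and the reply is dropped.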

Why?

It's my position that anonymity is only necessary in the presence of oppression. In the absence of oppression, anonymity primarily facilitates crime and wrongdoing. When present, it continues to do that, but also provides a means of escaping oppression, and defeating oppressors.

Are we in an oppressive society? Do oppressive societies exist? I believe the answer to both of those questions is yes. I wish it were otherwise.

Author

Name   : Nick Thomas
Handle : lupine
Web    : lupine.me.uk
Comms  : nick@lupine.me.uk