反伽罗瓦洋葱：改进的Tor电路流量加密

时间：2025-11-29 来源：torproject 作者：nickm 条评论

Counter Galois Onion: Improved encryption for Tor circuit traffic

It's always a good day when we can talk about cryptography. Especially when we are sunsetting one of the oldest and most important encryption algorithms in Tor and replacing it with a research-backed new design, called Counter Galois Onion.TYj免费翻墙网

This overhaul will defend users against a broader class of online attackers (described below), and form the basis for more encryption work in the future.TYj免费翻墙网

Which cryptography are we talking about here?

The algorithm in question is Tor's relay encryption. While Tor uses the standard TLS protocol for communication between relays, and between clients and relays, it needs a specialized algorithm for encrypting user data as it traverses multiple relays in a circuit.¹TYj免费翻墙网

That's the relay encryption algorithm. The client shares a symmetric key with each relay on its circuit, and encrypts an outgoing message, or "relay cell" with each one of those keys. Each relay can remove a single layer of encryption, until the client's cell reaches the exit relay.TYj免费翻墙网

Of course, we need to make sure that the data isn't modified on the way from the client. For that, we include a cryptographic digest in the cell. The digest covers not only the cell itself, but also all previous cells sent through the circuit (to prevent re-ordering) and another secret shared key (to make the digest unpredictable).TYj免费翻墙网

So with the digest added, clients behave as follows: they calculate and set the digest, then they use a stream cipher (AES-128-CTR in Tor's case) multiple times to encrypt the cell for each relay.TYj免费翻墙网

If we simplify a lot, then a relay cell looks a little like this:TYj免费翻墙网

Field	Width
Zero	2 bytes
Digest	4 bytes
Other stuff	...

The "zero" field is there so that nodes other than the exit can avoid checking the digest: If a relay gets a cell with a value other than zero, then it can tell that it isn't the recipient of that cell, so it doesn't need to check the digest: it just decrypts the cell and forwards it to the next relay.TYj免费翻墙网

When we designed this algorithm, we didn't give it a name. Now that we're replacing it, it helps to have some way to identify it, so we're calling it "tor1".TYj免费翻墙网

A diagram of the Tor1 encryption algorithm — Figure 1. The tor1 encryption algorithm, as used at a middle layer.
The input is a message M. A counter CTR is expanded via a psueodrandom function (PRF, instantiated with AES-128-CTR), to produce a stream of bytes. These bytes are xored with M to produce a ciphertext C.TYj免费翻墙网

Wait, that design looks funny!

Yeah, you wouldn't build it that way nowadays, would you? That's why we're replacing it.TYj免费翻墙网

In today's world of high-quality online courses and excellent introductory material, it's easy to forget the general state of accessible cryptography resources when Tor was getting started. AES was brand new, and authenticated encryption as a separate field of study had just started to emerge.TYj免费翻墙网

Existing designs weren't suitable for Tor's needs. The original onion routing paper didn't specify a means for authentication. Designs like mixmaster and mixminion were optimized for larger message sizes, and required a separate full digest for every possible layer of encryption. (For example, Mixmaster supported up to 20 remailers, so had to reserve space for 20 digests (and other stuff) in every message.)TYj免费翻墙网

Some of the "weird things" in the current tor1 design are outdated, but not awful; others are things that are more valuable to resolve.TYj免费翻墙网

So, what are the problems with the tor1 design?

There are a few. First the big one:TYj免费翻墙网

Problem 1: Tagging attacks

Tagging attacks enable an active adversary to trace traffic by modifying it in one place on the network, and observing predicatable changes in another. Even when tagging attacks don't succeed immediately, their side effects can give the attacker more and more opportunities to retry.TYj免费翻墙网

This is the most important attack we're solving with CGO. Even without the other problems below, this one would be worth fixing on its own.TYj免费翻墙网

The main version of this attack arises because tor1's use of AES-CTR encryption with no hop-by-hop authentication means that the relay encryption is malleable. Since counter mode derives its ciphertext C by XORing a secret key stream S with the plaintext P (C = S ⊕ P), an attacker who can XOR their own pattern M in to the ciphertext will produce C' = (S ⊕ P) ⊕ M = S ⊕ (P ⊕ M) — that is, a valid encryption of (P ⊕ M).TYj免费翻墙网

An attacker can use this attack to ensure that they control both ends of the circuit. They XOR a pattern onto a cell at one end, and then see if any garbled cells at the other end become clear when whey remove that same pattern. Any circuits with an honest endpoint will fail (and not be deanonymized), but the client will retry them until they eventually choose a malicious endpoint.TYj免费翻墙网

If the attacker chooses a known-plaintext portion of the relay cell for their marker (such as the header or slack zero space), then they can use their marker to communicate an identifier across the circuit, by retrieving it at the end:TYj免费翻墙网

 M = (P ⊕ M) ⊕ P.

M can then be used to transmit an IP address or unique identifier for the user.TYj免费翻墙网

In comparison to probabilistic traffic correlation, this attack provides definite results immediately, with a strength multiplier: it also allows the attacker to ensure that all the traffic they successfully carry is fully deanonymized, before the circuit is used for any application traffic at all.TYj免费翻墙网

The downside for the attacker is that the resulting failure rate of circuits can be detected by the client. Currently, Tor clients emit log notices and warnings when circuit failure rates are excessively high. Unfortunately, as vigilant users have noticed, when the DDoS attacks on Tor become severe, these detectors give false alarms.TYj免费翻墙网

This class of attacks (where an adversary is able to abuse the Tor Protocol to transmit information between relays before application activity) is known as Internal Covert Channel attacks. Tor is in the process of updating its threat model to cover these attack vectors explicitly, along with two other categories of attack vectors.TYj免费翻墙网

Problem 2: Forward secrecy begins when a circuit closes

This attack and the one after it are much less severe than the tagging attack above; we mention them for the sake of completeness.TYj免费翻墙网

In many modern online protocols, including messaging apps like Signal, the keys used to decrypt a message are destroyed as soon as the message is decrypted, so that nobody can steal them and use them later on. But Tor's old encryption algorithm (tor1) doesn't provide this property: the same AES keys are used for the entire life of the circuit. That means that if a key was stolen while the circuit was still alive, all previous traffic on the circuit could be decrypted.TYj免费翻墙网

When a circuit's lifetime is on the order of minutes, that's not so bad, but sometimes circuits stay around for days. (What's more, longer-lived circuits may be better for anonymity, especially when the user is maintaining a persistent identity, so it's a good idea to make them stronger.)TYj免费翻墙网

Although this attack is minor in comparison to the tagging issue, we may as well address it while we are updating our encryption.TYj免费翻墙网

Problem 3: A 4-byte authenticator? Seriously?

Yeah, that's not great. The use of a mere 4-byte digest means that there's a one-in-4-billion chance to forge a cell undetected.TYj免费翻墙网

That isn't a very good attack in practice: if the attacker doesn't get lucky with their guess, then their invalid message causes the circuit to fail, and they can't try again unless the client builds another circuit through them. The same pathbias mechanisms that help resist tagging attacks also help here, but it would be better not to need them.TYj免费翻墙网

(Also, it's using SHA-1, which is showing its age, to say the least.²)TYj免费翻墙网

So, how did we think about replacing this in the past?

We've wanted to replace this algorithm a few times, but we've been hung up on issues of design and efficiency.TYj免费翻墙网

We definitely don't want to follow remailer designs by adding one authenticator per layer of encryption: that way lies big overhead. If we tried something like that with onion services, we'd be devoting something like 15% of our bandwidth to authenticator fields.TYj免费翻墙网

One promising design element has been wide-block ciphers: these are ciphers (or modes of using ciphers) that encrypt an entire message as if it were a single opaque block: any change in the ciphertext garbles the entire message as if it were a single block in a regular block cipher.TYj免费翻墙网

(Technically, this needs to be a "strong pseudorandom permutation" (SPRP) if it's going to resist tagging attacks.)TYj免费翻墙网

You can make a wide-block cipher into an authenticated cipher by reserving some portion of the plaintext for a known value -- say, 16 bytes of zeros.TYj免费翻墙网

But most wide-block cipher designs are comparatively expensive. Nearly all of the strong ones (BEARESS, LIONESS, biIGE, HHFHFH) require two full encryptions and two hashes over the data. Newer modes like HCTR2 require only one encryption pass, but still need two hashes. (For comparison: tor1 requires 3 encryptions and one hash for a cell on a 3-hop circuit, whereas one of these designs requires on the order of 6 encryptions and 6 hashes.)TYj免费翻墙网

We're willing to pay some CPU cost for improved cryptography (and we should expect to pay some cost, since authentication doesn't come for free) but we need to keep the cost to a minimum.TYj免费翻墙网

Now, multiple passes are necessary for any wide-block design: it's provable that there's no way to make sure that changing any bit will potentially garble every other bit unless there are at least two passes. But we'd like to make these passes as cheap as possible!TYj免费翻墙网

There has also been excellent work on other wide-block designs built from scratch, rather than from an underlying cipher (notably AEZ).TYj免费翻墙网

What are we going with?

For years now, cryptographers have been looking for good solutions here.TYj免费翻墙网

Jean Paul Degabriele, Alessandro Melloni, Jean-Pierre Münch, and Martijn Stam have a design that they're calling Counter Galois Onion (CGO). It's based on a kind of construction called a Rugged Pseudorandom Permutation (RPRP): essentially, it's a design for a wide-block cipher that resists malleability in one direction (for the encrypt operation, but not the decrypt operation). If we deploy this so that clients always decrypt and relays always encrypt, then we have a tagging resistant³ cipher at less cost than a full SPRP!TYj免费翻墙网

Using a RPRP that they call UIV+ (see the paper), the authors achieve all of our goals (tagging resistance, immediate forward secrecy, longer authentication tags, limited bandwidth overhead, relatively efficient operation, and modernized cryptography).TYj免费翻墙网

(Shortly before this blog post went up, they revised their paper and released a new security proof.)TYj免费翻墙网

We've written a specification which matches their paper and their reference implementation.TYj免费翻墙网

How does it work?

CGO makes it so that if anybody tampers with any part of your encrypted data, the entire message, and all future messages, become unrecoverable. Here's how!TYj免费翻墙网

(If you don't like reading cipher diagrams, you may want to skip this section. And if you really like reading them, you should check out the specification and the paper!)TYj免费翻墙网

The figures below present the UIV+ building block, and show how it is used to build CGO encryption.TYj免费翻墙网

Figure 3: UIV+ encryption
The input X is split into two parts: A short X_L, and a longer X_R. X_R, and a "tweak" value H, are themselves passed as tweaks to a tweakable block cipher E_T (instantiated with LRW2), which is then used used to encrypt X_L. The output of this encryption seeds a PRF, which is xored into X_R to encrypt it.TYj免费翻墙网

Figure 4: Middle-layer CGO encryption.
CGO treats every message as a 16-byte tag T, and a 493-byte ciphertext C. These are passed as X_L and X_R to the UIV+ encryption algorithm above. The tweak value (H in UIV+) is here called T': each cell's "T" value, after encryption, is taken as the T' for the next cell.TYj免费翻墙网

Figure 5: Originating a CGO message
When _originating_ the message, CGO initializes its tag as a nonce value N. The value of N, _and the encryption keys_, are all transformed using an "Update" algorithm as the message is encrypted. The new N, and the new encryption keys, will be used to encrypt the next cell.TYj免费翻墙网

Okay, but how does all of this cryptography solve the problems we began with?TYj免费翻墙网

First (and most importantly) tagging attacks are prevented by two factors:TYj免费翻墙网

When encrypting⁴, the message is transformed in a wide-block construction, so that any change to the input renders the entire output unrecoverable.
The chaining of T' and N values means that a message's encryption depends on all previous messages, so if a single message is garbled, all subsequent messages will be unrecoverable.

Second, forward secrecy is achieved with the Update construction in figure 5. Every time a new cell is originated or received, the keys used to originate or receive it are transformed unrecoverably, so that the encryptor/decryptor no longer holds the keys necessary to decrypt earlier cells.TYj免费翻墙网

Third, the truncated digest is now replaced with a nice long 16-byte authenticator, like sensible people use.TYj免费翻墙网

Aside: What if we're wrong?

CGO is a fairly new design, and it's reasonable to ask whether there could be weaknesses in it that would make it worse than it's designed to be. I'd answer: There might be! Attacks only get better with time, and although the cryptographers behind CGO are skilled and well regarded, even the best cryptographers can make mistakes. There is a security proof, but it's fairly recent, and and it hasn't yet gotten intensive scrutiny. With time, as CGO gets attention from more cryptographers, we'll (hopefully) gain more confidence in its strength. (And if we do decide that we need to replace it, the work we've done to add it to Arti and the C Tor implementation will make it much easier to do a later migration to a different system later on.)TYj免费翻墙网

But what we are pretty sure about is that there aren't likely to be any weaknesses in CGO that would make it worse than tor1.TYj免费翻墙网

Where does our implementation stand?

It's underway!TYj免费翻墙网

We've implemented the cryptography for Arti, the Rust Tor implementation. We've also implemented it in C, since it won't do us any good unless relays support it too, and the Arti relay project is still a work in progress.TYj免费翻墙网

In order to build this implementation, we've had to refactor a lot of code to revise its existing assumptions: for example, we've had to revise all the places where we assumed anything about the layout of a relay cell, or where we assumed that there was only one way to do relay encryption. These changes will help us with any other changes to relay cell formatting and encryption in the future.TYj免费翻墙网

Our next steps are:TYj免费翻墙网

Enable CGO by default in Arti. (It's currently marked as experimental because of some of its dependencies.)TYj免费翻墙网
Implement CGO negotiation for onion services. (This feature is likely to be be Arti-only, due to its complexity.)TYj免费翻墙网
Tune the performance for modern CPUs. (The CGO authors got impressively good results for their optimized implementation, but some of the tricks they used will be pretty hard to deliver in the C Tor implementation without big refactoring. Fortunately, there are some low-hanging fruit in optimizing what we have today.)TYj免费翻墙网

Thanks for your help!

Thanks to all the cryptographers, programmers, researchers, and cypherpunks who have contributed to this work over the years. Thanks also to all the researchers who have advanced this work, and the state of modern cryptography; we all stand on the work they have built.TYj免费翻墙网

And thanks to Mike Perry for helping me write this and get the parts of the threat model right.TYj免费翻墙网

And finally, thanks to everybody who has donated to Tor over the years! A lot of this work is done through a grant from the Bureau of Democracy, Human Rights, and Labor, but many critical parts of this work (including years of past groundwork, and all the parts related to onion services) have been paid for out of our unrestricted funds, which rely on donors like you. Thanks for believing in our mission, and thanks for helping to make Tor better! <3TYj免费翻墙网

Why not just use multiple layers of nested TLS? First, because each layer of TLS adds bandwidth overhead, and that overhead adds up fast. Second, because TLS implementations aren't aiming for anonymity, they tend to do things like generate records of different sizes, making it hard to cause traffic to appear uniform. Third, because the TLS protocol exists in many different flavors that client implementations are often distinguishable from one another: we would need to take special care to keep every client on the same TLS implementation, with locked-down flags and options. That would make upgrading and portability highly difficult.↩TYj免费翻墙网
Yeah, the tor1 algorithm uses SHA-1. It was 2002, and OpenSSL didn't have SHA-256 support yet. This isn't as bad as it looks, though: we don't actually need collision resistence here, and the attacker doesn't actually get to see the digest under relevant circumstances, so the regular weaknesses of SHA-1 don't apply in full force.TYj免费翻墙网

TYj免费翻墙网

Nonetheless: we will be quite glad to say goodbye to SHA-1 once tor1 is finally replaced.↩TYj免费翻墙网
Technically, the client can tag their own traffic, but they wouldn't gain anything by doing so.↩TYj免费翻墙网
Remember the RPRP property: encryption resists malleability, but decryption is malleable.↩TYj免费翻墙网